Monday 8 March 2010

recover text from docx

A friend got into trouble with a broken USB stick and the sole copy of their notes in a now-unopenable .docx file. Oh dear.

If you google for apps to recover corrupt .docx files there is a plethora of tools! Yeesh! Make backups, people. You have to get burned once, I suppose, before you learn to make backups properly. I remember in the good old days, (well, the old days, (well, 1993ish, (not so old))) when I got a bit overconfident and put tons of content into a .doc for my final year project ... AND then discovered that the files I had worked on for hours were unopenable despite a few backups along the way. Argh!

So the winner of this impromptu recover-text-from-a-.docx-file contest IS:

http://sourceforge.net/projects/docx2txt/

Tadahhhhh!

Congrats to Sandeep Kumar.

Extra especially pleased I am that it is a Perl script :) Heh heh :)
And actually, looking inside, it's nice and straightforward: open up the internals of the .docx as XML and parse out the text.
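
For the curious, here is a minimal sketch of that same idea in Python. It is not the docx2txt code, just my own illustration: it assumes the .docx is still a readable zip archive with the body text in word/document.xml, and the function names docx_to_text and document_xml_to_text are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the WordprocessingML elements (w:p, w:t, ...).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def document_xml_to_text(xml_bytes):
    """Pull the plain text out of the contents of word/document.xml."""
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each w:p is a paragraph; the visible text sits in its w:t elements.
    for para in root.iter(W + "p"):
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(paragraphs)


def docx_to_text(path):
    """Open a .docx (really a zip file) and return the text of its body."""
    # Assumption: the archive is intact enough for zipfile to read it.
    with zipfile.ZipFile(path) as archive:
        return document_xml_to_text(archive.read("word/document.xml"))
```

Calling docx_to_text("notes.docx") (the file name is just an example) returns the text with one line per paragraph. If the zip is too badly damaged, zipfile raises zipfile.BadZipFile, so this sketch only helps when the archive itself survived.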

I initially tried xxd |less   
Then had a look with emacs. Murghh not good :-7
Then some .exe shareware tools, docXConvertor.exe and Damaged-DOCX2TXT.exe. Meh meh.
