OpenOffice.org has become more and more suited for replacing Microsoft Office. Sadly, porting lots of Microsoft Office documents to OpenOffice format (aka OpenDocument Format) is not easy. If you have only one, five, or ten files you could just open them one by one and resave in OpenOffice.org, but how about hundreds or thousands of multi-megabyte documents in a corporate environment?
This little program batch-converts Microsoft Office documents (*.doc, *.xls, and *.ppt) to their OpenDocument Format equivalents (*.odt, *.ods, and *.odp). The nice thing about this script is that it converts all documents found in the input directory, including those in subdirectories. The results can be put in another directory, and the original directory structure is retained (the script takes care of keeping the relative paths). Use it at your own risk!
- Windows or Linux (tested in Windows XP SP2 and Ubuntu Linux 6.10)
- OpenOffice.org (tested in OpenOffice.org 2.0)
- Python (I think version >= 2.4). If you are on Windows, you need to download it (any recent version will do). Most Linux distributions already ship with Python; if yours does not, don’t ask me how to install it ;).
Download mso2ooo (15.88 kB)
How to Use
I wanted it to be as easy to use as my skills allow, but oh well ;)
Configuration was done by editing the file
source_dir is the directory where mso2ooo will look for MS Office documents (including subdirectories).
dest_dir is the directory where mso2ooo will put the converted documents (retaining the directory structure). The rest of the configuration is not very important (and I am a lazy documenter, yes). Explaining directories (or folders, in Windows terms) would take too long here if you are not already familiar with them. By the way, the default configuration starts looking for documents from where the script is located.
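To illustrate what "retaining the directory structure" means, here is a minimal sketch of how a source path could be mapped to its destination. It is not the code from mso2ooo.py: the function name destination_path and the EXTENSION_MAP table are made up for this example, and only source_dir and dest_dir come from the configuration described above.

```python
import os

# Assumed mapping, based on the formats named in this post.
EXTENSION_MAP = {".doc": ".odt", ".xls": ".ods", ".ppt": ".odp"}


def destination_path(source_file, source_dir, dest_dir):
    """Return where the converted copy of source_file would be written.

    The path relative to source_dir is kept as-is; only the root
    directory and the file extension change.
    """
    relative = os.path.relpath(source_file, source_dir)
    base, ext = os.path.splitext(relative)
    return os.path.join(dest_dir, base + EXTENSION_MAP[ext.lower()])
```

With source_dir set to "in" and dest_dir set to "out", a file at in/reports/2006/budget.xls would end up at out/reports/2006/budget.ods.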
For the impatient:
- configure anything in
Conversion is done in two steps: listing the files that need to be converted, and then converting them (for real). The list is generated by Python (mso2ooo.py), while the rest of the work is done by OpenOffice.org Basic macros inside mso2ooo.odt. In Windows, you can run mso2ooo.py by opening (double-clicking) it in Windows Explorer. In (my Ubuntu) Linux, you need to run python ./mso2ooo.py from the command line. I have created mso2ooo.sh, which does this for you (so you only need to double-click it, just like in Windows).
If you haven’t messed up your configuration and you have Python installed, a window will appear for a brief moment (how long depends on how long the scan takes). After the window disappears (that is, once the script has finished its job) there will be a new file, _listoffice.txt. You don’t need to do anything with that file.
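For the curious, the listing step can be pictured roughly like this. This is only a sketch under assumptions, not the actual contents of mso2ooo.py: the function names are hypothetical, and the one-path-per-line layout is a guess, since the real format of _listoffice.txt is not described here.

```python
import os

OFFICE_EXTENSIONS = (".doc", ".xls", ".ppt")


def list_office_files(source_dir):
    """Yield every MS Office document under source_dir, subdirectories included."""
    for folder, _subdirs, files in os.walk(source_dir):
        for name in sorted(files):
            # str.endswith accepts a tuple, so one call checks all extensions.
            if name.lower().endswith(OFFICE_EXTENSIONS):
                yield os.path.join(folder, name)


def write_list(source_dir, list_path):
    """Write one absolute path per line (assumed format) and return the count."""
    paths = [os.path.abspath(p) for p in list_office_files(source_dir)]
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return len(paths)
```

Calling write_list(".", "_listoffice.txt") would scan from the current directory, which mirrors the default configuration mentioned earlier.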
Next, open mso2ooo.odt. Be sure to enable macros! The conversion needs macros enabled. I did not put any malware in the macro (but third parties could, so use it at your own risk). You should be able to see some kind of progress (i.e. which file is being converted). When it finishes (it could take a very long time, especially if your presentations are hundreds of MB) it will say “xxx file(s) converted” and show a list of documents that failed to convert (if any).
You may then close the document. You may save it, but I don’t recommend doing so (messages from the last conversion will be left behind if you do).
The Python script (not the script in mso2ooo.odt!) tries not to overwrite already existing documents. So running mso2ooo twice will be OK.
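One simple way to get this "safe to run twice" behaviour is to drop every file whose destination already exists before the list is written. The sketch below shows that idea only; the function name pending_conversions is hypothetical, and the real script may decide this differently.

```python
import os


def pending_conversions(pairs):
    """Keep only the (source, destination) pairs whose destination is missing.

    pairs is any iterable of (source_path, destination_path) tuples.
    """
    return [(src, dst) for src, dst in pairs if not os.path.exists(dst)]
```

On a second run, documents converted the first time already have a destination file, so they fall out of the list and are left untouched.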
Almost all of the conversion code (and the idea, as well) was taken from an XML.com article, which was taken from an OpenOffice.org forum posting.
On Nov 5 Adam Goodfriend wrote me an email:
Thank you for your work on the mso2ooo converter, I have found it very helpful. I was also looking for a tool to take ODF files and convert them to Microsoft office files so I edited your macro to allow for this. Attached are my changes to your file, please feel free to edit it or post it on your site.
Thanks, Adam! I am happy to help :) You can download his attachment.