OpenOffice.org has become more and more suited for replacing Microsoft Office. Sadly, porting lots of Microsoft Office documents to OpenOffice format (aka OpenDocument Format) is not easy. If you have only one, five, or ten files you could just open them one by one and resave in OpenOffice.org, but how about hundreds or thousands of multi-megabyte documents in a corporate environment?
This little program batch-converts Microsoft Office documents (*.doc, *.xls, and *.ppt) to their OpenOffice.org equivalents (*.odt, *.ods, and *.odp, i.e. OpenDocument Format). The nice thing about this script is that it converts every document under the input directory, including those in subdirectories. The results can be put in another directory, and the original directory structure is retained (the script takes care of keeping the relative paths). Use it at your own risk!
Requirements
- Windows or Linux (tested in Windows XP SP2 and Ubuntu Linux 6.10)
- OpenOffice.org (tested in OpenOffice.org 2.0)
- Python (I think version >= 2.4). If you are on Windows, you need to download it (see footnote 1; any recent version will do). Most Linux distributions already include Python; if yours does not, don’t ask me how to install it ;).
Download
Download mso2ooo (15.88 kB)
How to Use
I wanted it to be as easy to use as my skills allow, but oh well ;)
Configuration
Configuration is done by editing the file mso2ooo_config.txt. source_dir is the directory where mso2ooo looks for MS Office documents (including subdirectories). dest_dir is the directory where mso2ooo puts the converted documents (retaining the directory structure). The rest of the configuration is not very important (and yes, I am a lazy documenter). Explaining directories (or folders, in Windows terms) here would take too long if you are not familiar with them. By the way, the default configuration starts looking for documents from where the script is located.
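To make the “retaining directory structure” part concrete, here is a minimal sketch of how a source path could be mapped to its destination. This is not the actual code from mso2ooo.py: the function name destination_path and the EXTENSION_MAP table are made up for illustration, and only the names source_dir and dest_dir come from the configuration file.

```python
import os

# Assumed mapping from MS Office extensions to their OpenDocument equivalents.
EXTENSION_MAP = {".doc": ".odt", ".xls": ".ods", ".ppt": ".odp"}


def destination_path(source_file, source_dir, dest_dir):
    """Return where the converted copy of source_file would go under dest_dir.

    The path of source_file relative to source_dir is kept, so the
    subdirectory layout is reproduced below dest_dir.
    Raises KeyError for extensions that are not in EXTENSION_MAP.
    """
    relative = os.path.relpath(source_file, source_dir)
    base, ext = os.path.splitext(relative)
    return os.path.join(dest_dir, base + EXTENSION_MAP[ext.lower()])
```

For example, with source_dir set to "in" and dest_dir set to "out", a file at in/reports/2006/q1.doc would end up at out/reports/2006/q1.odt.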
Running mso2ooo
For the impatient:
- configure anything in mso2ooo_config.txt
- run mso2ooo.py
- run mso2ooo.odt
Conversion is done in two steps (see footnote 2): listing the files that need to be converted, and then actually converting them. The list is generated by Python (mso2ooo.py), while the rest of the work is done by OpenOffice.org Basic macros inside mso2ooo.odt.
First, run mso2ooo.py. In Windows, you can run mso2ooo.py by opening / clicking / double-clicking it in Windows Explorer. In (my Ubuntu) Linux, you need to run python ./mso2ooo.py from the command line. I have created mso2ooo.sh that does this (so you only need to open / click / double-click it, just like on Windows).
If you haven’t messed up your configuration and you have Python, a window will appear for a brief moment (how long depends on how long the scan takes). After the window disappears (or the script has finished its job) there will be a new file, _listoffice.txt. You don’t need to do anything with that file.
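For the curious, the listing step could look roughly like the sketch below. It is an illustration, not the real mso2ooo.py: the function names are invented, and the one-path-per-line layout of _listoffice.txt is an assumption (the real file may well be formatted differently).

```python
import os

# Extensions treated as MS Office documents in this sketch.
OFFICE_EXTENSIONS = (".doc", ".xls", ".ppt")


def find_office_documents(source_dir):
    """Walk source_dir (including subdirectories) and collect MS Office files."""
    found = []
    for folder, _subdirs, filenames in os.walk(source_dir):
        for name in sorted(filenames):
            # Compare case-insensitively so REPORT.DOC is picked up too.
            if name.lower().endswith(OFFICE_EXTENSIONS):
                found.append(os.path.join(folder, name))
    return found


def write_list(paths, list_file="_listoffice.txt"):
    """Write one path per line (assumed format) for the macro to read later."""
    with open(list_file, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
```

Calling write_list(find_office_documents(".")) would scan from the current directory, which is similar in spirit to the default configuration described above.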
Then, open mso2ooo.odt. Be sure to enable macros! The conversion needs macros enabled. I did not put any malware in the macro (but third parties could, so use it at your own risk). You should be able to see some kind of progress (i.e. that a file is being converted). When it finishes (it could take a very long time, especially if your presentations are hundreds of MB) it will say “xxx file(s) converted”, followed by a list of documents that failed to convert (if any).
You may then close the document. You may save it, but I don’t recommend it (messages from the last conversion will be left behind if you do).
Etc
The Python script (not the script in mso2ooo.odt!) tries not to overwrite already existing documents. So running mso2ooo twice will be OK.
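The “don’t overwrite” behaviour can be pictured as a simple filter over (source, destination) pairs. Again, this is only a sketch under my own assumptions; pending_conversions is a hypothetical name and the real script may decide this differently.

```python
import os


def pending_conversions(pairs):
    """Keep only the (source, destination) pairs whose destination is missing.

    Files that were already converted in an earlier run are skipped,
    which is why running the listing step twice is harmless.
    """
    return [(src, dst) for src, dst in pairs if not os.path.exists(dst)]
```

Only the pairs returned here would need to be written to the list for the macro to process.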
Almost all of the conversion code (and the idea, as well) was taken from an XML.com article, which was taken from an OpenOffice.org forum posting.
UPDATE
On Nov 5 Adam Goodfriend wrote me an email:
Thank you for your work on the mso2ooo converter, I have found it very helpful. I was also looking for a tool to take ODF files and convert them to Microsoft office files so I edited your macro to allow for this. Attached are my changes to your file, please feel free to edit it or post it on your site.
Thanks, Adam! I am happy to help :) You can download his attachment.
1 I know that OpenOffice.org includes a Python implementation by default on every platform, including Windows, but I can’t make it work. Sorry, but for now you need to download the full Python for Windows.
2 If someone could write the equivalent of mso2ooo.py in OpenOffice.org Basic, it would be just one step. Or integrate mso2ooo.py into mso2ooo.odt (OpenOffice.org documents can contain Python scripts; I just can’t do it myself). This would also solve the problem of Python on Windows.
| Attachment | Size |
|---|---|
| mso2ooo.zip | 15.88 KB |
| mso2ooo (from Adam Goodfriend).zip | 16.89 KB |
convert to txt or xhtml
Hi,
Can I change the macro to convert to other formats?
Changing the ms2000_config.txt to
NOT DO THIS.
Thanks!
I never tried to use it but
I never tried to use it that way, but supposedly it can. If "txt" does not work, try "text" or something similar. If you really need the functionality, check the OOo documentation (I don't know where it is).
doc2txt conversion
I tried "Text", suggested by the rpmforge OO formats list, and many others... it produced a binary file and, if I change the extension to odt, IT IS AN ODT file!
We need to change the MACRO in mso2ooo.odt (to add a new SaveAsTxt sub), but mso2ooo IS NOT EDITABLE.
UPDATE: see the lower
UPDATE: see the lower portion of the post for update. Thanks Adam!
odt ods TO doc xls
Can I do:
source: odt ods
to: doc xls
can this be used to convert
can this be used to convert to pdf ?
thanks.