

Hello Tom,

Thanks for being so nice and offering your help.

Yes, I am running Windows 7 64 bit.

My cmd.exe does not recognize "lowriter", but it does recognize "soffice".
When I typed "soffice --help", a window popped up called "Help Message".  It
lists the LibreOffice version I am using.  Then it says:
"Usage: soffice [options] [documents...]"
Then there is a list of option flags.  But the list is quite short...only
a page.  There's nothing in it about batch processing.  I really didn't
glean anything from it.

I followed the link you gave me to
https://wiki.documentfoundation.org/Documentation/Other_Documentation_and_Resources#Programmers
and...well it's pretty overwhelming.  I then followed the link for "Andrew
Pitonyak's macro page", hoping that he might have some pre-baked code to
run as a macro in LibreOffice to batch convert files between different
formats.  I downloaded his "Useful Macro Information" file, but it's 518
pages!  I did look through the contents to try to find something about
batch processing or file format changing, but there is nothing.

So...I'm stumped on what to do next.

Secondly, I should say more about the files I'm starting with.  I believe
they were written in Microsoft Word from...whatever version existed in the
90s.  Then at some later point, the files were somehow converted to this
crazy Microsoft XML format, but saved with an ".htm" file extension.  The
files are full of bizarre Microsoft Server-specific instructions that just
totally break the webpage.  I'm using Apache as my server, not Microsoft
Server, and Apache can't understand all those weird Microsoft
Server-specific commands like:
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
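
For what it is worth, the three xmlns: lines above are XML namespace declarations that Word writes into the opening <html> tag when it saves a document as a web page. A minimal Python sketch for checking which of them a given file declares, assuming the file can be read as text; the name office_namespaces is made up for illustration:

```python
import re

# Matches declarations such as xmlns:o="urn:schemas-microsoft-com:office:office"
OFFICE_NS = re.compile(r'xmlns:(\w+)="(urn:schemas-microsoft-com:[^"]+)"')


def office_namespaces(text):
    """Return a {prefix: namespace URI} dict of Microsoft Office namespaces in text."""
    return dict(OFFICE_NS.findall(text))
```

Running this over each .htm file in the folder would show which ones still carry the Word markup and which have already been cleaned up.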

I understand that file attachments are not allowed.  If it would help to
get a look at the source code for one of these files, just so you have some
inkling of what I'm talking about, I can try to post it somewhere for you
to take a gander at.

Thanks,
Joe
paperbag76@gmail.com






On Thu, Apr 10, 2014 at 8:28 AM, Tom Davies <tomcecf@gmail.com> wrote:

Hi :)
Ahh, just spotted the give-away ".exe" so it sounds like you are using
Windows.  It is still worth trying the "--help" tag to see if you do get a
quick-help cheat-sheet.

Let us know either way! :)
Regards from
Tom :)


On 10 April 2014 14:26, Tom Davies <tomcecf@gmail.com> wrote:

Hi :)
We call it "headless mode".  Errr, which OS are you using?  Is it a
Windows or a Gnu&Linux or Mac?

Headless mode can be scripted and there might even be a thread in the
archives that shows a decent script worth copying.  I think the better way
is to try using LibreOffice on the command-line and get it doing more and
more until you've figured it out.  For example does
soffice
or
lowriter
work from the command-line?  On my Gnu&Linux both work but some OSes
might be limited to using just 1 of those.  Then try, for example
lowriter --help
to get a quick cheat-sheet of options.
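
If it is easier to check from a script than by typing each name, the following Python sketch reports which of the two launcher names can be found on the PATH. It only uses the standard library; the function name available_launchers is made up for illustration.

```python
import shutil


def available_launchers(names=("soffice", "lowriter")):
    """Map each launcher name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in available_launchers().items():
        print(name, "->", path or "not found")
```

A launcher reported as "not found" may still be installed; it may simply live in a folder that is not on the PATH.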

Hopefully people on this list can help but there might also be
documentation at

https://wiki.documentfoundation.org/Documentation/Other_Documentation_and_Resources#Programmers
or scroll up a bit to see what is in the "Corporate Users" section of the
page.


Attachments don't get to the mailing-list anyway!  You can use Nabble to
upload them to a central place so that people can choose to look if they
want.


I would try to keep the original documents in MS format so that if there
is any problem with some tiny subset of all the ones being converted then
you can focus on those and do them with a bit more finesse.  However from
Doc, Xls etc to Odt, Ods etc should work reasonably well.

It's the DocX, XlsX etc that is a bit more unpredictable thanks to MS's
constant changing of that format (currently on at least 3 different
"transitional" versions and at least 1 "strict", none of which seem to
fully comply with their ISO promise).  Even with those I think a
batch-process using a scripted headless mode is the best plan, and then deal
with individual oddities later.

Regards from
Tom :)



On 10 April 2014 13:30, Joe B <paperbag76@gmail.com> wrote:

Hello all,

This is my first post.

I am working on migrating a website.  I am trying to convert many files
written in an old version of MS Word, which were then saved as old
Microsoft 2002/2003 XML files.  The files were saved using an .htm
extension.  The files are filled with Microsoft xml crud. (I will just
refer to them as .htm files for the rest of this e-mail)

I found a simple solution, in simply opening the file in LibreOffice
Writer, and re-saving the file in HTML Document (Writer) (.html) format.
Now the files work great.

I don't want to do this one file at a time obviously, as there are
hundreds of these .htm files.  I am trying to figure out a way to do this
for multiple files in a folder...I think the term is "batch processing".

In other words, have a script that will:
1. iterate through each .htm file in a folder
2. open the file in LibreOffice Writer
3. save the .htm file in HTML Document (Writer)(.html) format
4. close the file
5. iterate over all the remaining files in the folder until all files
   have had their formats changed
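
The steps above can be sketched in Python roughly as follows. This is a sketch under assumptions, not a tested recipe: it relies on soffice accepting the --headless, --convert-to and --outdir options, and it assumes that the plain "html" target selects the same filter as "HTML Document (Writer)" in the Save dialog, which is worth checking on a single file first. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def find_htm_files(folder):
    """Return the .htm files directly inside folder, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() == ".htm")


def build_command(source, outdir, soffice="soffice"):
    """Build the argument list for one headless conversion to HTML."""
    return [soffice, "--headless", "--convert-to", "html",
            "--outdir", str(outdir), str(source)]


def convert_folder(folder, outdir, soffice="soffice"):
    """Convert every .htm file in folder; return the files whose run failed."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    failed = []
    for source in find_htm_files(folder):
        # One soffice run per file, so a problem file does not stop the batch.
        result = subprocess.run(build_command(source, outdir, soffice))
        if result.returncode != 0:
            failed.append(source)
    return failed
```

Calling convert_folder with the source folder and a separate output folder leaves the original files untouched, so any oddities can be redone by hand afterwards. If cmd.exe cannot find soffice, the full path to soffice.exe can be passed as the soffice argument. A zero return code may not guarantee a good conversion, so spot-checking the output is still sensible.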

Is there a way to do this via a command line script, or by creating a
batch file?

I'm sorry, I'm a bit of a novice when it comes to the command line or
batch files.  I know how to open LibreOffice Writer.exe from the command
line with one argument, which will open that document, but that's about it.

I have some experience in other scripting languages, like Python, Perl,
etc, but not Windows scripting.  I am having a very difficult time
getting this to work in Python, so I thought I would come here and try to
ask for guidance.

I could attach a copy of one of the .htm files that I am converting if
that would help, but don't want to attach a file in my very first e-mail.

thank you,
Joe
paperbag76@gmail.com

--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems?
http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be
deleted





