

Hi Joe,

I'm also running Win 7 64 bit, and my LO (4.1.1.2) executable sits under

c:\Program Files (x86)\LibreOffice 4\program\soffice.exe

Running that with the "--help" parameter opens a dialog window listing a
bunch of options: too many for the window, which has no scroll bar, so
I can't see them all. The text even overwrites the close button, which
cannot actually be clicked. That looks like a bug.

Luckily, at the bottom, just before the text gets cut off, I can see
the following. As I can't copy any of the text, I have typed it out by
hand, so it may contain a spelling error or two and is badly formatted:

--convert-to output_file_extension[:output_filter_name] [--outdir output_dir] files
    Batch convert files.
    If --outdir is not specified then current working dir is used as
    output_dir.
    Eg. --convert-to pdf *.doc
    --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc

This seems to be what you need. You should be able to put all the files
in one directory and run LO with the parameter "--convert-to pdf
*.htm", possibly giving another directory as outdir, and possibly with
the --headless parameter.
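
A minimal sketch of that suggestion, in Python since Joe mentions some
experience with it. It expands the *.htm pattern itself rather than
relying on cmd.exe or soffice to do so, and it asks for html rather than
pdf because that is the format Joe is after. The function names and the
two folder paths are made up for illustration; the soffice path is the
install location quoted above. Whether a filter name has to be added
after "html" for these particular files is not something this sketch
settles.

```python
import glob
import os
import subprocess

# Install location quoted earlier in this message; adjust for your own setup.
SOFFICE = r"c:\Program Files (x86)\LibreOffice 4\program\soffice.exe"


def build_convert_command(files, out_format="html", outdir=None, soffice=SOFFICE):
    """Return the argument list for one batch conversion run."""
    cmd = [soffice, "--headless", "--convert-to", out_format]
    if outdir is not None:
        cmd += ["--outdir", outdir]
    return cmd + list(files)


def convert_folder(src_dir, outdir, out_format="html"):
    """Convert every .htm file in src_dir in a single soffice run."""
    files = sorted(glob.glob(os.path.join(src_dir, "*.htm")))
    if not files:
        return []  # nothing to do, so soffice is never started
    os.makedirs(outdir, exist_ok=True)
    subprocess.run(build_convert_command(files, out_format, outdir), check=True)
    return files


if __name__ == "__main__":
    # Hypothetical folders, for illustration only.
    convert_folder(r"C:\site\old", r"C:\site\converted")
```

Writing to a separate output folder leaves the original files untouched.
With hundreds of files the single command line can get very long, so
splitting the list into smaller groups may be necessary.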

Hope this helps.

Paul




On Thu, 10 Apr 2014 15:56:32 -0500
Joe B <paperbag76@gmail.com> wrote:

Hello Tom,

Thanks for being so nice and offering your help.

Yes, I am running Windows 7 64 bit.

My cmd.exe does not recognize "lowriter", but it does recognize
"soffice". When I typed "soffice --help", a window popped up called
"Help Message".  It lists the LibreOffice version I am using.  Then it
says: "Usage: soffice [options] [documents...]"
Then there is a list of option flags, but the list is quite
short, only a page.  There's nothing in it about batch processing.
I really didn't glean anything from it.

I followed the link you gave me to
https://wiki.documentfoundation.org/Documentation/Other_Documentation_and_Resources#Programmers
and...well, it's pretty overwhelming.  I then followed the link for
"Andrew Pitonyak's macro page", hoping that he might have some
pre-baked code to run as a macro in LibreOffice to batch convert files
between different formats.  I downloaded his "Useful Macro Information"
file, but it's 518 pages!  I did look through the contents to try to
find something about batch processing or file format changing, but
there is nothing.

So...I'm stumped on what to do next.

Secondly, I should say more about the files I'm starting with.  I
believe they were written in Microsoft Word from...whatever version
existed in the 90s.  Then at some later point, the files were somehow
converted to this crazy Microsoft XML format, but saved with an
".htm" file extension.  The files are full of bizarre Microsoft
Server-specific instructions that just totally break the webpage.
I'm using Apache as my server, not Microsoft Server, and Apache can't
understand all those weird Microsoft Server-specific commands like:
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
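
For reference, the three strings above are XML namespace declarations
that Word writes into its HTML output, so they make a handy marker for
spotting which files in a folder still carry Word markup. The following
is a minimal sketch under that assumption; the function names are
invented for illustration, and the byte search assumes the files are in
a single-byte encoding or UTF-8 (it would miss UTF-16 files).

```python
from pathlib import Path

# The namespace URNs quoted in the message above.
MARKERS = (
    b"urn:schemas-microsoft-com:vml",
    b"urn:schemas-microsoft-com:office:office",
    b"urn:schemas-microsoft-com:office:word",
)


def has_word_markup(data):
    """Return True if the raw bytes contain any of the Office namespace URNs."""
    return any(marker in data for marker in MARKERS)


def files_needing_conversion(folder):
    """Return the names of the .htm files in folder that contain Word markup."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.htm")
        if has_word_markup(path.read_bytes())
    )
```

Reading the files as bytes avoids having to guess each file's text
encoding before looking inside it.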

I understand that file attachments are not allowed.  If it would help
to get a look at the source code for one of these files, just so you
have some inkling of what I'm talking about, I can try to post it
somewhere for you to take a gander at.

Thanks,
Joe
paperbag76@gmail.com






On Thu, Apr 10, 2014 at 8:28 AM, Tom Davies <tomcecf@gmail.com> wrote:

Hi :)
Ahh, just spotted the give-away ".exe" so it sounds like you are
using Windows.  It is still worth trying the "--help" tag to see if
you do get a quick-help cheat-sheet.

Let us know either way! :)
Regards from
Tom :)


On 10 April 2014 14:26, Tom Davies <tomcecf@gmail.com> wrote:

Hi :)
We call it "headless mode".  Errr, which OS are you using?  Is it a
Windows or a Gnu&Linux or Mac?

Headless mode can be scripted and there might even be a thread in
the archives that shows a decent script worth copying.  I think
the better way is to try using LibreOffice on the command-line and
get it doing more and more until you've figured it out.  For
example, does
soffice
or
lowriter
work from the command-line?  On my Gnu&Linux both work but some
OSes might be limited to using just 1 of those.  Then try, for
example,
lowriter --help
to get a quick cheat-sheet of options.
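
The same check can be sketched in Python, which only looks at whether
either name is found on the PATH and does not start LibreOffice. The
function name and the injectable "which" parameter are choices made for
this illustration, not part of any LibreOffice tooling.

```python
import shutil


def find_launcher(candidates=("soffice", "lowriter"), which=shutil.which):
    """Return (name, full_path) for the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return name, path
    return None


if __name__ == "__main__":
    print(find_launcher())
```

If this prints None, neither name is on the PATH and the full path to
the executable has to be given instead.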

Hopefully people on this list can help but there might also be
documentation at

https://wiki.documentfoundation.org/Documentation/Other_Documentation_and_Resources#Programmers
or scroll up a bit to see what is in the "Corporate Users" section
of the page.


Attachments don't get to the mailing-list anyway!  You can use
Nabble to upload them to a central place so that people can choose
to look if they want.


I would try to keep the original documents in MS format so that if
there is any problem with some tiny subset of all the ones being
converted then you can focus on those and do them with a bit more
finesse.  However from Doc, Xls etc to Odt, Ods etc should work
reasonably well.

It's the DocX, XlsX etc that is a bit more unpredictable thanks to
MS's constant changing of that format (currently on at least 3
different "transitional" versions and at least 1 "strict", none of
which seem to fully comply with their ISO promise).  Even with
those I think a batch-process using a scripted headless mode is
the best plan and then deal with individual oddities later.

Regards from
Tom :)



On 10 April 2014 13:30, Joe B <paperbag76@gmail.com> wrote:

Hello all,

This is my first post.

I am working on migrating a website.  I am trying to convert many
files written in an old version of MS Word, which were then saved
as old Microsoft 2002/2003 XML files.  The files were saved using
an .htm extension.  The files are filled with Microsoft xml crud.
(I will just refer to them as .htm files for the rest of this
e-mail)

I found a simple solution, in simply opening the file in
LibreOffice Writer, and re-saving the file in HTML Document
(Writer) (.html) format. Now the files work great.

I don't want to do this one file at a time, obviously, as there are
hundreds of these .htm files.  I am trying to figure out a way to do
this for multiple files in a folder...I think the term is "batch
processing".

In other words, have a script that will:
1. iterate through each .htm file in a folder
2. open the file in LibreOffice Writer
3. save the .htm file in HTML Document (Writer)(.html) format
4. close the file
5. iterate over all the remaining files in the folder until all
   files have had their formats changed
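
The five steps above can be sketched in Python roughly as follows. This
is an illustration rather than a tested recipe: it assumes "soffice" can
be started from the command line with the --headless, --convert-to and
--outdir options, and that the converted file is written to the output
folder under the same base name with an .html extension. The function
name and the injectable "run" parameter are made up for this example.

```python
import subprocess
from pathlib import Path


def convert_each(src_dir, outdir, soffice="soffice", run=subprocess.run):
    """Convert each .htm file on its own; return the names that produced no output."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    failed = []
    # Steps 1 and 5: visit every .htm file in the folder, in name order.
    for path in sorted(Path(src_dir).glob("*.htm")):
        # Steps 2 to 4: one headless run opens, converts and closes the file.
        run([soffice, "--headless", "--convert-to", "html",
             "--outdir", str(outdir), str(path)])
        # Assumption: the result is named <same base name>.html in outdir.
        if not (outdir / (path.stem + ".html")).exists():
            failed.append(path.name)
    return failed
```

Converting one file per run is slower than passing all files at once,
but the returned list shows exactly which files need a closer look.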

Is there a way to do this via a command-line script, or by
creating a batch file?

I'm sorry, I'm a bit of a novice when it comes to the command
line or batch files.  I know how to open LibreOffice Writer.exe
from the command line with one argument, which will open that
document, but that's about it.

I have some experience in other scripting languages, like Python,
Perl, etc., but not Windows scripting.  I am having a very
difficult time getting this to work in Python, so I thought I
would come here and try to ask for guidance.

I could attach a copy of one of the .htm files that I am
converting if that would help, but don't want to attach a file
in my very first e-mail.

thank you,
Joe
paperbag76@gmail.com

--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted







