2015 Archives by date, by thread · List index


Gary Collins wrote:
I've just tried this in firefox. A couple of things:

1) access to the file save option seems to be indirect - maybe there is a way to customise 
this, but I haven't got time to check at the moment; and

2) much more importantly - it does indeed save as a single file, but the formatting is *awful*, 
nothing like the original page, and pictures, graphics, etc. are not there - just links. That's not 
what I want to see when I open a webpage file; I want to see the page (more or less) the same as 
it was originally when I opened it online.

I think Firefox is similar, but in SeaMonkey (which shares its Mozilla code base with Firefox) under File > Save As there are a couple of options:

- "Web Page, complete" saves the referenced files (images, stylesheets, etc.) in a folder alongside the HTML file and changes the references in the HTML file to point to the saved copies. It still sometimes misses some: I guess if the references are generated dynamically by a script, it can't reliably predict what might be needed.

- "Web Page, HTML only" just saves the HTML file without all the other resources. In that case, I think the references are left as they were originally, so you'll see them if you have Internet access and they're still available at the same URLs (or your browser has cached them). If not, you won't get the images, stylesheets, etc.

Which raises another issue: I save pages for later viewing on a machine that doesn't have an 
internet connection. I'm assuming that an internet connection will be essential to follow any of 
the hyperlinks in the file, which would render the format useless for my purposes.

You might want to look at wget:
  https://www.gnu.org/software/wget/
It's a command-line utility which can not only download web pages along with referenced resources, just as the "complete" option above does, but also recursively follow links - so you would be able to follow them offline. I believe it mirrors the structure of resources from the server, so for example if the same image is used on every page you only download it once.

I guess it would still suffer the same limitations with dynamically generated references.
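
As a rough starting point, something like the following might do for mirroring a site for offline reading. This is only a sketch: the function name mirror_site and the DRY_RUN variable are made up for illustration, and the URL in the usage line is a placeholder. The wget options themselves are standard GNU wget long options, but check the manual for your version before relying on them.

```shell
#!/bin/sh
# Sketch only: mirror a site for offline reading with GNU wget.
# mirror_site and DRY_RUN are hypothetical names used for illustration.
mirror_site() {
    url=$1
    # --mirror            follow links recursively and keep timestamps
    # --convert-links     rewrite links in saved pages to point at the local copies
    # --page-requisites   also fetch the images, stylesheets, etc. each page needs
    # --adjust-extension  save HTML pages with a .html suffix so they open locally
    # --no-parent         never climb above the starting directory
    set -- wget --mirror --convert-links --page-requisites \
        --adjust-extension --no-parent "$url"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"
    fi
}
```

Calling `mirror_site https://example.org/docs/` would download into a folder named after the host, which can then be copied to the offline machine and opened in the browser. Setting DRY_RUN=1 first just prints the wget command so you can see what would be run.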

Mark.

