

On 10/03/2017 05:26 PM, Krunose wrote:

> Writer shouldn't show different word count for odt file and different
> for docx file. Same for plain text and HTML.

This comes down to when presentation markup, structural markup, and
syntactic markup are counted as words that are explicit content, and
when they are ignored as background noise.

Think of it this way.
* Is white space background noise, or significant content?

If the algorithm treats white space as background noise, it will produce
a different count than if it treats white space as significant content.
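
To make that concrete, here is a minimal sketch in Python. It is not how Writer counts words; the two function names and the sample string are made up for illustration. The first function treats white space purely as a separator, the second counts every run of white space as a token of its own.

```python
import re

def count_ignoring_whitespace(text: str) -> int:
    # White space is background noise: count only runs of non-white-space.
    return len(re.findall(r"\S+", text))

def count_with_whitespace(text: str) -> int:
    # White space is significant content: each run of white space
    # is counted as a token alongside the words.
    return len(re.findall(r"\S+|\s+", text))

# Hypothetical sample: double space, a tab, and a trailing newline.
sample = "one  two\tthree\n"

print(count_ignoring_whitespace(sample))
print(count_with_whitespace(sample))
```

Same text, two defensible rules, two different totals.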

One other potential issue is legitimate, explicit content being
flagged as background noise because it contains a sub-string that can
be confused with markup.
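
A small sketch of that failure mode, again in Python and again only an illustration: the regular expression below is a deliberately naive tag stripper that I am assuming for the example, not anything taken from LibreOffice. It removes anything between "<" and ">", so ordinary text containing those characters loses words.

```python
import re

def count_after_naive_tag_strip(text: str) -> int:
    # Naive rule (an assumption for this example): anything between
    # "<" and ">" is markup, so replace it with a space before counting.
    stripped = re.sub(r"<[^>]*>", " ", text)
    return len(stripped.split())

def count_plain(text: str) -> int:
    # Treat the whole string as content and split on white space.
    return len(text.split())

# Hypothetical content that merely looks like it contains a tag.
sample = "if a < b and c > d then stop"

print(count_plain(sample))
print(count_after_naive_tag_strip(sample))
```

The stripper swallows "< b and c >" as if it were a tag, so the second count comes out lower even though nothing in the sample is markup.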

jonathon

