keith.clarke at me.com
Thu Sep 2 07:53:20 EDT 2021
I may be wrong, but I thought the Mac’s ‘Plain Text’ just meant that it’s a ‘text/plain’ MIME type file (.txt), which could be encoded as ASCII, UTF-8, UTF-16 or UTF-32, rather than a ‘text/rtf’ rich text file (.rtf), with embedded markup for styling such as bold and italic.
The ‘<U+FEFF>’ at the start of the document is the Byte Order Mark, suggesting UTF-16 in ‘little-endian’ order - see https://en.wikipedia.org/wiki/Byte_order_mark
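One way to check which encoding the mark actually indicates is to look at the first few bytes of the file, since the same U+FEFF code point is written as different byte sequences in UTF-8, UTF-16 and UTF-32. The following is only a minimal sketch in Python (not LiveCode); the function names sniff_bom and decode_without_bom are made up for illustration, and falling back to UTF-8 when no mark is found is an assumption, not something the file guarantees.

```python
import codecs

# Known BOM byte sequences. The UTF-32 entries come first because the
# UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
# A UTF-16 LE file whose first character is NUL would be misread as
# UTF-32 LE; that case is rare enough to ignore in this sketch.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_without_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes using the BOM if there is one, dropping the mark itself.

    The 'default' encoding is an assumption used only when no BOM is found.
    """
    name, skip = sniff_bom(data)
    return data[skip:].decode(name or default)
```

For example, sniff_bom(open(path, "rb").read(4)) on the downloaded file (path being wherever it was saved) would report which of the five marks, if any, it starts with.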
> On 2 Sep 2021, at 12:12, Alex Tweedly via use-livecode <use-livecode at lists.runrev.com> wrote:
> Sorry to drag us off the interesting topic of licensing :-) into a LiveCode question.
> I know little or nothing about Unicode, text encodings, etc. - so my question is indeed naive.
> I have a text file (War & Peace from Project Gutenberg), about 3.4 MB. The Mac describes it simply as "Plain text".
> When I read that into a variable, and then do
> replace tChar by SPACE in tWholeText
> it takes between 1000 and 4000 millisecs - versus the 8-10 msecs I had expected from other samples.
> If I put in
> put textEncode(tWholeText, "UTF8") into tWholeText
> before the replace then it does indeed take 8-10 msecs.
> Q1. What (if anything) am I losing by doing that ?
> Q2. Is this the best alternative ?
> Additional info - I just discovered that, according to the 'more' command-line tool, the file starts with:
> <U+FEFF>The Project ....
> if that is useful.
> Many thanks,