Text encoding.

David Bovill david.bovill at gmail.com
Thu Sep 2 08:01:13 EDT 2021


Thanks for the question, Alex. I’m wrestling with the same issues, but so far I’ve got no responses from the encoding gurus here :)

This is my understanding:

1) Yes, it’s recommended to convert text that comes from outside into LiveCode’s internal format (which is utf16). The function for that direction is textDecode (bytes in, text out); textEncode is its inverse. LiveCode handles everything internally “transparently” from then on - which I guess means all the usual language and control operations expect this utf16 internal format. My guess is this is why a few things have got slower compared with early versions of LiveCode.
2) Without that conversion the engine tries to guess the encoding (duck-typing?) and does this in a platform-specific way? Again, exactly what is going on there is a bit opaque to me, but the take-home message is that this is slower and less robust. So yes - you lose nothing (assuming the original file is utf8), and yes, it’s the best alternative.
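
As a rough sketch of the conversion step: read the file as raw bytes, then use textDecode to turn them into text. The handler name readUTF8File and the parameters pPath and pChar are made up for illustration, and the sketch assumes the file really is UTF-8.

```livecode
-- Hypothetical helper: read a file as raw bytes and decode it as UTF-8.
-- Assumes the file really is UTF-8; bytes in any other encoding will decode to the wrong characters.
function readUTF8File pPath
   local tBytes
   -- "binfile:" reads the bytes as they are, with no line-ending or encoding conversion
   put URL ("binfile:" & pPath) into tBytes
   -- textDecode converts bytes in the named encoding into LiveCode's internal text
   return textDecode(tBytes, "UTF-8")
end function

-- Hypothetical usage: decode once, then work on the text as normal.
command replaceCharInFile pPath, pChar
   local tWholeText
   put readUTF8File(pPath) into tWholeText
   replace pChar with space in tWholeText
   -- textEncode is the reverse step, needed only when writing the text back out as bytes
   put textEncode(tWholeText, "UTF-8") into URL ("binfile:" & pPath)
end replaceCharInFile
```

Note that replaceCharInFile overwrites the original file, so try it on a copy first.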

I think the hard thing to find out is exactly what encoding a given file uses - it would be great if there was a duck-typing service where we could paste text or upload files and it would say - hey, this looks like utf8 - but that’s asking too much.
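
A partial version of that check can be done locally by looking for a byte order mark, such as the <U+FEFF> reported at the start of the file in the quoted message below. This is only a sketch: the handler name guessEncodingFromBOM is made up, and a file with no mark at all can still be UTF-8, so an empty result proves nothing.

```livecode
-- Hypothetical helper: guess an encoding from the byte order mark, if there is one.
-- Returns empty when no mark is found, which says nothing either way about the encoding.
function guessEncodingFromBOM pPath
   local tBytes, tB1, tB2, tB3
   -- Read raw bytes so the engine does no conversion of its own
   put URL ("binfile:" & pPath) into tBytes
   if the number of bytes in tBytes < 2 then return empty

   put byteToNum(byte 1 of tBytes) into tB1
   put byteToNum(byte 2 of tBytes) into tB2

   -- EF BB BF is U+FEFF encoded as UTF-8
   if the number of bytes in tBytes >= 3 then
      put byteToNum(byte 3 of tBytes) into tB3
      if tB1 is 239 and tB2 is 187 and tB3 is 191 then return "UTF-8"
   end if

   -- FE FF is the UTF-16 big-endian mark
   if tB1 is 254 and tB2 is 255 then return "UTF-16BE"
   -- FF FE is the UTF-16 little-endian mark (a UTF-32LE mark also starts this way; not handled here)
   if tB1 is 255 and tB2 is 254 then return "UTF-16LE"

   return empty
end function
```

The returned name could then be passed as the second argument to textDecode. When the result is empty, the encoding still has to come from somewhere else, such as knowledge of where the file came from.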

On 2 Sep 2021, 12:12 +0100, Alex Tweedly via use-livecode <use-livecode at lists.runrev.com>, wrote:
> Sorry to drag us off the interesting topic of licensing :-) into a
> LiveCode question.
>
> I know little or nothing about Unicode, text encodings, etc. - so my
> question is indeed naive.
>
> I have a text file (War & Peace from Project Gutenberg), about 3.4Mb.
> The Mac describes it simply as "Plain text".
>
> When I read that into a variable, and then do
>     replace tChar with SPACE in tWholeText
> it takes between 1000 and 4000 millisecs - versus the 8-10 msecs I had
> expected from other samples.
>
> If I put in
>     put textEncode(tWholeText, "UTF8") into tWholeText
> before the replace then it does indeed take 8-10 msecs.
>
> Q1. What (if anything) am I losing by doing that ?
>
> Q2. Is this the best alternative ?
>
> Additional info - I just discovered that, according to the 'more'
> command, the file starts with:
>
> <U+FEFF>The Project ....
>
> if that is useful.
>
> Many thanks,
>
> Alex.


