Text encoding: summary of results and times.

Bob Sneidar bobsneidar at iotecdigital.com
Tue Sep 7 12:22:20 EDT 2021


This makes sense to me (I think) because, if I am not mistaken, UTF16 is Unicode, and UTF8 is simple ASCII. The slowdown from 6.7 to 7.0 was precisely the support for Unicode text. Someone will correct me if I am wrong about this. As a hobbyist, I try to stay away from localization issues. But I am interested in the idea that all incoming text should be text decoded and all outgoing text should get the inverse, i.e. be text encoded. (Did I get that right??)
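
A quick way to check the UTF8/UTF16 guess above, sketched in Python rather than LiveCode (an assumption made only so the snippet can be run anywhere). Both UTF-8 and UTF-16 are encodings of Unicode; UTF-8 coincides with ASCII only for text that contains nothing but ASCII characters. The helper names read_text and write_text are hypothetical stand-ins for the "decode on the way in, encode on the way out" pattern, playing the role that textDecode() and textEncode() play in this thread.

```python
# Illustration only: hypothetical helpers, not LiveCode's implementation.

def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Incoming bytes -> text (the role textDecode() plays in the thread)."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    """Text -> outgoing bytes (the role textEncode() plays in the thread)."""
    return text.encode(encoding)


sample = "naïve café"          # 10 characters, 2 of them outside ASCII

utf8_bytes = write_text(sample, "utf-8")
utf16_bytes = write_text(sample, "utf-16-le")   # little-endian, no BOM

# Same Unicode text, two different byte layouts.
print(len(sample), len(utf8_bytes), len(utf16_bytes))

# For pure-ASCII text, the UTF-8 bytes are identical to the ASCII bytes.
ascii_only = "hello"
print(write_text(ascii_only, "utf-8") == ascii_only.encode("ascii"))

# Decoding what was encoded gives back the original text.
print(read_text(utf8_bytes) == sample)
```

This only shows what the two encodings are and that decode/encode are inverses; it says nothing about why the encoded string ran faster in Alex's test.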

Bob S


> On Sep 3, 2021, at 17:29 , Alex Tweedly via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
> I went back and re-did the tests, checking on the results.
> 
> The file *is* UTF8, so I need to textDecode() it; if I don't, the results are simply wrong, and so the times are irrelevant.
> 
> 1. Once it has been textDecode()d, i.e. is in internal format, my algorithm gets the correct results, taking 115.1 seconds.
> 
> 2. BUT, if I do a textEncode(tStr, "UTF8") just before the algorithm is run, it gets the correct results (identical to the above), but in only 3.3 seconds.
> 
> The code, in a zip file containing the test stack, SpellCheck Library, and the 'bible' and "war&peace" sample textfiles, can be downloaded from
> 
>     https://www.tweedly.org/Downloads/SpellLib.gz
> 
> if anyone wants to look at it.
> 
> Alex.
> 
> 
> 
> On 03/09/2021 13:38, Alex Tweedly via use-livecode wrote:
>> 
>> On 03/09/2021 11:07, David V Glasgow via use-livecode wrote:
>> 
>>> Alex states that put textEncode(tWholeText, "UTF8") into tWholeText speeds replace up, but David B says LC internal format is UTF16. Doesn’t the 8 vs 16 difference matter? Or does it matter less than other encodings?
>> 
>> I would regard that timing comparison with much suspicion. I was textEncoding() it inappropriately - I had just read it in from a file, so I *should* have been textDecoding() it. Therefore it is unclear whether the times I was seeing then are meaningful.
>> 
>> Alex.
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode


