why one char in UTF8 (3 bytes) converted to UTF16 becomes 6 bytes?

Kee Nethery kee at kagi.com
Wed Mar 30 12:43:02 EDT 2011


Dave, it appears that you are absolutely correct. The "language" parameter of the uniencode() function names the encoding you have, not the encoding you want the text converted into.
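
For anyone who wants to see where the six bytes come from, here is a rough sketch in Python rather than LiveCode. It assumes (this is a guess, not something the dictionary states) that when the source encoding is not named, the three UTF-8 bytes are read as three separate MacRoman characters, and that the UTF-16 output is little-endian. Under those assumptions it reproduces the byte values reported in the quoted message below.

```python
# Bytes read from the file: the UTF-8 form of U+20E0
# (COMBINING ENCLOSING CIRCLE BACKSLASH).
utf8_bytes = bytes([226, 131, 160])

# Wrong source encoding. ASSUMPTION: the legacy encoding is MacRoman,
# so each of the three bytes is treated as its own character.
wrong = utf8_bytes.decode("mac_roman").encode("utf-16-le")

# Right source encoding: the three bytes form a single character.
right = utf8_bytes.decode("utf-8").encode("utf-16-le")

print(list(wrong))  # [26, 32, 201, 0, 32, 32]
print(list(right))  # [224, 32]
```

Three characters at two bytes each gives six bytes; one character gives two.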

I added a note to the uniencode function in the dictionary to try to make that extra clear. 

In Python, encode() takes Unicode text and encodes it into something else (bytes in a named encoding).
In LiveCode, uniencode takes something and encodes it into Unicode (UTF16).
Two similarly named functions that do exactly opposite things. Both are justifiable function names, depending upon whether you view the world as normally Unicode (Python) or normally not Unicode (LiveCode).
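
A small Python 3 sketch of the two directions, for comparison. The sample string is the one from the original question; the variable names are only illustrative.

```python
text = "Petro\u20e0Max"  # Unicode text (a Python str)

# Python's encode: Unicode -> bytes in the named (destination) encoding.
as_utf8 = text.encode("utf-8")

# Python's decode: bytes in the named (source) encoding -> Unicode.
# This is the direction in which the named encoding is "what you have".
round_trip = as_utf8.decode("utf-8")

print(len(text), len(as_utf8))  # 9 11
print(round_trip == text)       # True
```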

In my ideal programming language world, there would be no uniencode or unidecode; there would just be encode, and you would specify the from and the to. Then there would never be any confusion, although you'd probably need a language format of "any" for when you want the parser to figure it out.

encode <inputText> from <sourceEncoding> to <destinationEncoding>
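
As a rough illustration of that idea, here is a hypothetical Python helper (the name transcode and its parameters are made up for this sketch, not an existing API) that names both ends of the conversion. It does not attempt the "any" case; guessing the source encoding is left out.

```python
def transcode(data: bytes, source: str, destination: str) -> bytes:
    """Hypothetical helper: re-encode bytes from `source` to `destination`."""
    # Decode with the encoding we have, then encode to the one we want.
    return data.decode(source).encode(destination)


# The three UTF-8 bytes from the original question, converted to UTF-16
# (little-endian, no byte order mark).
utf16 = transcode(bytes([226, 131, 160]), "utf-8", "utf-16-le")
print(list(utf16))  # [224, 32]
```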

Thanks,

Kee


On Mar 30, 2011, at 12:05 AM, Dave Cragg wrote:

> 
> On 30 Mar 2011, at 02:30, Kee Nethery wrote:
> 
>> I have the don't sign symbol (Combining enclosing circle backslash) in a text file that I read into livecode. For grins, the character between "Petro" and "Max" seen below.
>> 
>> Petro⃠Max
>> 
>> When I scan the bytes, in UTF8, this is encoded as: 226 131 160 also known as E2 83 A0. This is the correct UTF8 encoding for this character.
>> 
>> When I convert this to UTF16 using
>> 
>> uniencode(theUtf8Text) or uniencode(theUtf8Text,"UTF16") the byte values are: 26 32 201 0 32 32
> 
> 
> Shouldn't that be uniEncode(theUtf8Text, "UTF8") ?
> 
> Dave




-------------------------------------------------
I check email roughly 2 to 3 times per day. 
Kagi main office: +1 (510) 550-1336






