Why does one char in UTF8 (3 bytes) become 6 bytes when converted to UTF16?

Dave Cragg dave.cragg at lacscentre.co.uk
Wed Mar 30 03:05:16 EDT 2011


On 30 Mar 2011, at 02:30, Kee Nethery wrote:

> I have the "don't" sign symbol (Combining Enclosing Circle Backslash, U+20E0) in a text file that I read into LiveCode. For grins, it's the character between "Petro" and "Max" seen below.
> 
> Petro⃠Max
> 
> When I scan the bytes, in UTF8 this is encoded as 226 131 160, also known as E2 83 A0. This is the correct UTF8 encoding for this character.
> 
> When I convert this to UTF16 using
> 
> uniEncode(theUtf8Text) or uniEncode(theUtf8Text,"UTF16"), the byte values are: 26 32 201 0 32 32


Shouldn't that be uniEncode(theUtf8Text, "UTF8")?
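
As far as I know, the second parameter of uniEncode names the encoding of the text you pass IN, not the encoding you want back (the result is always UTF-16). So "UTF16" there doesn't do what you'd expect. A rough, untested sketch of what I think is happening (variable names are just illustrative):

  -- build the three raw UTF-8 bytes E2 83 A0 in a variable
  put numToChar(226) & numToChar(131) & numToChar(160) into tUtf8Text

  -- with no source encoding, each byte is treated as a native
  -- (MacRoman on OS X) character, so 3 chars in give 3 UTF-16
  -- code units out = 6 bytes
  put uniEncode(tUtf8Text) into tSixBytes

  -- naming the source encoding should give the single character
  -- U+20E0, i.e. 2 bytes of UTF-16
  put uniEncode(tUtf8Text, "UTF8") into tTwoBytes

The 6 bytes you're seeing fit that: read as MacRoman, E2 / 83 / A0 are U+201A, U+00C9 and U+2020, which in little-endian UTF-16 is exactly 26 32, 201 0, 32 32.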

Dave
