Why does one char in UTF-8 (3 bytes) become 6 bytes when converted to UTF-16?

Jan Schenkel janschenkel at yahoo.com
Wed Mar 30 14:16:57 EDT 2011


--- On Wed, 3/30/11, Kee Nethery <kee at kagi.com> wrote:
> Dave it appears that you are
> absolutely correct. The "language" in the uniencode()
> function is what you have, not what you want it converted
> into.
> 
> I added a note to the uniencode function in the dictionary
> to try to make that extra clear. 
> 
> In Python, uniencode takes unicode and encodes it into
> something else. 
> In LiveCode, uniencode takes something and encodes it into
> unicode (UTF16).
> Two functions with the same name that do the exact
> opposite. Both uses are justifiable function names depending
> upon whether you view the world as normally unicode (Python)
> or normally not unicode (LiveCode).
> 
> In my ideal programming language world, there would be no
> uniencode or unidecode, there would just be encode and you
> would specify the from and the to. Then there would never be
> any confusion although you'd probably need a language format
> of "any" when you want the parser to figure it out.
> 
> encode <inputText> from <sourceEncoding> to
> <destinationEncoding>
> 
> Thanks,
> 
> Kee
> 
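
For what it's worth, the single "encode from/to" call Kee describes is easy to sketch. Here it is in Python rather than LiveCode, purely as an illustration; `transcode` is a made-up name, not an existing function in either language. It also seems to show where the 6 bytes in the subject line come from: if the wrong source encoding is named, each of the 3 UTF-8 bytes is counted as a character of its own.

```python
def transcode(data: bytes, source: str, destination: str) -> bytes:
    """Hypothetical helper: re-encode bytes from one encoding to another."""
    return data.decode(source).encode(destination)

euro_utf8 = "\u20ac".encode("utf-8")   # the euro sign, 3 bytes in UTF-8

# Correct source encoding: one character, one 2-byte UTF-16 code unit.
# "utf-16-le" is used so that no byte order mark is prepended.
right = transcode(euro_utf8, "utf-8", "utf-16-le")

# Wrong source encoding: each UTF-8 byte is treated as a separate
# single-byte character, so 3 bytes turn into 3 code units (6 bytes).
wrong = transcode(euro_utf8, "latin-1", "utf-16-le")

print(len(euro_utf8), len(right), len(wrong))   # prints: 3 2 6
```

With both encodings spelled out in the call, there is at least no doubt about which direction the conversion runs.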

Ideally, all the conversion would take place at the end-points, with syntax along these lines:
open file <theFilePath> for text read with encoding <theEncoding>
open file <theFilePath> for text write with encoding <theEncoding>
put <theVariable> into URL <theUrl> with encoding <theEncoding>
put URL <theUrl> into <theVariable> with encoding <theEncoding>

Internally, the engine would handle everything in UTF-16, or whatever is most appropriate and efficient. Reading and writing data to and from files, databases, etc. in another encoding should be as transparent as possible.
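
To make the end-point idea a bit more concrete, here is a rough sketch in Python, whose file API already works roughly this way. The file names and the sample text are made up for the example. The point is only that the encoding is named once at each open, and nothing in between has to care about it.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input-utf8.txt")    # hypothetical file names
dst_path = os.path.join(workdir, "output-utf16.txt")

# Set up a sample UTF-8 file to read from.
with open(src_path, "w", encoding="utf-8") as f:
    f.write("price: \u20ac5")

# End-point 1: decode while reading. After this, text is an ordinary
# string and its original encoding no longer matters.
with open(src_path, "r", encoding="utf-8") as f:
    text = f.read()

# End-point 2: encode while writing, here to UTF-16 little-endian.
with open(dst_path, "w", encoding="utf-16-le") as f:
    f.write(text.upper())

# The same 9 characters take 11 bytes as UTF-8 and 18 bytes as UTF-16.
print(os.path.getsize(src_path), os.path.getsize(dst_path))
```

The code between the two end-points (here just `text.upper()`) never mentions an encoding at all, which is the kind of transparency I have in mind.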

Just my 2 cents,

Jan Schenkel.
=====
Quartam Reports & PDF Library for LiveCode
www.quartam.com

=====
"As we grow older, we grow both wiser and more foolish at the same time."  (La Rochefoucauld)
