why one char in UTF8 (3 bytes) converted to UTF16 becomes 6 bytes?
Jan Schenkel
janschenkel at yahoo.com
Wed Mar 30 14:16:57 EDT 2011
--- On Wed, 3/30/11, Kee Nethery <kee at kagi.com> wrote:
> Dave it appears that you are
> absolutely correct. The "language" in the uniencode()
> function is what you have, not what you want it converted
> into.
>
> I added a note to the uniencode function in the dictionary
> to try to make that extra clear.
>
> In Python, uniencode takes unicode and encodes it into
> something else.
> In LiveCode, uniencode takes something and encodes it into
> unicode (UTF16).
> Two functions with the same name that do the exact
> opposite. Both uses are justifiable function names depending
> upon whether you view the world as normally unicode (Python)
> or normally not unicode (LiveCode).
>
> In my ideal programming language world, there would be no
> uniencode or unidecode, there would just be encode and you
> would specify the from and the to. Then there would never be
> any confusion although you'd probably need a language format
> of "any" when you want the parser to figure it out.
>
> encode <inputText> from <sourceEncoding> to
> <destinationEncoding>
>
> Thanks,
>
> Kee
>
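For what it's worth, the "encode from/to" idea quoted above can be sketched in Python as a decode followed by an encode. This is only an illustration: the helper name transcode is made up, and latin-1 stands in here for "some single-byte native encoding", which is an assumption on my part. It also shows one likely reading of the subject line: if the three UTF-8 bytes of a character are declared as three single-byte characters, each one becomes its own 2-byte UTF-16 code unit, hence 6 bytes.

```python
def transcode(data: bytes, source: str, destination: str) -> bytes:
    """Hypothetical helper: decode from `source`, then encode to `destination`."""
    return data.decode(source).encode(destination)


# The euro sign U+20AC takes 3 bytes in UTF-8 (E2 82 AC).
euro_utf8 = "\u20ac".encode("utf-8")

# Declaring the source correctly yields one UTF-16 code unit (2 bytes).
right = transcode(euro_utf8, "utf-8", "utf-16-le")

# Declaring the source as a single-byte encoding (latin-1 is an assumption)
# treats each byte as its own character: 3 code units, so 6 bytes.
wrong = transcode(euro_utf8, "latin-1", "utf-16-le")

print(len(euro_utf8), len(right), len(wrong))
```

The only difference between the two calls is the declared source encoding, which is exactly the parameter that was being misread.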
Ideally, all the conversion would take place at the end-points:

  open file <theFilePath> for text read with encoding <theEncoding>
  open file <theFilePath> for text write with encoding <theEncoding>
  put <theVariable> into URL <theUrl> with encoding <theEncoding>
  put URL <theUrl> into <theVariable> with encoding <theEncoding>

Internally, the engine would handle everything in UTF-16, or whatever is most appropriate and efficient; but reading and writing data to and from files, databases, etc. in another encoding should be as transparent as possible.
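As a point of comparison, Python 3 already works roughly this way: text is encoding-agnostic in memory, and the encoding is named only when a file is opened. A minimal sketch, where the sample text and the temporary file name are placeholders of my own:

```python
import os
import tempfile

# Sample text; in memory it is just a sequence of characters.
text = "Gr\u00fc\u00dfe \u20ac"

# Placeholder path in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Conversion happens at the end-point: the encoding is named on open.
with open(path, "w", encoding="utf-16") as f:
    f.write(text)

# Reading with the same encoding gives back the original characters.
with open(path, "r", encoding="utf-16") as f:
    round_trip = f.read()

# The raw bytes on disk are UTF-16: a 2-byte BOM plus 2 bytes per character.
with open(path, "rb") as f:
    raw = f.read()

print(round_trip == text, len(raw))
```

Nothing between the two open calls needs to know which encoding the file uses, which is the kind of transparency I have in mind.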
Just my 2 cents,
Jan Schenkel.
=====
Quartam Reports & PDF Library for LiveCode
www.quartam.com
=====
"As we grow older, we grow both wiser and more foolish at the same time." (La Rochefoucauld)