working with unicodeFormattedText

Dr. Hawkins dochawk at gmail.com
Mon Jun 10 18:02:40 EDT 2013


On Mon, Jun 10, 2013 at 1:52 PM, Dar Scott <dsc at swcp.com> wrote:
> I neglected to explain why.
>
> The short "why" is that what you get from unicodeText is UTF-16 (16-bit characters, mostly)
> in native byte order, that is, the order the computer likes.  Those same characters can be
> represented in UTF-8, which is nice for text that is mostly ASCII, is robust concerning
> byte-order issues, is efficient in memory needs (but not compressed) and yet can represent
> all of Unicode.  LiveCode strings (in the current version) are really just byte sequences we
> interpret as characters.  Each Unicode character we rip out of a field is two bytes.
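
To make the size difference described above concrete, here is a minimal sketch in Python (not LiveCode; the helper name encoded_sizes is made up for illustration). It only compares how many bytes the same text takes in native-order UTF-16 and in UTF-8.

```python
import sys

def encoded_sizes(text):
    """Return (utf16_byte_count, utf8_byte_count) for text.

    UTF-16 is encoded in the machine's native byte order, without a BOM.
    """
    native_utf16 = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return len(text.encode(native_utf16)), len(text.encode("utf-8"))

# Mostly-ASCII text: every character costs two bytes in UTF-16,
# while in UTF-8 only the ñ needs more than one byte.
print(encoded_sizes("Muñoz"))

# The raw UTF-8 bytes for ñ and for a left curly quote (U+201C).
print("ñ".encode("utf-8"))
print("\u201c".encode("utf-8"))
```

For text that is mostly ASCII with the occasional ñ or curly quote, UTF-8 ends up smaller, and the byte-order question never comes up.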

UTF-16 opens an entire new can of worms . . .

I want to stay with UTF-8, and even for that I have only a very, very
limited need beyond plain ASCII.  Curly quotes are nice, and I need
things like ñ for names, and that's it.

Converting text from native to UTF-8 on the way to the db will solve
what I need, but I'm not quite clear how to do this (all my
machinations so far have failed).  I'm also not clear whether I need
to watch somehow for non-native text (say, pasted from a webpage, or
coming across a VM from another operating system).
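
As an illustration of the native-to-UTF-8 round trip, here is a rough sketch in Python rather than LiveCode. It assumes "native" means MacRoman on a Mac and Windows-1252 on Windows; the function names native_to_utf8 and utf8_to_native are hypothetical, and LiveCode's own conversion functions are not shown.

```python
def native_to_utf8(raw, native_encoding):
    """Decode bytes from a platform's 8-bit native encoding, re-encode as UTF-8."""
    return raw.decode(native_encoding).encode("utf-8")

def utf8_to_native(raw, native_encoding):
    """Convert UTF-8 bytes back to the native encoding.

    Characters with no native equivalent are replaced with '?'.
    """
    return raw.decode("utf-8").encode(native_encoding, errors="replace")

# The same name as stored natively on two platforms: ñ is byte 0x96 in
# MacRoman but 0xF1 in Windows-1252 (both encodings assumed here).
mac_bytes = b"Mu\x96oz"
win_bytes = b"Mu\xf1oz"

# Once converted, both platforms produce identical bytes for the db.
print(native_to_utf8(mac_bytes, "mac_roman"))
print(native_to_utf8(win_bytes, "cp1252"))

# Text pasted from a webpage may contain characters the native encoding
# cannot hold; going back to native loses them.
print(utf8_to_native("\u65e5".encode("utf-8"), "cp1252"))
```

The point of the sketch is that the native bytes differ by platform, so the conversion has to know which native encoding it is starting from, and that anything outside the native set cannot survive a trip back through it.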
-- 
Dr. Richard E. Hawkins, Esq.
(702) 508-8462
