working with unicodeFormattedText
dsc at swcp.com
Mon Jun 10 19:09:27 EDT 2013
You don't have to think in UTF-16; maybe the "why" is distracting. The lines I gave you should work.
To put a UTF-8 string into the field...
set the unicodeText of field "Unicode Text" to uniEncode( UTF8String, "UTF8" )
To get a UTF-8 string from the field...
put uniDecode( the unicodeText of field "Unicode Text", "UTF8" ) into UTF8String
Or, the last one encapsulated in a function...
function utf8FromField s
   return uniDecode( the unicodeText of field s, "UTF8" )
end utf8FromField
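For the other direction, and for getting native text into UTF-8 on the way to a database, something like the following might work. This is only a sketch: the handler names setFieldFromUTF8 and nativeToUTF8 are made up for illustration, and it assumes the uniEncode/uniDecode behavior of the current version, where uniEncode with no encoding argument treats its input as native single-byte text.

```livecode
-- Hypothetical companion to utf8FromField: put a UTF-8 string into a field.
command setFieldFromUTF8 pFieldName, pUTF8String
   -- uniEncode converts the UTF-8 bytes to UTF-16 in native byte order,
   -- which is the form the unicodeText property expects.
   set the unicodeText of field pFieldName to uniEncode( pUTF8String, "UTF8" )
end setFieldFromUTF8

-- Hypothetical helper: convert a native-encoded string to UTF-8.
function nativeToUTF8 pNativeString
   -- Assumption: with no second argument, uniEncode reads its input as
   -- native text and returns UTF-16, which uniDecode then turns into UTF-8.
   return uniDecode( uniEncode( pNativeString ), "UTF8" )
end nativeToUTF8
```

Used as, say, setFieldFromUTF8 "Unicode Text", UTF8String when loading a value, and nativeToUTF8( someNativeString ) just before writing a native string out. Text that comes from a field through utf8FromField is already UTF-8 and should not be passed through nativeToUTF8 again.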
On Jun 10, 2013, at 4:02 PM, Dr. Hawkins wrote:
> On Mon, Jun 10, 2013 at 1:52 PM, Dar Scott <dsc at swcp.com> wrote:
>> I neglected to explain why.
>> The short "why" is that what you get from unicodeText is UTF-16 (16-bit characters, mostly)
>> in native byte order, that is, the order the computer likes. Those same characters can be
>> represented in UTF-8, which is nice for text that is mostly ASCII, is robust concerning
>> byte-order issues, is efficient in memory needs (but not compressed) and yet can represent
>> all of Unicode. LiveCode strings (in the current version) are really just byte sequences we
>> interpret as characters. Each Unicode character we rip out of a field is two bytes.
> UTF-16 opens an entire new can of worms . . .
> I want to stay at utf8, and even have a very, very limited use for
> that instead of plain ascii. Curly quotes are nice, and I need things
> like ñ for names, and that's it.
> Turning things from native to UTF8 on the way to the db will solve
> what I need--but I'm not quite clear how to do this (all my
> machinations so far have failed), and I'm not clear whether I need to
> watch somehow for non native (say, pasted from a webpage), or across a
> VM from another operating system.
> Dr. Richard E. Hawkins, Esq.
> (702) 508-8462