Unicode and the higher planes of existence.

Richmond richmondmathewson at gmail.com
Sat Dec 15 08:10:27 CST 2012


On 12/13/2012 04:29 AM, Peter W A Wood wrote:
> According to the User Guide, LiveCode employs UTF-16 encoding:
>
> "LiveCode fields and other controls use the UTF-16 encoding for Unicode. In order to use Unicode in a field or in the labels of controls, paste in Unicode text, or set the textFont of the control to ",unicode"."
>
> As I understand it, characters beyond the Basic Multilingual Plane occupy two "UTF-16 characters" (a surrogate pair of 16-bit code units).
>
> This example is taken from the Mac Character Viewer:
> 𐅐
> GREEK ACROPHONIC ATTIC TEN STATERS
> Unicode: U+10150 (U+D800 U+DD50), UTF-8: F0 90 85 90
>
> This worked for me (using LiveCode 5.5.3):
>
> on mouseUp
>     set the useUnicode to true
>     set the unicodeText of fld "f1" to numToChar(55296) & numToChar(56656)
> end mouseUp
>
> Hope this helps.
>
> Peter
>

I'm sorry I took so long to get back to this; lots of other stuff on
my plate!

Maybe this is a bit naive, but how does one derive the two hex numbers
(the surrogate pair) for a character beyond the Basic Multilingual
Plane if one knows its Unicode code point?
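
For reference, here is a minimal sketch of the standard UTF-16 arithmetic,
written in Python rather than LiveCode purely for illustration (the
function name to_surrogate_pair is made up for this example). Subtract
hex 10000 from the code point, then add the top 10 bits of the result to
hex D800 and the bottom 10 bits to hex DC00:

```python
def to_surrogate_pair(code_point):
    """Return the (high, low) UTF-16 surrogates for a code point above U+FFFF."""
    # Only code points beyond the Basic Multilingual Plane need a pair.
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in the range U+10000..U+10FFFF")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


# The example from Peter's message: U+10150
high, low = to_surrogate_pair(0x10150)
print(f"{high:04X} {low:04X}")   # D800 DD50
print(high, low)                 # 55296 56656
```

The two decimal values are the ones passed to numToChar in Peter's script
above. The same arithmetic should carry over to LiveCode using div 1024 and
mod 1024 in place of the shift and mask, though that is an assumption and
not something tested here.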

>
> On 13 Dec 2012, at 04:53, Richmond wrote:
>
>> So there I am wondering about the higher planes (the svarga-lokas . . . LOL), so I try
>> this:
>>
>> I make a stack with 2 buttons (called 'button 1' and 'button 2' respectively)
>> and 2 fields (called 'f1' and 'f2' respectively) and put the following code into button 1:
>>
>> on mouseUp
>>   set the useUnicode to true
>>   set the unicodeText of fld "f1" to numToChar(65940)
>> end mouseUp
>>
>> and when I click on the button I get nothing like what should appear (a capital X with a bar through its middle),
>> but something that resembles a badly deformed Hebrew 'Ain'.
>>
>> in button 2 I put the following script:
>>
>> on mouseUp
>>   set the useUnicode to true
>>   put charToNum(fld"f1") into fld "f2"
>> end mouseUp
>>
>> and get "404", which is, indeed, a sort of funny 'Ain'.
>>
>> This would seem to suggest 2 things:
>>
>> 1. RR/LC cannot cope with Unicode addresses above the first plane.
>>
>> 2. RR/LC copes with this by truncating the addresses (65940 is hex 10194; dropping the leading 1 leaves hex 0194, which is 404).
>>
>> Does anybody know of a work around for this?
>>
>> [this does not affect my work directly, but does interest me both in terms of future work, and
>> relating to the capabilities of RR/LC in general]
>>
>> Richmond.
>>