Double-Value Unicode?

Richmond richmondmathewson at gmail.com
Sat May 17 05:43:11 EDT 2014


On 17/05/14 05:19, Dar Scott wrote:
> The good news is that LiveCode 7 will not have this problem, but it still might sneak in if one is not careful.
>
> With LiveCode 7, the codePoint chunk will apply even to U+1F4DE (a new character, I guess, for a phone?), that is, 128222.  However, if you look at the implementation level, the codeUnit, you will see the two numbers you describe.  These are surrogates, a way for a 16-bit representation to include characters above U+FFFF, that is, outside the BMP.
>
> So to represent those high codes, two 16-bit codes are used.  These are assigned but never occur in LiveCode 7 code points, only in the implementation code units.  So, if you’re working with characters or code points, you are good.  With 7.
>
> For now…
>
> There is a way to calculate the full code from the surrogates.
>
> Take the least significant 10 bits of the first number, add 0x40 and then shift left 10 bits.  Add that to the least significant 10 bits of the second number.  Stand on one foot and...
>
> It is probably someplace online in a clearer form.
>
> Dar
>
>
>

This place is very useful:  http://www.unicode.org/charts/

as you can download the relevant Unicode charts for the character set(s)
you are working with for reference purposes.

and here:

http://www.russellcottrell.com/greek/utilities/SurrogatePairCalculator.htm

is a way to switch back and forth between a full code point and its
surrogate pair.
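
For anyone who would rather see the arithmetic than use a web page, here is
a minimal sketch of the calculation Dar describes, written in Python rather
than LiveCode. The function names are made up for illustration, and the
range checks are an addition that is not part of Dar's description.

```python
def surrogates_to_code_point(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    # Low 10 bits of the first number, plus 0x40, shifted left 10 bits,
    # plus the low 10 bits of the second number.
    return (((high & 0x3FF) + 0x40) << 10) + (low & 0x3FF)


def code_point_to_surrogates(cp):
    """Split a code point above U+FFFF into its (high, low) surrogates."""
    if not (0x10000 <= cp <= 0x10FFFF):
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# U+1F4DE is encoded in UTF-16 as the pair 0xD83D, 0xDCDE.
print(surrogates_to_code_point(0xD83D, 0xDCDE))  # 128222
print(code_point_to_surrogates(0x1F4DE))         # (55357, 56542)
```

The same steps should carry over to LiveCode 6 using bitAnd and
multiplication by 1024 in place of the shift, though I have not tested that
here.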

Richmond.



