Double-Value Unicode?

Dar Scott dsc at
Sun May 18 04:15:23 EDT 2014

Perhaps it would be useful to know that such a pair represents a single character when the first number is in the range 0xD800-0xDBFF and the second number is in the range 0xDC00-0xDFFF.  The pair represents a character with a code in the range 0x00010000-0x0010FFFF.
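
As a rough illustration of the range check and the arithmetic behind it, here is a minimal sketch in Python (chosen only because the arithmetic is language-neutral; it is not LiveCode and not the function from the original post). The name combine_surrogates is made up for this example, and the sample pair 55357, 56542 is simply what the formula yields for the 128222 mentioned in the quoted question.

```python
def combine_surrogates(high, low):
    """Combine a high/low surrogate pair into a single character code."""
    # The first number must be a high surrogate, the second a low surrogate.
    if not (0xD800 <= high <= 0xDBFF):
        raise ValueError("first value is not in 0xD800-0xDBFF")
    if not (0xDC00 <= low <= 0xDFFF):
        raise ValueError("second value is not in 0xDC00-0xDFFF")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


print(combine_surrogates(55357, 56542))  # 128222
```

The smallest and largest pairs map to the two ends of the range given above: (0xD800, 0xDC00) gives 0x10000 and (0xDBFF, 0xDFFF) gives 0x10FFFF.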

Conversely, only convert a code to two surrogates if it is in the range 0x00010000-0x0010FFFF.  

There are no characters with codes above 0x0010FFFF.  No pair is needed for characters with codes 0x0000-0xFFFF.  
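
Going the other way can be sketched like this, again in Python and only as an illustration under the ranges stated above. The function name split_code is hypothetical. It returns one number for codes up to 0xFFFF and a surrogate pair for codes in 0x10000-0x10FFFF, and it refuses anything larger.

```python
def split_code(code):
    """Return the list of 16-bit values for one character code."""
    if 0x10000 <= code <= 0x10FFFF:
        # Only codes in this range are converted to two surrogates.
        offset = code - 0x10000
        high = 0xD800 + (offset >> 10)    # top 10 bits
        low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
        return [high, low]
    if 0 <= code <= 0xFFFF:
        # No pair is needed here; the code is used as-is.
        return [code]
    raise ValueError("no characters have codes above 0x10FFFF")


print(split_code(128222))  # [55357, 56542]
print(split_code(0x41))    # [65]
```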

From those, you can see that the full Unicode range is 0x000000-0x0010FFFF with a hole at 0x00D800-0x00DFFF for surrogates.  
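
Put as a test, that summary might look like the following Python sketch. The name is_character_code is invented for this example; it only checks that a number lies in the overall range and outside the surrogate hole, not that a character has actually been assigned to it.

```python
def is_character_code(code):
    """True if code is in 0x0-0x10FFFF and not in the surrogate hole."""
    in_range = 0 <= code <= 0x10FFFF
    in_hole = 0xD800 <= code <= 0xDFFF  # reserved for surrogates
    return in_range and not in_hole


print(is_character_code(128222))    # True
print(is_character_code(0xD83D))    # False
print(is_character_code(0x110000))  # False
```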

I see you have created a function to create a code (representing a character) from two surrogates.   I’ll leave that alone.

Dar Scott

On May 16, 2014, at 5:58 PM, Scott Rossi <scott at> wrote:

> Hi All:
> I thought I had figured out the display of Unicode glyphs in LC 6.6, and
> then ran up against a threshold where the value of a character is
> displayed as two values.  I'm guessing this is related to a "double-byte"
> something or other.  How does one retrieve the value of a character as a
> single value?
> For example, I can set the htmlText of a field to 📞 and get the
> correct unicode character to display.  But when retrieving the value for
> the character, I get: ��
> Can I encode/decode/jumpcode/flipcode something here to get a single value
> representation of the character?
> (Also, I'm unable to set the value by setting the unicodeText to
> numToChar(128222) -- only using the htmlText property seems to work).
> Thanks for helping out this uniclueless dude.
> Regards,
> Scott Rossi
> Creative Director
> Tactile Media, UX/UI Design
