Characters above ASCII 255

Dar Scott dsc at swcp.com
Sun Sep 10 23:19:29 EDT 2006


On Sep 10, 2006, at 1:08 PM, Adrian Williams wrote:

> I'm having trouble accessing ASCII characters above 255
> in a font that has 1200 characters. I've tried this...
>
>   put numToChar(1378) into field "FontDisplay.txt"
>
> but get two unwanted characters: a hollow square next to a character.
> Should I be doing some kind of conversion or calling for a
> Hexadecimal value or something else?

(Strictly speaking, ASCII characters are encoded with codes 0 through  
127.)

If the useUnicode property is true, then numToChar() generates 16-bit  
values, each stored as two bytes in host byte order.  Normally, such a  
value displays as two characters if put into a field as above.

Normally, characters are interpreted as encoded in Mac Roman on Mac  
platforms and as Latin-1 (or Windows-1252) on others.  These are  
single-byte encodings that are supersets of ASCII, that is, they also  
assign characters to most codes in the range 128-255.  So,  
(normally) those two bytes are interpreted as two characters when put  
into a field.  In the above case, 1378 is 0x0562, so the two bytes are  
0x05 and 0x62.  That will probably display the symbol for "the  
character is not in the font" next to a "b".
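
To check that arithmetic outside of Revolution, here is a minimal
Python sketch.  Python is only my choice for illustration, and the
function name is made up for the example.  It splits 1378 into two
bytes and decodes each byte as one Mac Roman character, assuming
big-endian byte order.

```python
def bytes_as_single_byte_text(code, byteorder="big", encoding="mac_roman"):
    """Split a 16-bit code into two bytes and decode each byte as one character.

    byteorder and encoding are assumptions for this example; pass "little"
    to see what a little-endian host would produce.
    """
    raw = code.to_bytes(2, byteorder)
    return raw, raw.decode(encoding)


raw, text = bytes_as_single_byte_text(1378)
print(raw.hex())                     # the two bytes, in hex
print([hex(ord(c)) for c in text])   # one entry per displayed character
```

The first character is code 5, which most fonts have no glyph for, and
the second is "b".  On a little-endian host the two would come out in
the opposite order.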

So, maybe you have useUnicode set to true and you are on a Mac.   
Also, since you see a box, I'd guess you are using a font that  
doesn't have a character defined for the code 5.

(There is a way to set the language of a font, but I'm not familiar  
with that approach.  Others might help.  Like Phil, I'm not too sure  
about this.)

One way to extend beyond the Mac Roman (or Latin-1) character set is  
to use Unicode encoding.  If the characters, that is, the glyphs, you  
want are specified in Unicode and there is a way to use Unicode  
encoding with the font, then this should work.  If this is a  
specialized font, maybe not.

You can create Unicode text several ways.  Here are two:  One is to  
build each character with numToChar(), for example numToChar(0x0562)  
for Armenian small ben, using the code points you find in the Unicode  
charts (www.unicode.org/charts/).  You can '&' them together.  The  
other is to keep the text you want in fields on cards in a library or  
substack and get the unicodeText property of the field with the data  
you need.  In either of those two cases, you can set the unicodeText  
of a field to the text you compose.
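
As a rough picture of what the composed data looks like, here is a
small Python sketch (again only an illustration, not Revolution
script).  It joins code points into a string and encodes it as 16-bit
values in the machine's byte order, which is what I am assuming
unicodeText expects.  The helper name is hypothetical.

```python
import sys
import unicodedata


def utf16_host_order(code_points):
    """Join code points into a string and encode it as UTF-16, without a
    byte order mark, in this machine's byte order."""
    text = "".join(chr(cp) for cp in code_points)
    codec = "utf-16-be" if sys.byteorder == "big" else "utf-16-le"
    return text, text.encode(codec)


# 0x0561 and 0x0562 are Armenian small letters ayb and ben.
text, data = utf16_host_order([0x0561, 0x0562])
print([unicodedata.name(ch) for ch in text])
print(len(data))  # two bytes per character
```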

A third method is to use htmlText.  See Phil's suggestion.

Sometimes a special font uses private codes to represent special  
characters in Unicode.  The font documentation might provide that  
info, if that is what you need.
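
If "private codes" here means the Unicode Private Use Area (that is my
assumption, the font may do something else), you can check whether a
code from the font documentation falls in it.  A minimal Python
sketch, with a hypothetical helper name:

```python
import unicodedata


def is_private_use(code_point):
    """True if the code point has Unicode general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


print(is_private_use(0xE000))  # start of the Private Use Area
print(is_private_use(0x0562))  # Armenian small ben, an assigned letter
```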

There are some enhancement suggestions in bugzilla to allow checking  
of characters in a font and to allow uniform Unicode-based  
representation of a universal set of characters.

Dar Scott


