Wanted: idiot's guide to using Unicode in Rev
Ben Rubinstein
benr_mc at cogapp.com
Thu Dec 2 18:55:58 EST 2004
I've just fiddled (and part of my problem is that I don't know any non-roman
languages, so I have very limited ability to tell whether I'm doing
something right).
I have an app in which users edit text in a field; simple controls let them
apply simple styling (bold/italic/both) and some more perverse markup; a
popup stack menu assists in entering interesting non-roman
characters.
To date these have all been characters accommodated in the 8-bit ISO 8859-1
character set - now I need to handle some which come from extended code
blocks, for example o-macron, that is a latin small letter o with a
horizontal bar above.
I set the unicodeText of a field to the hex bytes 59 01 4D, which should be
"Y", followed by o-macron. I got some interesting Japanese characters in
the field. What was going on there? Should I have set the font to
something first, to avoid misinterpretation - but surely the whole point of
saying that it's Unicode is that I'm specifying code points, there's no
interpretation needed.
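
For anyone following along, here is a rough sketch in Python (not Rev) of how
those bytes relate to the code points. It assumes the field treats its
unicodeText as 16-bit units (UTF-16), which is my guess and not something
confirmed above. Under that assumption "Y" needs two bytes of its own, and
the leading 59 01 read as one big-endian unit is a code point in the CJK
ideograph block, which might account for the unexpected characters. The
variable names are only illustrative.

```python
# "Y" followed by o-macron (U+014D), written with an escape to avoid
# depending on the encoding of this file.
text = "Y\u014d"

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

# Each character takes one 16-bit unit in UTF-16; the byte order differs
# between big-endian and little-endian machines.
be_hex = hex_bytes(text.encode("utf-16-be"))
le_hex = hex_bytes(text.encode("utf-16-le"))
print("big-endian:   ", be_hex)
print("little-endian:", le_hex)

# Assumption: the first two of the three bytes 59 01 4D are read together
# as a single big-endian 16-bit unit.
misread = bytes.fromhex("5901").decode("utf-16-be")
misread_codepoint = f"U+{ord(misread):04X}"

# The CJK Unified Ideographs block spans U+4E00 to U+9FFF.
is_cjk = 0x4E00 <= ord(misread) <= 0x9FFF
print(misread_codepoint, "in CJK block:", is_cjk)
```

If that guess is right, the bytes to try would be the four-byte sequence for
the platform's byte order rather than the three bytes above.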
And if I do have to set a font, how do I know which one to specify? The
documentation talks about "Unicode fonts", and mentions "Osaka,Japanese" as
an example. How can I find out which Unicode fonts are available - on my
system, on my user's system? And if I found them, would it tell me which
language they relate to? And in any case, how should I know which language
o-macron relates to?
If I construct an HTML file to include the text
Yō
or
Yō
and open this in a browser, I see in each case a capital Y followed by an
o-macron. If I set the htmlText of a Rev field to the first of these, it is
reflected exactly, i.e. "Yō". If I set the htmlText of the field to
the second, it appears as "YM".
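
As a point of comparison, here is a small Python sketch of how decimal and
hexadecimal numeric character references for o-macron decode. The two
reference strings are assumptions chosen for illustration, not necessarily
the exact forms used in the HTML file. The last line shows that keeping only
the low 8 bits of U+014D gives "M", which is one possible (unconfirmed)
explanation for the "YM" result.

```python
import html

# Assumed forms: decimal and hexadecimal references for U+014D.
decimal_ref = "Y&#333;"
hex_ref = "Y&#x14D;"

# html.unescape resolves numeric character references to characters.
decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)
print(decoded_decimal == decoded_hex == "Y\u014d")

# Truncating the code point to 8 bits: 0x014D & 0xFF is 0x4D.
low_byte_char = chr(0x014D & 0xFF)
print(low_byte_char)
```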
If I copy the text from the browser ("Y<o-macron>"), and paste it into the
field, it appears correctly. If I then get the htmlText of the field, it
is:
<p>Y<font face="Arial" lang="el">ō</font></p>
or sometimes I've got
<p>Y<font face="Lucida Grande" lang="el">ō</font></p>
Again, can some kind soul help me understand the relation of fonts to using
Unicode? If I want to set a character in a field to one outside the first
256 Unicode code points, do I need to do so in the context of a font?
Another specific question: can I set the label of a button to a Unicode
string, or more generally to anything which requires more than the simple
roman character set?
TIA,
Ben Rubinstein | Email: benr_mc at cogapp.com
Cognitive Applications Ltd | Phone: +44 (0)1273-821600
http://www.cogapp.com | Fax : +44 (0)1273-728866