Unicode Chinese Mac
Dar Scott
dsc at swcp.com
Tue May 10 19:12:30 EDT 2005
On May 10, 2005, at 1:53 PM, Dar Scott wrote:
> You can't use =, is a number, contains, line, item, foundchunk, filter
> (except for a trick), find, +, -, /, *, add, subtract, offset (except
> with extra scripting), and just about anything.
But as was pointed out earlier, you get some gain by using htmlText
instead of unicodeText.
Also, UTF8 will work OK for words (usually), items and lines, but not
chars: every character outside the ASCII range is represented by
multiple bytes. The cool thing is that an ASCII byte can never appear
inside one of those multibyte sequences. All of the syntactically
significant characters for words, items and lines (by default space,
comma and return) are ASCII, so a delimiter can never turn up in the
middle of an encoded character.
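A rough sketch of that property, in Python rather than Revolution script, since the point is about the encoding and not the engine. The function names and the sample string are made up for illustration. It checks that every byte of a multibyte sequence is 0x80 or above, then splits raw UTF8 bytes on a comma without decoding first.

```python
# Sketch only: helper names and sample text are illustrative assumptions.

def multibyte_bytes_are_non_ascii(text):
    """True if no multibyte UTF-8 sequence in text contains an ASCII byte."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True

def split_items(raw, delimiter=b","):
    """Split UTF-8 bytes on an ASCII delimiter, byte by byte, no decoding."""
    return raw.split(delimiter)

sample = "中文,日本語,English"
raw = sample.encode("utf-8")

# Decode each piece only after splitting; no piece is cut mid-character.
items = [piece.decode("utf-8") for piece in split_items(raw)]
print(items)
```

Each piece decodes cleanly because the comma byte (0x2C) cannot occur inside an encoded Chinese or Japanese character.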
You can use (null-free) UTF8 as a key in arrays. You can use it with
'=', offset and 'contains', I think, as long as both strings are valid
UTF8; offset will count bytes, not characters. If caseSensitive affects
only ASCII characters, then it can be set to either true or false.
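To illustrate why byte-wise searching is safe on valid UTF8, here is a small Python sketch (not Revolution script; the strings and variable names are invented for the example). Lead bytes and continuation bytes occupy different ranges, so a valid UTF8 needle cannot match starting in the middle of a character. The offset that comes back is a byte offset, so turning it into a character offset takes an extra step. Note that Python offsets are 0-based.

```python
# Sketch only: sample strings are illustrative assumptions.

haystack = "漢字とかな"
needle = "かな"

h = haystack.encode("utf-8")
n = needle.encode("utf-8")

# Byte-wise containment agrees with character-wise containment.
found_in_bytes = n in h
found_in_text = needle in haystack

# The byte search reports a byte offset (0-based here).
byte_offset = h.find(n)

# Extra step: count the characters that precede the match.
char_offset = len(h[:byte_offset].decode("utf-8"))

print(found_in_bytes, byte_offset, char_offset)
```

The three characters before the match take three bytes each, which is why the byte offset is three times the character offset in this example.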
But since each char is 1 to 4 bytes, the easiest way to get the char
count is to assume BMP only (no surrogates), convert to UTF16 and halve
the byte length.
UTF8 has no byte-order, so it can move among OSes without BOM
consideration.
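For a concrete picture of the byte-order point, a short Python sketch (the sample character is an arbitrary choice): the same character has two different UTF16 byte sequences depending on endianness, and only one UTF8 sequence.

```python
# Sketch only: the sample character is an arbitrary illustration.
import codecs

ch = "中"  # U+4E2D

utf16_le = ch.encode("utf-16-le")  # little-endian, no BOM
utf16_be = ch.encode("utf-16-be")  # big-endian, no BOM
utf8 = ch.encode("utf-8")          # one form on every platform

# Python's plain "utf-16" codec writes a BOM so a reader can tell the order.
with_bom = ch.encode("utf-16")

print(utf16_le.hex(), utf16_be.hex(), utf8.hex())
print(with_bom.startswith(codecs.BOM_UTF16))
```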
So, for some types of processing, using UTF8 might be better than host
UTF16.
Dar
--
**********************************************
DSC (Dar Scott Consulting & Dar's Lab)
http://www.swcp.com/dsc/
Programming and software
**********************************************