Unicode

Dar Scott dsc at swcp.com
Tue May 10 15:25:34 EDT 2005


On May 10, 2005, at 12:46 PM, Thomas McGrath III wrote:

> I get from what you are saying that if they were unicode then they 
> won't work with line, item, word.


A two-byte code can contain, in either its upper or lower byte, a 
one-byte character that these chunks treat as a delimiter.

Consider these from the first page of the Unicode CJK Unified 
Ideographs:

U+4E0A: low byte 0x0A is a line end. (above?)
U+4E20: low byte 0x20 is a space.
U+4E09: low byte 0x09 is a tab. (three?)
U+4E2C: low byte 0x2C is a comma.

Multiply that by 82 pages in Unicode CJK Unified Ideographs and all the 
support pages and you have lots of candidates for clashes.
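
A rough way to count the candidates in the main CJK block is to split 
each code into its two bytes and compare them with the delimiter bytes. 
This is only a sketch: the variable names are made up, and the delimiter 
list (tab, line end, space, comma) is assumed from the examples above.

```livecode
on mouseUp
   put 0 into tClashes
   -- CJK Unified Ideographs, U+4E00 through U+9FFF (82 pages of 256)
   repeat with tCode = 0x4E00 to 0x9FFF
      put tCode div 256 into tHigh  -- upper byte of the two-byte code
      put tCode mod 256 into tLow   -- lower byte of the two-byte code
      -- assumed delimiter bytes: tab, line end, space, comma
      repeat for each item tByte in "9,10,32,44"
         if tHigh = tByte or tLow = tByte then
            add 1 to tClashes
            exit repeat  -- count each code only once
         end if
      end repeat
   end repeat
   put tClashes
end mouseUp
```

The upper bytes in this block run from 0x4E to 0x9F, so only the lower 
byte can clash here. That is 4 codes per page, or 328 over the 82 pages, 
before counting any of the support pages.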

Try this:

on mouseUp
   set the useUnicode to true
   get numToChar(0x4E0A)  -- above? (two bytes, one of them 0x0A)
   put the number of lines in (it & it & it & it)
end mouseUp

On OS X, I get 4.

Almost all the CJK pages are filled, so you can't even do something 
clever with special codes.

Dar


-- 
**********************************************
     DSC (Dar Scott Consulting & Dar's Lab)
     http://www.swcp.com/dsc/
     Programming and software
**********************************************


