persistent Unicode problems

Toma Tasovac ttasovac at Princeton.EDU
Thu May 29 13:43:00 EDT 2003


Even though Cyrillic Unicode support has gotten better (one can now 
import a text file into a field without it being mapped to Japanese -- 
thanks Tuviah!), there are still glaring problems, even in the final 
release.  The clickText function, for instance, returns the word AND 
the punctuation mark after it (it does not return the full stop, but it 
does return the comma, colon, exclamation point, question mark and 
semi-colon).  This is plain wrong, for clickText should return a word, 
i.e. any chunk delimited by spaces, tabs, returns or punctuation.  I'm 
disappointed to see that this has not been fixed in the final release.  
Also, the text doesn't wrap properly: words get broken in half at the 
end of a line...
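
To make the expected clickText behaviour concrete, here is a small sketch of the kind of clean-up a script would have to do by hand in the meantime. It is written in Python purely for illustration (it is not Revolution script), and the function name strip_trailing_punctuation is made up for this example.

```python
import unicodedata


def strip_trailing_punctuation(clicked: str) -> str:
    """Drop any punctuation characters stuck to the end of a clicked word.

    Hypothetical helper for illustration only; not part of Revolution.
    """
    end = len(clicked)
    # Unicode general categories starting with "P" are punctuation
    # (comma, colon, exclamation point, question mark, semi-colon, ...).
    while end > 0 and unicodedata.category(clicked[end - 1]).startswith("P"):
        end -= 1
    return clicked[:end]


print(strip_trailing_punctuation("зима,"))
```

The same idea should carry over to a Revolution handler that trims the last character of the clicked text while it is a punctuation mark, though that has not been tried here.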

Before I write again to the Rev guys about this, though, I would like 
to get some advice from the list.  It concerns Unicode-encoded text in 
customPropertySets.  Perhaps those dealing with other languages can 
help me figure out if I'm doing something wrong or if it's a Revolution 
bug.

I am doing a simple thing: reading a UTF-8 encoded file into a variable, 
turning the variable into an array by splitting it with cr and tab, 
then setting the custom property set of the stack to the array.  All 
quite straightforward.
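
For readers who want to see the data flow spelled out, the steps above amount to roughly the following. This is a language-neutral sketch in Python, not the Revolution script itself; the function name parse_translations and the sample words are assumptions made up for the example, and the file is assumed to hold one "word<TAB>translation" pair per line.

```python
from typing import Dict


def parse_translations(raw: bytes) -> Dict[str, str]:
    """Turn UTF-8 bytes of 'word<TAB>translation' lines into a lookup table."""
    text = raw.decode("utf-8")  # decode once, up front
    translations: Dict[str, str] = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        word, translation = line.split("\t", 1)
        translations[word] = translation
    return translations


# Made-up sample data standing in for the contents of the file.
sample = "зима\twinter\nлето\tsummer\n".encode("utf-8")
table = parse_translations(sample)
print(table["зима"])
```

The point of the sketch is only that the keys of the resulting table are the first column of the file and the values are the second, which is what splitting by cr and tab is meant to produce.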

However:

1) when I look up the custom properties -- in my case, cTranslations -- 
in the inspector, I get garbled text consisting mostly of empty squares 
and numbers.

2) when I try to use the custom properties with the following code:

put the clickText into tClickedWord
-- the text which is being clicked on in a field was imported from a
-- utf8 encoded file
put the cTranslations[tClickedWord] of this stack into fld "translationDisplay"

nothing happens.

3) when I try to use the message box with the following code:

put the cTranslations[зима] of this stack

(I don't know if this will come out right on everybody's email clients, 
but the bracketed word is in Cyrillic) -- after I hit return, the 
Cyrillic word gets turned into squares and numbers, and I get the 
message:

Script compile error:
Error description: Commands: missing ','
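
A possible explanation for points 2) and 3), stated here as an assumption rather than a confirmed diagnosis: if the array keys are kept as the raw UTF-8 bytes from the file while the clicked or typed word arrives in some other encoding, the lookup cannot match. How Revolution actually stores these strings internally is not known here. The Python sketch below (again, not Revolution script) only demonstrates the general effect with a made-up one-entry table.

```python
word = "зима"

utf8_key = word.encode("utf-8")       # how the word is stored in a UTF-8 file
utf16_key = word.encode("utf-16-le")  # the same word in a two-byte encoding

# A table keyed by raw UTF-8 bytes, as if no decoding had taken place.
table = {utf8_key: "winter"}

found_with_utf8 = utf8_key in table
found_with_utf16 = utf16_key in table
# Converting the lookup key to the table's encoding makes it match again.
found_after_conversion = utf16_key.decode("utf-16-le").encode("utf-8") in table

print(found_with_utf8, found_with_utf16, found_after_conversion)
```

If something like this is going on, converting the clicked word to the same encoding as the keys before the lookup would be the thing to try.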

So: what's happening with Unicode in custom properties?  Why is this 
not working?  Has anybody made it work with other languages?  I'd be 
grateful for any tips...

All best,
Toma

________________________________
Toma Tasovac
Princeton University
Department of Comparative Literature
91 Prospect Avenue
Princeton, NJ 08544
USA

ttasovac at princeton.edu
ttasovac at post.harvard.edu


