the mouseText and Unicode

Slava Paperno slava at
Sat Jun 18 12:41:25 EDT 2011

Thanks, Bernd! Using the html is something I didn't try, but otherwise your results are exactly the same as mine: The Russian Robert (Роберт) is the fourth word, yet clicking its first letter reports it as word 3.

Yes, Robert's Chinese ancestors are the culprit here, of course :)

The Chinese characters are displayed whenever you get the bytes wrong, e.g. when you try to display char 10 to char 11 while the actual double-byte character occupies char 11 to 12. When all the text in a double-byte field is non-Roman, it's easy to avoid that mistake (always start with an odd number), but when some characters are Roman (like that first comma in my example), the mouseCharChunk fails to account for the null byte in front of it and reports the next character (the space) incorrectly. That's my theory at this point... it may be wrong.
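
To make the byte-alignment theory concrete, here is a minimal sketch in Python (not LiveCode), assuming the field stores its text as little-endian UTF-16. The helper names `shifted_decode` and `is_cjk` are made up for this illustration. It only shows the encoding arithmetic: reading Роберт one byte off its real boundaries pairs the high byte of one Cyrillic letter with the low byte of the next, and most of the resulting code points land in the CJK range. It says nothing about what the mouseCharChunk actually does internally.

```python
def shifted_decode(text, byte_shift, encoding="utf-16-le"):
    """Decode `text` after skipping `byte_shift` bytes of its UTF-16 form."""
    raw = text.encode(encoding)[byte_shift:]
    raw = raw[: len(raw) - len(raw) % 2]  # drop a trailing half code unit
    return raw.decode(encoding, errors="replace")


def is_cjk(ch):
    """True for CJK Unified Ideographs and CJK Extension A code points."""
    cp = ord(ch)
    return 0x3400 <= cp <= 0x4DBF or 0x4E00 <= cp <= 0x9FFF


word = "Роберт"
aligned = shifted_decode(word, 0)   # correct boundaries
garbled = shifted_decode(word, 1)   # one byte off

print(aligned == word)                          # True
print([f"U+{ord(c):04X}" for c in garbled])     # U+3E04, U+3104, U+3504, U+4004, U+4204
print(sum(is_cjk(c) for c in garbled), "of", len(garbled), "are CJK")  # 4 of 5 are CJK
```

With an even shift the word decodes unchanged; any odd shift produces the "Chinese ancestors".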

The exasperating thing, for me, is that getting word N of a string is not a problem, and neither is locating the position of a word (once you know its characters). It's identifying the word-position of the mouse-click that is screwed up.
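
For comparison, here is a sketch of the arithmetic one would expect the click-to-word mapping to follow. It is written in Python rather than LiveCode and rests on assumptions: the text is little-endian UTF-16 with two bytes per character (no surrogate pairs), the click is reported as a zero-based byte offset, and words are separated by whitespace. The function name `word_number_at_byte` and the sample sentence are hypothetical and are not the example from this thread.

```python
def word_number_at_byte(text, byte_offset, encoding="utf-16-le"):
    """Return the 1-based number of the word containing the clicked byte."""
    raw = text.encode(encoding)
    # Round down to the start of the 2-byte character, so an offset that
    # points at the second byte of a character still maps to that character.
    char_index = byte_offset // 2
    # Decode everything up to and including the clicked character.
    prefix = raw[: (char_index + 1) * 2].decode(encoding)
    return len(prefix.split())


sample = "Hello, my friend Роберт is here"
offset = len("Hello, my friend ".encode("utf-16-le"))  # byte offset of "Р"

print(word_number_at_byte(sample, offset))      # 4
print(word_number_at_byte(sample, offset + 1))  # 4, even when one byte off
```

Rounding the byte offset down to a character boundary before counting words is what keeps the Roman comma from shifting the result.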

Thanks again... If we ever find a sure-fire way to do this, I'll post the solution here.

Enjoy your weekend,


> -----Original Message-----
> From: use-livecode-bounces at [mailto:use-livecode-
> bounces at] On Behalf Of BNig
> Sent: Saturday, June 18, 2011 11:53 AM
> To: use-revolution at
> Subject: RE: the mouseText and Unicode
> Hi Slava,
> I tried your example of mixed unicode and ASCII words. Using the word
> technique and the html I did this:
> -----------------------------
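> on mouseUp
>    -- the clickCharChunk looks like "char N to N of field X"; word 2 is N
>    get word 2 of the clickCharChunk
>    -- count the words up to and including the clicked character
>    put the number of words in char 1 to it of me into tWordNum
>    put word 1 to (the number of words in char 1 to it of me) of me into tWords
>    -- show the clicked word's html as text (field 3) and rendered (field 2)
>    put the htmlText of word tWordNum of me into tWordClicked
>    put tWordClicked into field 3
>    set the htmlText of field 2 to tWordClicked
> end mouseUp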
> -------------------------------
> There are 3 fields. The first contains the Unicode/ASCII mix from your example; I
> pasted it into the field from the browser, and it looked good.
> The second field is where the clicked word goes; the third field is where the html
> of the clicked word goes.
> It worked quite well except for Роберт; he must have Chinese ancestors :)
> All other words came over as they should. The html of the word even makes it
> easy to parse out the comma.
> Maybe this is a lead to your problem? Or I am way off; I don't know anything
> about Unicode.
> Kind regards
> Bernd
