the mouseText and Unicode: a 3-char puzzle
Slava Paperno
slava at lexiconbridge.com
Tue Jun 21 02:40:43 EDT 2011
Following Tariel's report, here is a puzzle:
Make a text entry field and set its font to Arial,Unicode.
Put these three characters in the field and lock it:
«—»
The first one is decimal 171, the last one is decimal 187; they are called
Double Angle Quotation Marks.
The one in the middle is called Em-Dash, decimal 8212.
Give the field this mouseDown script:
on mouseDown
   put "FIELD"
   repeat with i = 1 to length(the unicodeText of field "TextToClick")
      put cr & byteToNum(byte i of the unicodeText of field "TextToClick") after msg
   end repeat
   put the unicodeText of field "TextToClick" into locEntireText --this is UTF16
   put cr & "VAR UTF-16" after msg
   repeat with i = 1 to length(locEntireText)
      put cr & byteToNum(byte i of locEntireText) after msg
   end repeat
   put uniDecode(locEntireText, "UTF8") into locEntireText --this is UTF8
   put cr & "VAR UTF-8" after msg
   repeat with i = 1 to length(locEntireText)
      put cr & byteToNum(byte i of locEntireText) after msg
   end repeat
end mouseDown
When I click the field in LC 4.6.1 on my Windows 7 machine, I get this
display in the Message box:
FIELD
171
0
20
32
187
0
VAR UTF-16
171
0
20
32
187
0
VAR UTF-8
194
171
226
128
148
194
187
The FIELD and the VAR UTF-16 reports are entirely predictable, but the VAR
UTF-8 list is puzzling to me. I expected six bytes, not seven.
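One way to cross-check the two reports outside LiveCode is to encode the same three code points with an independent tool. The sketch below uses Python only as a checker; the names CODE_POINTS and encode_all are made up for illustration and are not part of the script above. It assumes that the unicodeText is little-endian UTF-16, which is what the FIELD report suggests on this Windows machine. Under standard UTF-8 rules, code points from 128 to 2047 take two bytes and code points from 2048 to 65535 take three, so 171 and 187 would take two bytes each and 8212 would take three.

```python
# Cross-check sketch (Python used only as an independent checker).
# CODE_POINTS and encode_all are illustrative names, not LiveCode API.

# Decimal code points quoted in the post: left quote, em dash, right quote.
CODE_POINTS = [171, 8212, 187]


def encode_all(code_points, encoding):
    """Return the encoded bytes of every code point as one flat list of ints."""
    result = []
    for cp in code_points:
        result.extend(chr(cp).encode(encoding))
    return result


# Assumption: the field's unicodeText is little-endian UTF-16.
utf16_report = encode_all(CODE_POINTS, "utf-16-le")
utf8_report = encode_all(CODE_POINTS, "utf-8")

# Bytes used by each character in UTF-8.
utf8_lengths = [len(chr(cp).encode("utf-8")) for cp in CODE_POINTS]

print("UTF-16:", utf16_report)
print("UTF-8: ", utf8_report)
print("UTF-8 bytes per character:", utf8_lengths)
```

If the two lists match the Message box output, the seven bytes would simply be 2 + 3 + 2, and the mismatch with the six UTF-16 bytes would come from the em dash alone.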
There is a practical reason for trying to solve this puzzle: these three
characters throw off the byte count that I used in the workaround for the
"clickedUnicodeText" problem that was discussed under this Subject line the
other day. I feel obliged to restore order in this chaotic universe, or fall
asleep trying.
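The workaround itself is not shown in this message, so the following is only a guess at where a byte count could drift: if it assumes a fixed number of bytes per character, the starting offset of each character will differ between the two encodings. The Python sketch below (byte_offsets and SAMPLE are hypothetical names used for illustration) lists the zero-based starting byte of each of the three characters in little-endian UTF-16 and in UTF-8.

```python
# Sketch of per-character byte offsets in two encodings.
# byte_offsets and SAMPLE are hypothetical names used for illustration only.

# The three characters from the post: U+00AB, U+2014, U+00BB.
SAMPLE = "\u00ab\u2014\u00bb"


def byte_offsets(text, encoding):
    """Return (zero-based start offset of each character, total byte length)."""
    offsets = []
    position = 0
    for ch in text:
        offsets.append(position)
        # Advance by however many bytes this character needs in the encoding.
        position += len(ch.encode(encoding))
    return offsets, position


utf16_offsets, utf16_total = byte_offsets(SAMPLE, "utf-16-le")
utf8_offsets, utf8_total = byte_offsets(SAMPLE, "utf-8")

print("UTF-16 offsets:", utf16_offsets, "total:", utf16_total)
print("UTF-8 offsets: ", utf8_offsets, "total:", utf8_total)
```

In this sample the third character starts at byte 4 in UTF-16 but at byte 5 in UTF-8, so any count that divides a UTF-8 byte position by a constant would land one byte off after the em dash.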
Thanks, Tariel, and thank you all for reading this,
Slava
> -----Original Message-----
> From: Tariel Gogoberidze [mailto:tariel at me.com]
> Sent: Monday, June 20, 2011 11:58 AM
> To: slava at lexiconbridge.com
> Subject: Re: the mouseText and Unicode: CONCLUSION
>
>
> Hi Slava,
>
> Tried your script (nice job), but with text I copied from some Russian
> website it breaks on the word "dikanky" and all words after that.
> Try the attached stack; you will see which char breaks further word
> selection, and removing this char allows correct selection again.
>
> regards
> Tariel