the mouseText and Unicode: the Russian letter R

Slava Paperno slava at lexiconbridge.com
Sat Jun 18 20:47:52 EDT 2011


Hi Bernd! I hope you are having fun :) This is very entertaining...

So "the htmlText of char i" may CONTAIN "<p> <" ??? Incredible... I'll be playing with this all night.

My Robert is a little different from yours:

set useUnicode to true
--field "Robert" contains one Russian letter, the uppercase Р (decimal 1056), typed into the field

answer the htmlText of field "Robert" --the Russian letter Р is properly displayed in the answer box

answer charToNum(char 1 to 2 of the unicodeText of field "Robert") --1056 as expected

answer byteToNum(byte 1 of the unicodeText of field "Robert") --32 (as you guys reported; confusing for anyone who relies on word delimiters)
answer byteToNum(byte 2 of the unicodeText of field "Robert") --4 (not 9 as I heard you say at the bottom of this email; I am in Windows 7, 64-bit)

put the unicodeText of field "Robert" into tRussianR
answer charToNum(char 1 to 2 of tRussianR) --1056 as it should be

answer byteToNum(byte 1 of tRussianR) --32
answer byteToNum(byte 2 of tRussianR) --4

put uniDecode(the unicodeText of field "Robert", "UTF8") into tRussianR
answer charToNum(char 1 to 2 of tRussianR) --41168 (inexplicable)

answer byteToNum(byte 1 of tRussianR) --208 (hm...)
answer byteToNum(byte 2 of tRussianR) --160 (hm...)
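
The numbers above are consistent with plain encoding arithmetic. The following is a minimal sketch in Python rather than LiveCode, and it rests on two assumptions that the thread does not confirm: that the unicodeText is stored as UTF-16 little-endian on Windows, and that charToNum with useUnicode set to true reads two bytes as one little-endian 16-bit number. The variable names are only illustrative.

```python
# U+0420, CYRILLIC CAPITAL LETTER ER (decimal 1056)
letter = "\u0420"

# Assumption: the unicodeText of the field is UTF-16 little-endian.
utf16 = letter.encode("utf-16-le")
# Assumption: uniDecode(..., "UTF8") yields the UTF-8 bytes of the same letter.
utf8 = letter.encode("utf-8")

print(ord(letter))   # 1056
print(list(utf16))   # [32, 4]
print(list(utf8))    # [208, 160]

# Reading the two UTF-8 bytes as a single little-endian 16-bit number,
# which is what a two-byte charToNum would do under the assumption above.
misread = int.from_bytes(utf8, "little")
print(misread)       # 41168
```

If those assumptions hold, 32 and 4 are simply the low and high bytes of 1056 (4 * 256 + 32), and 41168 is the UTF-8 byte pair 208, 160 read as if it were one UTF-16 character (160 * 256 + 208).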

I am really intrigued by what Mark said this morning: apparently, if you set the unicodeText of that field from clipboardData["unicode"], things work a little differently. I haven't been able to see the difference yet, though I understand the hope that things may get better that way.

Slava

> -----Original Message-----
> From: use-livecode-bounces at lists.runrev.com [mailto:use-livecode-
> bounces at lists.runrev.com] On Behalf Of BNig
> Sent: Saturday, June 18, 2011 6:12 PM
> To: use-revolution at lists.runrev.com
> Subject: Re: the mouseText and Unicode: the Russian letter R
> 
> Slava,
> 
> although Роберт is a nice guy he must give in:
> 
> I tried with
> 
> Саша, Наташа, Митя, Роберт, Robert, Jeffrey, and Соня Петрова, СССР,
> ССРРСС
> Слава
> Паперно, Лора Баглай, Макс, Паперно
>  Роберт, СССР, ССРРСС
> 
> -------------------
> on mouseUp
>    get word 4 of the clickCharChunk
>    put it into tSelPos
>    put 0 into tStartSel
> 
>    repeat with i = tSelPos down to 1
>       put the htmlText of char i of field 1 into tHTML
>       if (tHTML contains "<p> <" or tHTML is "<p></p>" or tHTML contains ">,<") then
>          put i into tStartSel
>          exit repeat
>       end if
>    end repeat
> 
>    put the number of chars of field 1 into tEndSel
>    repeat with i = tSelPos to the number of chars of field 1
>       put the htmlText of char i of field 1 into tHTML
>       put char i of tData into taChar
>       if (tHTML contains "<p> <" or tHTML is "<p></p>" or tHTML contains ">,<") then
>          put i into tEndSel
>          exit repeat
>       end if
>    end repeat
> 
>    select char tStartSel + 1 to tEndSel -1 of me
> 
>    put the htmlText of  the selectedtext into tWordClicked
>    -- put tWordClicked into field 3
>    set the htmlText of field 2 to tWordClicked
> end mouseUp
> -------------------
> 
> please watch out for linebreaks
> 
> Selecting a word of either the Unicode kind or the Roman kind works with
> the above code.
> 
> 
> I now test the htmlText for space, return and comma. I scan from the
> clickCharChunk up and down until any of these is true. Then I exit the
> scan, 'declare' whatever lies in between to be a word, select that word
> in the field and get the htmlText of the selectedText.
> It should also work with the unicodeText of the selectedText instead of
> the htmlText I am using now.
> 
> If I look at the charToNum of Р (Russian R) I see it is made of ASCII 32
> and ASCII 9. ASCII 32 being a space, maybe that is a clue to why it
> throws LiveCode off. I would consider this a, well, anomaly and can only
> hope for LiveCode to eventually support Unicode more completely.
> 
> you said "Curiouser and curiouser..."
> I would say "uglier and uglier..."
> 
> Kind regards
> 
> Bernd
> 






