LiveCode 7 codepoint question

Kenji Kojima index at kenjikojima.com
Fri Apr 25 15:34:25 EDT 2014


Dar,

I understood your script well. Thank you.
But I wondered why these combined Japanese characters are needed.
We do not use them.

We use these in Japanese writing:
	numToCodepoint( 0x30C8 )   -- this is ト
	numToCodepoint( 0x30C9 )   -- this is ド
	the num of codepoints of "ド"   -- returns 1

But we do not use the following two-codepoint form, and I do not know how to type it on a keyboard. I have to build it in LiveCode:
	numToCodepoint( 0x30C8 ) & numToCodepoint( 0x3099 )   -- this is ド
	the num of codepoints of "ド"    -- returns 2
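
A minimal sketch of how the two forms relate, assuming LiveCode 7's normalizeText function with the "NFC" form is available. The handler and variable names are only for illustration. NFC normalization should compose ト plus U+3099 into the single codepoint U+30C9, so the codepoint count should drop from 2 to 1:

```livecode
on mouseUp
   -- ト followed by the combining voiced sound mark U+3099 (two codepoints)
   put numToCodepoint(0x30C8) & numToCodepoint(0x3099) into tDecomposed
   -- Assumption: normalizeText composes the pair when asked for NFC
   put normalizeText(tDecomposed, "NFC") into tComposed
   put the number of codepoints in tDecomposed into tBefore
   put the number of codepoints in tComposed into tAfter
   -- Show both counts and the numeric value of the first composed codepoint
   put tBefore & tab & tAfter & tab & codepointToNum(codepoint 1 of tComposed)
end mouseUp
```

If the composition happens as described, the last value is 12489, which is 0x30C9 in decimal.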

I found the answer. 

1) About 150 years ago, many foreign words came into Japan.
	Japanese did not have separate phonetic characters for RA and LA then,
	so some people started to write RA as “ラ” and LA as “ラ゚”.
		numToCodepoint( 0x30E9 )    -- this is ラ
		numToCodepoint( 0x30E9 ) & numToCodepoint( 0x309A )   -- this is ラ゚
	But Japanese writing no longer distinguishes between them. Now only “ラ” is used.
	Contemporary Japanese does not distinguish RA from LA.

2) イ゚ ロ゚ ニ゚ ト゚ チ゚ リ゚ ヌ゚ ル゚ ヲ゚ ワ゚ カ゚ ヨ゚ タ゚ レ゚ ソ゚ ツ゚  and more characters. 
	They were used in a telegraph code, probably before the war.
	They are no longer used.
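
A small sketch for checking one of these combinations, assuming LiveCode 7, where a char is meant to be a grapheme cluster and normalizeText accepts "NFC". The variable names are illustrative. Unicode has no single precomposed codepoint for ラ゚, so NFC should leave it as two codepoints even though it should count as one char:

```livecode
on mouseUp
   -- ラ followed by the combining semi-voiced sound mark U+309A
   put numToCodepoint(0x30E9) & numToCodepoint(0x309A) into tLa
   -- Assumption: NFC has nothing to compose here, so the text is unchanged
   put normalizeText(tLa, "NFC") into tLaNFC
   put the number of chars in tLaNFC into tCharCount
   put the number of codepoints in tLaNFC into tCodepointCount
   -- Show the char count next to the codepoint count
   put tCharCount & tab & tCodepointCount
end mouseUp
```

This is the case where "char" and "codepoint" counts differ and normalization cannot make them equal.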

There may have been other old usages as well.
--
Kenji Kojima / 小島健治
http://www.kenjikojima.com/





On Apr 18, 2014, at 5:47 PM, Dar Scott <dsc at swcp.com> wrote:

> Here is my experiment to look at characters that are multiple codepoints in Japanese.  (This experiment is limited to Katakana.)
> 
> (I don’t know Japanese, so I apologize for anything goofy.)
> 
> This shows the two-codepoint versions of ド as one character.
> 
> On my OS X system, the latter two did not render as one character in the message box, though.  I might be doing something wrong.  When I pasted the output string into mail, one of those combined but the second didn’t—maybe it is intended for use with half-width Katakana.  
> 
> The output is this:
> 
> ト	ド	ド	1	ド	1
> 
> Dar
> 
> ———
> on mouseUp
>   put numToCodepoint( 0x30C8 ) into kto
>   put numToCodepoint( 0x30C9 ) into kdo
>   put numToCodepoint( 0x3099 ) into kVoiceMark
>   put numToCodepoint( 0xFF9E ) into kHalfVoiceMark
>   put kto & kVoiceMark into kdoAlt1
>   put length(kdoAlt1) into kdoAlt1N
>   put kto & kHalfVoiceMark into kdoAlt2
>   put length(kdoAlt2) into kdoAlt2N
>   put kto & tab & kdo & tab & kdoAlt1 & tab & kdoAlt1N & tab & kdoAlt2 & tab & kdoAlt2N
> end mouseUp
> ———
> 
> 
> 
> 
> On Apr 18, 2014, at 2:59 PM, Kenji Kojima <index at kenjikojima.com> wrote:
> 
>> What is an actual single Unicode character that is composed of two or more codepoints?
>> I could not find one among Japanese characters. In Japanese, “char” and “codepoint” seem to be the same.
>> Are there such characters in other languages?
>> 
>> The dictionary entry for “codepoint” has this comment:
>> “A codepoint is an integer identifier associated with a Unicode character.
>> A single character is composed of one or more code points.”
>> 
>> Thanks,
>> --
>> Kenji Kojima / 小島健治
>> http://www.kenjikojima.com/
>> 
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode




