Use of Phonetic Script

Dar Scott dsc at swcp.com
Fri Aug 18 02:05:28 EDT 2006


On Aug 17, 2006, at 7:03 PM, Cat Kutay wrote:

> I am trying to compare phonetic script from the htmlText of a card,  
> and in a database. However the htmlText in memory seems to lose the
> unicode formatting, and also displaying messages for error do not
> have the formatting

Concerning the displaying of messages, the Revolution command  
dictionary entry for 'answer' includes this paragraph:

      The prompt can be either formatted text (in the htmlText
      property's format) or plain text. If the prompt contains <p> or
      a start/end tag pair, the answer command assumes the text is in
      the same format as the htmlText property. Otherwise, the answer
      command assumes the text is plain text.

So, if you are using 'answer' to display messages with the special  
characters, try wrapping the text with <p>.
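
As a minimal sketch of that suggestion (the message wording is made up,
and I am assuming the dialog renders htmlText-style entities the same
way a field does):

on mouseUp
   -- hypothetical error message; &#602; is the IPA "er" written as an
   -- htmlText entity, and the <p> tags mark the prompt as formatted text
   answer "<p>No match found for &#602;</p>"
end mouseUp

The htmlText of a field already starts with <p>, so passing it to
'answer' unchanged should also be treated as formatted text.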


Concerning the loss of unicode in memory, I have assumed IPA for the  
phonetic script and tried this:

on mouseUp
   set the useUnicode to true
   -- IPA for "er" in some dialects of English
   set the unicodeText of field "field" to numToChar(0x025A)
   put the htmlText of field "field"
end mouseUp

That put this into the message box:

<p><font face="Lucida Grande" lang="ja">&#602;</font></p>

(Don't worry about the "ja"; Revolution thinks everything is Japanese.)

The IPA "er" is represented by the "&#602;".  Notice that the number
is now in decimal (602), not in hexadecimal (0x025A).
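
If the database holds code points in hexadecimal, the entity can be
built for comparison with baseConvert().  This is only a sketch; the
variable name tEntity is made up for illustration:

on mouseUp
   -- "025A" is the hexadecimal code point of the IPA "er"
   put "&#" & baseConvert("025A", 16, 10) & ";" into tEntity
   -- this should put &#602; into the message box
   put tEntity
end mouseUp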


I have seen Revolution get confused about diacritical marks, so if  
there are combining marks in your phonetic notation, then there might  
be a problem.  For example, I tried to modify the above handler to  
display a dental t, but the diacritic did not display and it was  
isolated from the t in the htmlText.
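
A modification of that kind might look like the sketch below.  The
choice of U+032A (COMBINING BRIDGE BELOW, the IPA dental diacritic) is
an assumption and not necessarily the exact code that was tried:

on mouseUp
   set the useUnicode to true
   -- t followed by U+032A COMBINING BRIDGE BELOW
   set the unicodeText of field "field" to \
         numToChar(0x0074) & numToChar(0x032A)
   put the htmlText of field "field"
end mouseUp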


If the database stores the text as UTF-8 or UTF-16 rather than as the
number for the character, or uses a character encoding other than
Unicode, then you might have trouble with the comparison.

If you want Unicode text, then use the unicodeText property.  That is
UTF-16 in host byte order.  You can convert that to UTF-8 with the
uniDecode() function by passing "UTF8" as its second parameter.  Look
at 'unicodeText' and 'uniDecode' in the Revolution dictionary.
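
A comparison along those lines could be sketched as follows.  The
function name and its parameter are hypothetical, and it assumes the
value fetched from the database is UTF-8:

function fieldMatchesUTF8 pUTF8Text
   -- string comparisons ignore case unless caseSensitive is true
   set the caseSensitive to true
   -- convert the field's UTF-16 text to UTF-8 before comparing
   return uniDecode(the unicodeText of field "field", "UTF8") \
         is pUTF8Text
end fieldMatchesUTF8

Converting both sides to one encoding before comparing avoids matching
on the htmlText, which also carries font and language tags.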


I might have misunderstood the problem.  Please ask again if that is  
the case.

Dar Scott





