Unicode from Variable?

Phil Davis revdev at pdslabs.net
Tue Nov 25 15:37:38 EST 2008


Hi Devin,

Thanks for your short list! Your insights are massively helpful.

Phil



Devin Asay wrote:
> Hi guys,
>
> Wow, where to start? I completely understand your confusion and feel 
> your pain. When I started with unicode I felt lost, too. Here's a 
> short list of Aha! points that should help. (Caveat--I'm describing my 
> understanding purely from a developer perspective; I have little 
> understanding about how Rev implements unicode "under the hood").
>
> - When we talk about unicode in Rev, we're talking about UTF-16, not 
> UTF-8 or UTF-32.
>
> - The current implementation of unicode is not perfect, but it is 
> perfectly usable. (Right-to-left languages are still problematic, 
> especially if you need to support user input. Display of same is 
> usually fine.)
>
> - The useUnicode property has very limited application. It only 
> affects the behavior of the charToNum and numToChar functions. If 
> useUnicode is false, these two functions work with single-byte 
> characters, as we're accustomed; if true, they work with two-byte 
> characters instead.
>
> - The byte order in which UTF-16 files are stored depends on the 
> processor in the host machine. That means that if you're 
> transferring unicode files from, say, a PPC-based (big-endian) 
> machine to an Intel-based (little-endian) one, UTF-16 files will be 
> scrambled unless you swap the two bytes of each character as you 
> read them in.
>
> - In light of the above, it's usually best to store unicode text as 
> UTF-8 or even htmlText. These have been the most reliable transfer 
> formats for me.
>
> - In a Rev field unicode and ascii get mixed up all the time. For 
> instance, characters that normally fall within the ascii range, like 
> space, return and common punctuation, are considered ascii. While this 
> can be confusing, it does ensure that normal Rev chunk expressions 
> work as expected.
>
> - There is no 100% reliable way I know of to look at a file and 
> determine heuristically whether it's unicode, or what flavor of 
> unicode it is.
>
> - The section on unicode in the Rev User Guide (section 6.4) is pretty 
> good as far as it goes, but doesn't cover all the "gotchas".
>
> - Dealing with unicode in text fields is different than in buttons and 
> menus.
>
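To make the byte-order and UTF-8 points in the list above concrete, here is a minimal sketch in Python rather than Rev script, purely as an illustration. The function names (decode_utf16, swap_byte_pairs, to_utf8) are made up for this example and are not Rev functions. It assumes the file may or may not begin with a byte order mark; when there is none, the byte order is only a guess, which matches the point that there is no fully reliable way to detect the flavor.

```python
import codecs


def swap_byte_pairs(raw: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (big-endian <-> little-endian)."""
    if len(raw) % 2:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    swapped = bytearray(raw)
    # Even positions take the odd bytes and vice versa.
    swapped[0::2], swapped[1::2] = raw[1::2], raw[0::2]
    return bytes(swapped)


def decode_utf16(raw: bytes, default_order: str = "little") -> str:
    """Decode UTF-16 bytes, honoring a byte order mark when one is present."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # No byte order mark: fall back to the caller's assumption.
    codec = "utf-16-le" if default_order == "little" else "utf-16-be"
    return raw.decode(codec)


def to_utf8(raw_utf16: bytes, default_order: str = "little") -> bytes:
    """Re-encode UTF-16 bytes as UTF-8, which has no byte-order problem."""
    return decode_utf16(raw_utf16, default_order).encode("utf-8")


if __name__ == "__main__":
    text = "caf\u00e9"  # "cafe" with an accented e
    big_endian = text.encode("utf-16-be")        # as a PPC machine would store it
    little_endian = swap_byte_pairs(big_endian)  # as an Intel machine expects it
    print(big_endian.hex(), little_endian.hex())
    print(to_utf8(codecs.BOM_UTF16_BE + big_endian))
```

In the UTF-16 output, the plain ASCII letters are simply a zero byte paired with the usual ASCII byte, while in the UTF-8 output they stay as single bytes. That is one reason UTF-8 tends to travel between machines with less trouble.
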
> Anyhow, those are some of the key points. For a more in-depth 
> discussion, see my Unicode presentation from RevLive if you've got the 
> DVD. Failing that, you're welcome to read my presentation notes at:
>
> http://asay.byu.edu/revUnicode.pdf
>
> The stack I used in that presentation, which shows lots of examples, 
> is at:
>
> go url "http://asay.byu.edu/unicode-RevLive08.rev"
>
> I'm happy to help if you still have specific issues after you look at 
> this stuff. Unicode is doable, once you learn the tricks and pitfalls.
>
> Regards,
>
> Devin
>
>
> On Nov 24, 2008, at 6:45 PM, Scott Rossi wrote:
>
>> Recently, Phil Davis wrote:
>>
>>> Thanks for asking the questions, Scott. I'm interested in clarity here
>>> too since I'll be working with Arabic again in the next few months, and
>>> am still a Unicode lightweight.
>>
>> You want questions?  I got a truck-load of 'em...
>>
>> For instance...  I have characters from several languages in the text I'm
>> working with: Roman, French (accented), Chinese, and Russian.  When I set
>> the unicodeText of a field to the text, the accented French characters
>> render incorrectly.  Looking in the source text file, it appears the
>> original French characters may have been reformatted when saving the file as
>> UTF-16.  Is there any way to keep the French characters intact within the
>> unicode text?
>>
>> Thanks & Regards,
>>
>> Scott Rossi
>> Creative Director
>> Tactile Media, Multimedia & Design
>>
>>
>> _______________________________________________
>> use-revolution mailing list
>> use-revolution at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your 
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-revolution
>
> Devin Asay
> Humanities Technology and Research Support Center
> Brigham Young University
>

-- 
Phil Davis

PDS Labs
Professional Software Development
http://pdslabs.net



