Unicode from Variable?
Phil Davis
revdev at pdslabs.net
Tue Nov 25 15:37:38 EST 2008
Hi Devin,
Thanks for your short list! Your insights are massively helpful.
Phil
Devin Asay wrote:
> Hi guys,
>
> Wow, where to start? I completely understand your confusion and feel
> your pain. When I started with unicode I felt lost, too. Here's a
> short list of Aha! points that should help. (Caveat--I'm describing my
> understanding purely from a developer perspective; I have little
> understanding about how Rev implements unicode "under the hood").
>
> - When we talk about unicode in Rev, we're talking about UTF-16, not
> UTF-8 or UTF-32.
>
> - The current implementation of unicode is not perfect, but it is
> perfectly usable. (Right-to-left languages are still problematic,
> especially if you need to support user input. Display of same is
> usually fine.)
>
> - The useUnicode property has very limited application. It only
> affects the behavior of the charToNum and numToChar functions. If
> useUnicode is false, these two functions behave as we're accustomed; if
> true, they assume two-byte characters instead of one-byte characters.
>
> - The byte order in which unicode files are stored is dependent upon
> the processor in the host machine. That means that if you're
> transferring unicode files from, say, a PPC-based machine to an
> Intel-based one, UTF-16 files will be scrambled unless you swap each
> pair of bytes as you read them in.
>
> - In light of the above, it's usually best to store unicode text as
> UTF-8 or even htmlText. These have been the most reliable transfer
> formats for me.
>
> - In a Rev field unicode and ascii get mixed up all the time. For
> instance, characters that normally fall within the ascii range, like
> space, return and common punctuation, are considered ascii. While this
> can be confusing, it does ensure that normal Rev chunk expressions
> work as expected.
>
> - There is no 100% reliable way I know of to look at a file and
> determine heuristically whether it's unicode, or what flavor of
> unicode it is.
>
> - The section on unicode in the Rev User Guide (section 6.4) is pretty
> good as far as it goes, but doesn't cover all the "gotchas".
>
> - Dealing with unicode in text fields is different than in buttons and
> menus.
>
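For anyone who wants to see what "inverting the bytes" and re-saving as UTF-8 amount to at the byte level, here is a minimal sketch. It is written in Python rather than Rev script, purely as an illustration. The function names swap_utf16_byte_order and utf16_to_utf8, and the little-endian fallback, are assumptions of this sketch and not anything Rev provides. It relies on the data starting with a byte order mark when one is present, which many UTF-16 files have but not all, so it does not get around the point that there is no fully reliable way to detect the encoding.

```python
import codecs


def swap_utf16_byte_order(data: bytes) -> bytes:
    """Swap each pair of bytes (big-endian UTF-16 <-> little-endian UTF-16)."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must have an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # second byte of each pair moves to the front
    swapped[1::2] = data[0::2]  # first byte of each pair moves to the back
    return bytes(swapped)


def utf16_to_utf8(data: bytes, fallback: str = "utf-16-le") -> bytes:
    """Decode UTF-16 using its byte order mark, then re-encode as UTF-8.

    If there is no byte order mark, the byte order is a guess; `fallback`
    is a hypothetical default chosen for this sketch.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        text = data[2:].decode("utf-16-be")
    elif data.startswith(codecs.BOM_UTF16_LE):
        text = data[2:].decode("utf-16-le")
    else:
        text = data.decode(fallback)
    # UTF-8 has no byte order issue, so it is safe to move between machines.
    return text.encode("utf-8")
```

The same text saved as UTF-16 on a PPC-style (big-endian) and an Intel-style (little-endian) machine differs only in the order of each byte pair, which is why a swap recovers it, and why converting to UTF-8 before transfer sidesteps the problem.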
> Anyhow, those are some of the key points. For a more in-depth
> discussion, see my Unicode presentation from RevLive if you've got the
> DVD. Failing that, you're welcome to read my presentation notes at:
>
> http://asay.byu.edu/revUnicode.pdf
>
> The stack I used in that presentation, which shows lots of examples,
> is at:
>
> go url "http://asay.byu.edu/unicode-RevLive08.rev"
>
> I'm happy to help if you still have specific issues after you look at
> this stuff. Unicode is doable, once you learn the tricks and pitfalls.
>
> Regards,
>
> Devin
>
>
> On Nov 24, 2008, at 6:45 PM, Scott Rossi wrote:
>
>> Recently, Phil Davis wrote:
>>
>>> Thanks for asking the questions, Scott. I'm interested in clarity here
>>> too since I'll be working with Arabic again in the next few months, and
>>> am still a Unicode lightweight.
>>
>> You want questions? I got a truck-load of 'em...
>>
>> For instance... I have characters from several languages in the text
>> I'm working with: Roman, French (accented), Chinese, and Russian.
>> When I set the unicodeText of a field to the text, the accented French
>> characters render incorrectly. Looking in the source text file, it
>> appears the original French characters may have been reformatted when
>> saving the file as UTF-16. Is there any way to keep the French
>> characters intact within the unicode text?
>>
>> Thanks & Regards,
>>
>> Scott Rossi
>> Creative Director
>> Tactile Media, Multimedia & Design
>>
>>
>> _______________________________________________
>> use-revolution mailing list
>> use-revolution at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-revolution
>
> Devin Asay
> Humanities Technology and Research Support Center
> Brigham Young University
>
--
Phil Davis
PDS Labs
Professional Software Development
http://pdslabs.net