Unicode

pkc pkc at mac.com
Mon May 9 22:44:30 EDT 2005


Many thanks to Devin for his wonderful site. I am still hung up on a problem very closely related to Thomas McGrath's, however.  Like him, I am working on a project that calls for mixing Asian (in my case, Chinese) characters with English text.  Both the English and the Chinese elements are fixed, so I hit on this strategy: the client downloads the English text with markers standing in for the Chinese characters, then feeds characters from the Chinese text into the English text at those markers. The client then identifies the imported elements by reference to the list they came from and sets the textFont of the foundChunk to ",chinese."

This works extremely well on my machine (OS X). In fact, it works perfectly. However, nobody using it on a Windows machine gets the right characters (they get about 40% correct characters and the rest is junk).  This is very perplexing, since the whole point of Unicode is supposed to be that the characters are unique and will render correctly on either Mac or Windows. But they don't.

One thing I discovered in experimentation was that if I moved a Chinese file line by line into a new field, I could get the same junk that Windows users are getting (the boxes, light or dark, and random character elements in a meaningless string). (This was not a paste, which reproduces the textFont perfectly, as previously noted.) It seemed to me that I was transferring not only the characters but also carriage returns (is this possible??).  When I manually removed the spaces between characters, everything straightened up.  The problem was that I was left with a field that the computer thinks has one character in it.  That won't work with my strategy, since all the characters will go into the marker for the first character.
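One possible explanation for stray carriage returns, offered only as a sketch and not as a diagnosis of the Revolution scripts in question: in UTF-16 every character occupies two bytes, and for some Chinese characters one of those bytes has the same value as a line feed or a carriage return. The character U+4E0A, for example, ends in the byte 0x0A. Code that splits the raw bytes into lines, rather than decoding them first, cuts such a character in half. The Python below illustrates this; the function names and sample string are hypothetical.

```python
# Hypothetical illustration: splitting UTF-16 data into "lines".
# U+4E0A and U+4E0B are ordinary Chinese characters; the low byte
# of U+4E0A happens to be 0x0A, the same value as a line feed.
text = "\u4e0a\u4e0b"
data = text.encode("utf-16-be")  # bytes 4E 0A 4E 0B


def split_bytes_on_newline(raw):
    """Treat the data as single-byte text and split on byte 0x0A."""
    return raw.split(b"\n")


def split_text_on_newline(raw, encoding="utf-16-be"):
    """Decode to characters first, then split on the newline character."""
    return raw.decode(encoding).split("\n")


byte_pieces = split_bytes_on_newline(data)
text_pieces = split_text_on_newline(data)

print(byte_pieces)  # two fragments; the first is half a character
print(text_pieces)  # one intact line containing both characters
```

The byte-level split yields an odd-length fragment, so everything after it is misaligned by one byte, which is one way to end up with a mix of correct characters and junk.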

I think that could have been a blind alley, and now I am interested in the idea that Mac and Windows systems reverse the placement of the null character when dealing with UTF-16.  That could be why Windows users are picking up the carriage returns and Mac users are not. The problem is I don't see a way in Revolution to tell the computer to put the null character where Windows expects it to be. Oh, one thing I just thought of experimenting with... I have so far been making the editing software remove all empty spaces (they produce those weird cross things in Unicode). But maybe it would be healthier to make sure the empty spaces are there, in order to keep the carriage returns out of the Windows character renderings?  No idea, will experiment and see if the results are any better.
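The "placement of the null character" idea corresponds to UTF-16 byte order. A big-endian machine stores the high byte of each 16-bit unit first, and a little-endian machine stores the low byte first, so for an ASCII letter the zero byte lands on opposite sides. The Python sketch below shows the effect and one way to convert between the two orders. It assumes the data is plain UTF-16 with no surrogate pairs, and the helper name swap_byte_order is made up for the example; whether this is what actually goes wrong in the Revolution stack is untested.

```python
# Hypothetical illustration of UTF-16 byte order.
ascii_be = "A".encode("utf-16-be")  # 00 41: zero byte first
ascii_le = "A".encode("utf-16-le")  # 41 00: zero byte last

text = "\u4e2d\u6587"               # two Chinese characters
be = text.encode("utf-16-be")       # 4E 2D 65 87
le = text.encode("utf-16-le")       # 2D 4E 87 65


def swap_byte_order(raw):
    """Swap the two bytes of every 16-bit unit (raw must have even length)."""
    swapped = bytearray(raw)
    swapped[0::2], swapped[1::2] = raw[1::2], raw[0::2]
    return bytes(swapped)


# Reading big-endian bytes as little-endian yields different characters.
misread = be.decode("utf-16-le")

# Swapping the bytes first restores the intended text.
repaired = swap_byte_order(be).decode("utf-16-le")

# A byte order mark (FE FF for big-endian) lets a decoder pick the order itself.
with_bom = (b"\xfe\xff" + be).decode("utf-16")

print(misread == text, repaired == text, with_bom == text)
```

If byte order is the culprit, the fix belongs at the point where the file is read or written, either by swapping bytes for the other platform or by storing a byte order mark so the reader can detect the order.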

If there is no Revolution trick that will fix this, I would like to know if there is a setting Windows users can use to get their Unicode to work the way Mac works.

A bit frustrating. I have the basis for a research portal that works very pleasantly on the Mac and looks insane on Windows.  Despite it all being in Unicode!

I'm still reading and still experimenting, but have sort of plateaued here.  Quite a while since I had a breakthrough.  

Anyway, Tom, I sympathize.

Pamela Crossley

