Detecting Unicode text
lists at mangomultimedia.com
Thu Nov 17 20:58:45 CST 2005
On Nov 17, 2005, at 5:37 PM, Sarah Reichelt wrote:
> If I UniDecode the text, it comes good except for a weird character at
> the start which I can handle, but is there a neat way to detect the
> encoding of text before I start? I suppose I can just look for the
> word "Subject" and if it isn't there, uniDecode and try again, but it
> seems there should be a way to detect the encoding of the text itself.
> Does the weird stuff at the start give me any clues? Checking the
> ASCII codes, the text starts with ASCII 254, ASCII 255, space and then
> the first character of my text. Perhaps that's my answer, but will
> they always be 254 & 255 or does that vary with the encoding?
> Any ideas?
The "weird stuff" at the beginning is the BOM (byte order mark). It
tells applications opening the file which UTF encoding, and which byte
order, the file uses. I'm not sure offhand how to decipher each BOM,
but a quick Google search should turn up the list.
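
For what it's worth, the bytes 254 and 255 (hex FE FF) are the UTF-16 big-endian BOM, so the leading bytes do vary with the encoding. Below is a minimal sketch in Python (not LiveCode) of checking the start of the data against the known BOMs. The function names detect_bom and decode_text, and the latin-1 fallback for data without a BOM, are assumptions made for illustration, not anything from the original thread.

```python
import codecs

# Known BOM byte sequences. The UTF-32 entries come first because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF (bytes 254, 255)
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE (bytes 255, 254)
]


def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Strip any BOM and decode; the fallback encoding is an assumption."""
    name, skip = detect_bom(data)
    return data[skip:].decode(name or fallback)
```

A file with no BOM can still be Unicode, so a missing BOM proves nothing on its own; in that case a content check such as looking for the word "Subject" remains a reasonable fallback.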
Blue Mango Multimedia
trevor at mangomultimedia.com