Detecting Unicode text

Trevor DeVore lists at mangomultimedia.com
Thu Nov 17 21:58:45 EST 2005


On Nov 17, 2005, at 5:37 PM, Sarah Reichelt wrote:
> If I UniDecode the text, it comes good except for a weird character at
> the start which I can handle, but is there a neat way to detect the
> encoding of text before I start? I suppose I can just look for the
> word "Subject" and if it isn't there, uniDecode and try again, but it
> seems there should be a way to detect the encoding of the text itself.
>
> Does the weird stuff at the start give me any clues? Checking the
> ASCII codes, the text starts with ASCII 254, ASCII 255, space and then
> the first character of my text. Perhaps that's my answer, but will
> they always be 254 & 255 or does that vary with the encoding?
>
> Any ideas?

Hi Sarah,

The "weird stuff" at the beginning is the BOM (byte order mark). It
tells applications opening the file which UTF encoding the file uses.
I'm not sure how to decipher each BOM, but perhaps Google will know
the answer.
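
For reference, BOMs are short fixed byte sequences, and they do vary
with the encoding: 254, 255 (hex FE FF) marks UTF-16 big-endian,
255, 254 marks UTF-16 little-endian, and 239, 187, 191 marks UTF-8.
Below is a minimal sketch of checking for them, written in Python
rather than Revolution purely as an illustration. The names
detect_bom and decode_text are hypothetical, and the latin-1
fallback for text without a BOM is an assumption, not something from
the original question.

```python
import codecs

# Longer BOMs are listed first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Strip any BOM and decode; the fallback encoding is an assumption."""
    name, skip = detect_bom(data)
    return data[skip:].decode(name or fallback)
```

With this sketch, data starting with bytes 254, 255 is reported as
UTF-16 big-endian and the two BOM bytes are dropped before decoding,
so no stray character is left at the start. Text without a BOM still
needs some other check, such as looking for the word "Subject".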


-- 
Trevor DeVore
Blue Mango Multimedia
trevor at mangomultimedia.com




