Distinguishing between ASCII and UTF8
ambassador at fourthworld.com
Wed Oct 6 16:23:24 EDT 2010
I have an app that needs to auto-detect Unicode and plain text, and
render them correctly based on that auto-detection.
I have the UTF-16 handling working, but with UTF-8 I have a problem:
there is often no BOM to tell me the file is Unicode, and some plain text
files occasionally contain high-ASCII values (bytes above 127, like the
dagger symbol).
What patterns should I be looking for in the binary data of a file to
distinguish UTF-8 from plain text?
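
One common heuristic, sketched here in Python rather than LiveCode, is to
check whether every byte above 127 belongs to a well-formed UTF-8
multi-byte sequence: a lead byte (0xC2-0xDF, 0xE0-0xEF, or 0xF0-0xF4)
followed by the right number of continuation bytes (0x80-0xBF). A lone
high-ASCII byte from a single-byte encoding almost never fits that
pattern. The function name `classify` and its return labels are made up
for this illustration, and the check is deliberately simplified: it does
not reject overlong encodings or surrogate code points, which a strict
decoder would.

```python
def classify(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf8', or 'other' (illustrative labels)."""
    i, n = 0, len(data)
    saw_multibyte = False
    while i < n:
        b = data[i]
        if b < 0x80:
            # 0xxxxxxx: a plain 7-bit ASCII byte
            i += 1
            continue
        if 0xC2 <= b <= 0xDF:
            need = 1  # 110xxxxx: lead byte of a 2-byte sequence
        elif 0xE0 <= b <= 0xEF:
            need = 2  # 1110xxxx: lead byte of a 3-byte sequence
        elif 0xF0 <= b <= 0xF4:
            need = 3  # 11110xxx: lead byte of a 4-byte sequence
        else:
            # Stray continuation byte or a byte that can never start a sequence
            return "other"
        tail = data[i + 1 : i + 1 + need]
        # Every continuation byte must look like 10xxxxxx
        if len(tail) != need or any((t & 0xC0) != 0x80 for t in tail):
            return "other"
        saw_multibyte = True
        i += 1 + need
    return "utf8" if saw_multibyte else "ascii"
```

A result of 'other' suggests a single-byte encoding with high-ASCII
characters; 'ascii' means the choice does not matter, since pure 7-bit
text is identical in both. In Python specifically, a strict
`data.decode("utf-8")` wrapped in a `try`/`except UnicodeDecodeError`
performs a more complete version of the same validation.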
LiveCode training and consulting: http://www.fourthworld.com
Webzine for LiveCode developers: http://www.LiveCodeJournal.com
LiveCode Journal blog: http://LiveCodejournal.com/blog.irv