How to determine if a text file is UTF8 ?
Bob Sneidar
bobsneidar at iotecdigital.com
Tue Oct 29 11:32:50 EDT 2024
So open the file for binary read, read its contents (which lands them in "it"), then put textDecode(it, "UTF-8") into tText.
Bob S
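
A minimal sketch of how the pieces in this thread might fit together, in case it helps. The handler name fileLooksLikeUTF8 is made up for illustration and is not a built-in. It checks for the three BOM bytes first, then falls back on decoding and re-encoding the data and comparing the result with the original bytes. That second step rests on the assumption that textDecode does not pass malformed UTF-8 sequences through unchanged, which is worth verifying in your LiveCode version. Pure ASCII passes the check too, since ASCII is a subset of UTF-8.

```livecode
-- Hypothetical helper (not a built-in): returns true if the file at pPath
-- looks like UTF-8 (or plain ASCII), false otherwise.
function fileLooksLikeUTF8 pPath
   local tData, tBOM

   -- Read the raw bytes with no charset or line-ending translation.
   put URL ("binfile:" & pPath) into tData
   if the result is not empty then return false

   -- Compare exactly, so bytes are never matched case-insensitively.
   set the caseSensitive to true

   -- A UTF-8 byte-order mark is the three bytes 0xEF, 0xBB, 0xBF.
   put numToByte(239) & numToByte(187) & numToByte(191) into tBOM
   if byte 1 to 3 of tData is tBOM then return true

   -- No BOM: decode, re-encode, and compare with the original bytes.
   -- Assumption: malformed sequences do not survive this round trip unchanged.
   return textEncode(textDecode(tData, "UTF-8"), "UTF-8") is tData
end fileLooksLikeUTF8
```

If the function returns true, textDecode(URL ("binfile:" & tPath), "UTF-8") should give usable text; otherwise the file is probably in some other 8-bit encoding and needs a different decode. A file with no BOM that passes the round trip is only probably UTF-8, so treat the answer as a heuristic.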
> On Oct 29, 2024, at 8:31 AM, Bob Sneidar <bobsneidar at iotecdigital.com> wrote:
>
> I suppose you could also use textDecode(<string>, "UTF-8") to decode the data from that format.
>
> Bob S
>
>
>> On Oct 29, 2024, at 8:17 AM, Bob Sneidar via use-livecode <use-livecode at lists.runrev.com> wrote:
>>
>> There is a Wikipedia article on this. Turns out it is not straightforward. A file can begin with a byte-order mark, but not all vendors write one. And I do not think you can make the determination simply by examining the contents of the file.
>>
>> Byte-order mark
>> If the Unicode byte-order mark U+FEFF is at the start of a UTF-8 file, the first three bytes will be 0xEF, 0xBB, 0xBF.
>> The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8, but warns that it may be encountered at the start of a file trans-coded from another encoding.[23] While ASCII text encoded using UTF-8 is backward compatible with ASCII, this is not true when Unicode Standard recommendations are ignored and a BOM is added. A BOM can confuse software that isn't prepared for it but can otherwise accept UTF-8, e.g. programming languages that permit non-ASCII bytes in string literals but not at the start of the file. Nevertheless, there was and still is software that always inserts a BOM when writing UTF-8, and refuses to correctly interpret UTF-8 unless the first character is a BOM (or the file only contains ASCII).[24]
>>
>> https://en.wikipedia.org/wiki/UTF-8#
>>
>> Bob S
>>
>>
>>> On Oct 29, 2024, at 1:53 AM, jbv via use-livecode <use-livecode at lists.runrev.com> wrote:
>>>
>>> Hi list,
>>>
>>> How to determine if a text file is UTF8 or just plain ASCII ?
>>> In other words, how to know if one should use
>>> open file myfile.txt for UTF8 read
>>> or
>>> open file myfile.txt for read
>>>
>>> Thank you.
>>> jbv
>>>
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>
>>
>