opening txt files

Robert Sneidar slylabs13 at me.com
Wed Jan 16 14:13:01 EST 2013


I will hazard a guess that when you open the file for reading, you can open it as binary first and check whether the first two bytes are FE FF. If so, treat it as UTF-16. If not, treat it as UTF-8. I have not tested this strategy myself, but your second point seems to give the clue to solve this mystery.
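
A minimal, untested sketch of that check in LiveCode might look something like this. The function name guessEncoding and the labels it returns are made up for illustration. It also looks for FF FE (little-endian UTF-16) and EF BB BF (the UTF-8 byte order mark), and a file with no byte order mark is reported as "unknown" rather than assumed to be UTF-8 or MacRoman.

```livecode
-- Untested sketch: guess a text file's encoding from its first bytes.
-- guessEncoding and its return labels are hypothetical names.
function guessEncoding pPath
   local tHead, tB1, tB2, tB3
   open file pPath for binary read
   if the result is not empty then return empty -- file could not be opened
   read from file pPath for 3
   put it into tHead
   close file pPath
   -- pad with NUL bytes so bytes 1 to 3 exist even for very short files
   put tHead & numToByte(0) & numToByte(0) & numToByte(0) into tHead
   put byteToNum(byte 1 of tHead) into tB1
   put byteToNum(byte 2 of tHead) into tB2
   put byteToNum(byte 3 of tHead) into tB3
   if tB1 is 254 and tB2 is 255 then return "UTF-16BE" -- FE FF
   if tB1 is 255 and tB2 is 254 then return "UTF-16LE" -- FF FE
   if tB1 is 239 and tB2 is 187 and tB3 is 191 then return "UTF-8" -- EF BB BF
   return "unknown" -- no byte order mark found
end function

on mouseUp
   answer file "Pick a text file"
   if it is empty then exit mouseUp -- user cancelled
   put guessEncoding(it)
end mouseUp
```

Only the first three bytes are read, so this says nothing about files that carry no byte order mark; those would still need a fallback of your choosing.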

Bob


On Jan 16, 2013, at 9:15 AM, Nishok Love wrote:

> Thanks, Bob. Your command works but the same results occur. Further investigation here turned up this:
> 
> When Pages is used to export as "Text", the resulting file may be of two kinds:
> 
> (1) if the document contained only characters included in Apple MacRoman charset, the file is a pure text file based on Apple MacRoman encoding.
> 
> (2) if the document contained extraneous characters the created text file take care of this feature and uses the UTF encoding (two bytes per character) and starts with the logical BOM: "FE FF".
> 
> which I've copied from the discussion on  https://discussions.apple.com/message/9518841?messageID=9518841#9518841?messageID=9518841
> 
> Opening both files with TextEdit (which displays both of them correctly, i.e. without all those extra spaces), duplicating them and then watching the save options shows that one file (the one from Pages) is using UTF-16 whilst Word's Western (Mac OS Roman) export is in UTF-8. Using Get Info I can now see that the UTF-16 file is twice the size of the other.
> 
> In short, text files are not as simple as they used to be!
> 
> So I'm still looking for a way for LiveCode to spot whether it's opening a file in UTF-8 or UTF-16 (or something else - aaarrgh!). Can I access the file header? read from file just gives me the data...
> 
> I could read the file, count the number of characters and how many of them are spaces and from that I could infer which format is being used. Probably this would be reliable for my purposes - just not very elegant!
> 
> Nishok
> 
> 
>> I am not sure why you are seeing this. I exported a Pages newsletter file as text, then ran this handler on it:
>> 
>> on mouseUp pMouseBtnNo
>>   answer file "Pick a text file" with "/Users/bobsneidar/Desktop/SneidarNewsletter.txt"
>>   put it into theFile
>>   open file theFile for read
>>   read from file theFile until cr
>>   put it
>>   close file theFile
>> end mouseUp
>> 
>> I got this in the message box:
>> 
>> 2005 Summer Edition
>> 
>> Seems to work.
>> 
>> Bob
>> 
>> 
>> 
>> On Jan 15, 2013, at 10:34 AM, NISHOK LOVE wrote:
>> 
>>> Hi All
>>> 
>>> I have a problem when I open .txt files in OSX, and I don't have much (any!) experience of reading files in LiveCode. 
>>> 
>>> I have a file originally written in Word on Windows. When I export it as a .txt from Word for Mac I just accept the default Mac OS encoding option (Western (Mac OS Roman)) and it all works fine when I open the file in my LiveCode.
>>> 
>>> But when I open the original file in Pages and export it as Plain Text, I get a different result. When I open that file in LiveCode I find a space has been inserted after every character. So Hello world becomes H e l l o   w o r l d. 
>>> 
>>> I guess this is a problem with the encoding, but how can my LiveCode understand what the incoming file's encoding is and respond accordingly? My LiveCode needs to be able to deal with any kind of text file...
>>> 
>>> Thanks,
>>> Nishok Love
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>> 
> 




