opening txt files
nishok.love at virgin.net
Thu Jan 17 04:51:40 EST 2013
This is the first time I've asked a question on use-livecode, and I've been pleasantly amazed that people have taken the time to give so much useful advice. Much respect, and thank you to everyone. I think I now have a solution that works, and I've learnt some interesting things too.
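(1) Bob's idea (below) is the way to differentiate between UTF-8 and UTF-16. If the program finds FF FE as the first two bytes of the file, it can react by ignoring alternate bytes (thanks HTH for some tidy code, but note that the first byte after the FF FE is the valid one). FF FE marks little-endian UTF-16; with the FE FF order mentioned below, the text would be big-endian and the second byte of each pair would be the valid one for plain ASCII text.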
(2) The comment on the Apple discussion (also below) would seem to be right. In its case (1) TextWrangler reports a Unicode UTF-8 file, and in its case (2) a Unicode UTF-16 file.
(3) TextEdit seems to resolve the encoding issue before displaying the file, so TextWrangler is better for nit-pickers who want to see everything (thanks for that, Francis :) my wife appreciates the extra ammunition!).
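
For anyone who wants the byte-order-mark check from (1) spelled out, here is a minimal sketch. It is in Python rather than LiveCode, purely for illustration. The names sniff_encoding and read_text and the fallback parameter are hypothetical, not anything from the thread. It assumes the whole file has already been read as raw bytes, and that a file without a BOM should be decoded with whatever fallback the caller chooses (UTF-8 as Bob suggests, or MacRoman as the Apple discussion describes).

```python
import codecs


def sniff_encoding(data: bytes, fallback: str = "utf-8") -> str:
    """Guess a codec name from a leading byte order mark (BOM), if present."""
    if data.startswith(codecs.BOM_UTF8):  # EF BB BF
        return "utf-8-sig"  # this codec strips the UTF-8 BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):  # FF FE / FE FF
        return "utf-16"  # this codec reads the BOM to pick the byte order
    return fallback  # no BOM found: an assumption the caller must supply


def read_text(data: bytes, fallback: str = "utf-8") -> str:
    """Decode raw file bytes using the encoding suggested by the BOM."""
    return data.decode(sniff_encoding(data, fallback))
```

With a real file this would be called as read_text(open(path, "rb").read()), the point being to open the file in binary mode first, as Bob suggests. Decoding with a proper UTF-16 codec, instead of skipping alternate bytes, also keeps characters outside the ASCII range intact.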
Onwards and upwards,
> From: Robert Sneidar <slylabs13 at me.com>
> I will hazard a guess, that when you open the file for reading, you can open binary first and see if the first two characters amount to FE FF, yes? If so, treat as UTF-16. If not, treat as UTF-8. I have not tested this strategy myself, but your second point seems to give the clue to solve this mystery.
> On Jan 16, 2013, at 9:15 AM, Nishok Love wrote:
>> Thanks, Bob. Your command works but the same results occur. Further investigation here found this:
>> When Pages is used to export as "Text", the resulting file may be of two kinds:
>> (1) if the document contained only characters included in Apple MacRoman charset, the file is a pure text file based on Apple MacRoman encoding.
>> (2) if the document contained extraneous characters the created text file take care of this feature and uses the UTF encoding (two bytes per character) and starts with the logical BOM: "FE FF".
>> which I've copied from the discussion on https://discussions.apple.com/message/9518841?messageID=9518841#9518841?messageID=9518841