reading win and mac text files on linux

Martin Baxter martin at materiaprima.fsnet.co.uk
Sat Apr 17 15:11:14 EDT 2004


>--- Martin Baxter <martin at materiaprima.fsnet.co.uk> wrote:
>> For quite a while now I've been moving a large set of plain text data
>> files back and forth between mac and windows and revolution has read
>> and written them seamlessly on either platform, regardless of the line
>> end characters in the files it opens. (I'm just using "put into url x",
>> and "put url x into", with the "file:" protocol.)
>>
>> Recently I moved these data files to Linux, read them into revolution
>> and was surprised that no automatic conversion of line endings seems to
>> be done when opening them on that platform. The Mac text files opened
>> into revolution as a single line with a sprinkling of ascii 13s, and the
>> PC files all have a spare ascii 13 at the end of each line.
>>
>> I expected line endings to be converted automatically.
>>
>> Can anyone enlighten me why this is happening ? Did I misunderstand the
>> mechanism ? Is this expected behaviour ?
>>
>> Martin
>>
>
>Hi Martin,
>
>The engine merely makes assumptions based on the platform it's running
>on; so when a Revolution app runs on a Mac, it will expect ASCII 13 as
>the line delimiter. But when you feed it a Unix file on a Mac, it won't
>recognise it as such (it has no way of guessing).
>
>So I guess the easiest approach is to read the file as binfile and do a
>few replacements yourself:
>  put URL ("binfile:" & tFilePath) into tData
>  replace CRLF with return in tData
>  replace numToChar(13) with return in tData
>
>Hope this helped,
>
>Jan Schenkel.

Hmm, that is certainly what happens here on Linux, and it is what the docs say:

quote : "when you use a file URL as a container. Revolution translates as
needed between your system's end-of-line marker and Revolution's linefeed
character." endquote

However, I just checked on WinXP and MacOS 7.6 (because they happen to be
to hand) and on both of those, any of CR, LF or CRLF in the file is
accepted as a line delimiter when you open the file. Revolution
automatically replaces whatever non-LF line endings it finds in the file
with LF, as necessary.

So the behaviour is subtly inconsistent between platforms, and on
reflection I think I have hitherto been taking advantage of a side-effect
of the way these line-ending conversions are implemented internally:

<theory>
I reckon that what happens is that on non-LF platforms there is a single
routine that converts any non-LF line-endings in the file when you open it,
the intent being that whether your system uses CR or CRLF, either case will
be handled, and there's no need to check which specific OS it actually is.

The effect is subtly different from the intent, however, because under Mac
or Windows you can open text files from any system, not just the local one,
without having to think about line endings at all.

On a Unix system, however, this step isn't done, because it isn't considered
necessary to translate the local line-endings, which are assumed to be
already correct. So if you want to open "non-native" text files under Unix,
you have to control the line-endings yourself.
</theory>

So I will follow your good advice, Jan. It looks like the way to go. (My
file in/out needs rewriting anyway ;-)
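
For anyone following the thread, Jan's suggestion could be wrapped up as a
pair of small functions along these lines. This is only a sketch: the
handler names normalizeLineEndings and readTextFile and the parameter names
are made up for illustration, and it assumes the files hold plain
single-byte text.

```livecode
-- Hypothetical helper: turn CRLF and bare CR line endings into linefeeds.
function normalizeLineEndings pData
   -- Handle CRLF first, otherwise a Windows line ending would be
   -- left as CR + LF and then turned into two linefeeds.
   replace CRLF with linefeed in pData
   -- Any ASCII 13 still left is an old-style Mac line ending.
   replace numToChar(13) with linefeed in pData
   return pData
end normalizeLineEndings

-- Hypothetical helper: read a text file without relying on the engine's
-- platform-dependent end-of-line translation.
function readTextFile pFilePath
   local tData
   -- "binfile:" reads the bytes as they are, with no conversion.
   put URL ("binfile:" & pFilePath) into tData
   return normalizeLineEndings(tData)
end readTextFile
```

Used as "put readTextFile(tFilePath) into tData", this should give the same
linefeed-delimited text on Linux, Mac and Windows, whichever system wrote
the file. The order of the two replacements matters, as noted in the
comments.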

It seems a pity that the behaviour isn't consistent across platforms, but
changing it now could break things, I suppose.

Thanks

Martin Baxter



