text in fields (CR/LF, CR, LF) which is correct?

Kee Nethery kee at kagi.com
Fri Aug 30 14:19:01 EDT 2002


>J. Landman Gay wrote:
>
>>  Dar Scott <dsc at swcp.com> wrote:
>>
>>>  However, I've seen examples from old timers on the list that
>>>  include return quite a bit.  I find that jarring and sometimes
>>>  confusing.  Yet, others might find it quite natural.
>>
>>  It's a holdover from HyperCard and SuperCard, where "return" has been in
>>  use since the beginning. As you say, it's very natural for anyone coming
>>  from another xtalk environment. The synonym "cr" is what others probably
>>  want to use, it's the same thing.
>>
>>  I'm not convinced we need a new constant though, but of course I could
>>  easily be missing something. I have always just used "cr" or "return"
>>  (interchangeably) and line endings have always been converted correctly
>>  no matter what platform I move my stack to. Never had any problems with
>>  it. If hard binary data needs to be converted, as in the case we've been
>>  discussing, then I just replace the appropriate ascii characters with
>>  "cr" or "return" and it fixes things.
>
>With one issue:  it is perfectly reasonable to expect an intelligent person
>to attempt to use "cr" or "return" in cases where they really need a _real_
>return (numtochar(13)).
>
>This disparity between the Mac-authoring-tool convention and the
>UNIX-favoring MC engine presents a learnability problem for anyone who
>hasn't yet learned that the "return" constant does not denote a return
>character. :)
>
>In looking for ways to solve the problem, we have to ask ourselves:  should
>we give real and true values to constants for the benefit of all future
>users, or do we keep answering this question ad infinitum in order to
>preserve compatibility with 10 years of legacy code?  Not an easy call to
>make.
>
>As has been done in other cases where new behavior may require substantial
>revision to legacy code, we could consider a global property, something like
>the useOldStyleConstants, so we can retain compatibility by adding only one
>line of code.

I think you have described the dilemma for Revolution and have hit on 
a very good suggested solution.

There is no international specification that I am aware of that uses 
the symbol CR to mean ASCII 10. Every specification I have ever 
seen refers to CR as an abbreviation for Carriage Return and it 
always refers to ASCII 13. Even Revolution has CR equal to ASCII 13 
in the CRLF constant.

Secondly, I have never seen any international specification where 
Return is not just a shorthand way of saying Carriage Return. Thus, to 
my mind, the only value that Return or CR should equal, under all 
circumstances, is ASCII 13.
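
A quick way to check what the constants actually hold on a given platform is to look at their character codes. This is only a minimal sketch: the handler name showLineConstants is made up for illustration, and it assumes the built-in constants cr, return and CRLF behave as described in this thread.

```livecode
on showLineConstants
   -- character codes of the cr and return constants
   put charToNum(cr) && charToNum(return) into tReport
   -- character codes of the two characters in the CRLF constant
   put tReport && charToNum(char 1 of CRLF) && charToNum(char 2 of CRLF) into tReport
   -- show the four codes, separated by spaces
   answer tReport
end showLineConstants
```

If the thread's description is accurate, the first two codes would be 10 and the CRLF pair would be 13 followed by 10.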

If there is doubt about the generally accepted value of CR, please 
refer to RFC 318 <http://www.faqs.org/rfcs/rfc318.html>, which is just 
one example. Keep in mind that back in 1972 this RFC was probably 
created on a Unix machine, where lines are terminated with a linefeed, 
and in this spec CR is still denoted as ASCII 13.

I'm just guessing here, but perhaps someone in Revolution's past 
saw the text printed on the Return key ("return") and noticed that on 
a Unix system hitting that key inserts a LineFeed. It may then have 
made sense to them to have the Return constant equal the value 
produced by the Return key on their Unix keyboard (the value of 
LineFeed). Then, because Return is equivalent to Carriage Return, 
they made the CR constant equal ASCII 10 as well. They were mistaken 
to tie the meaning of the Return key to the specific byte it produces 
in a Unix text file.

If I were the one to decide how to modify Revolution going forward, 
I'd choose to make Revolution compatible with all the desired future 
users.

1. I'd correct Revolution in the next revision so that CR and return 
are always ASCII 13, regardless of platform.

2. I'd put in a switch for backwards compatibility. 
useOldStyleConstants is a reasonable suggestion.

3. I'd introduce a new constant like endOfLine or LineEnd or 
LineTermination or something else as long as it is not already being 
used in some character specification, and that constant would be 
whatever that specific platform and system uses to terminate lines.
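
Until something like this exists in the engine, the first two proposals can be approximated in script. The following is a rough sketch only: gUseOldStyleConstants and realCR are hypothetical names chosen for illustration, not existing Revolution properties or functions.

```livecode
-- hypothetical stand-in for proposals 1 and 2:
-- returns ASCII 13 unless the legacy switch has been turned on
function realCR
   global gUseOldStyleConstants
   if gUseOldStyleConstants is true then
      -- legacy behaviour: same value the thread reports for the built-in cr constant
      return numToChar(10)
   else
      -- proposed behaviour: a true carriage return
      return numToChar(13)
   end if
end realCR
```

Under this sketch, legacy code would set gUseOldStyleConstants to true once at startup, and new code would call realCR() wherever a true carriage return is needed.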


The functions I have just written to make sure I do the right thing 
with the data I get from a database are:



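function adjustLineEndOS thetext
   -- collapse CRLF pairs and bare LFs so every line break is a single CR
   replace (numtochar(13) & numtochar(10)) with numtochar(13) in thetext
   replace numtochar(10) with numtochar(13) in thetext
   -- then convert each CR to the line ending of the current platform
   replace numtochar(13) with getLineEnd() in thetext
   return thetext
end adjustLineEndOS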


function getLineEnd
   -- return the line terminator used by the current platform
   switch platform()
   case "Win32"
     put numtochar(13) & numtochar(10) into lineEnd
     break
   case "MacOS"
     -- Mac OS X (major version 10 and up) uses LF; classic Mac OS uses CR
     put the systemVersion into ver
     put char 1 to (offset(".",ver)-1) of ver into ver
     if ver > 9 then
       put numtochar(10) into lineEnd
     else
       put numtochar(13) into lineEnd
     end if
     break
   default
     -- Unix and everything else
     put numtochar(10) into lineEnd
   end switch
   return lineEnd
end getLineEnd
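
For testing, it can help to pass the target line ending in as a parameter instead of reading it from the platform. The following is a minimal sketch of that variant; normalizeLineEnds and testNormalize are hypothetical names, and the sample text is invented.

```livecode
function normalizeLineEnds pText, pEnding
   -- collapse CRLF first so it is not treated as two line breaks
   replace (numToChar(13) & numToChar(10)) with numToChar(10) in pText
   -- turn any remaining bare CR into LF
   replace numToChar(13) with numToChar(10) in pText
   -- every line break is now a single LF; convert to the requested ending
   if pEnding is not numToChar(10) then
      replace numToChar(10) with pEnding in pText
   end if
   return pText
end normalizeLineEnds

on testNormalize
   -- invented sample mixing all three line-ending styles
   put "one" & numToChar(13) & numToChar(10) & "two" & numToChar(10) & "three" & numToChar(13) & "four" into tRaw
   -- convert everything to CRLF
   put normalizeLineEnds(tRaw, numToChar(13) & numToChar(10)) into tClean
   -- show the length of the converted text
   answer the number of chars of tClean
end testNormalize
```

Passing getLineEnd() as the second argument should give the platform-specific conversion that adjustLineEndOS is meant to provide.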




Thanks to everyone who offered suggestions and assistance, much 
appreciated. I'll stop talking about this issue and go back to work.


Kee Nethery


