CSV again.

Alex Tweedly alex at tweedly.net
Sat Oct 17 05:03:55 EDT 2015


Naturally, it (the added CR) must be removed.

But I have a more philosophical issue / question.


TSV (in and of itself) doesn't have any quotes, and so doesn't handle 
quoted CRs or TABs.

Currently, the 'old' version (as in Richard's published article) doesn't 
handle TAB characters enclosed within a quoted cell. The 'new' version 
does, but only by returning the data delimited by <GS> instead of TAB 
and leaving enclosed TABs alone, which is a mistake, IMHO.

I believe that what the converter should do is:
  - return TSV - i.e. delimited by TABs
  - replace quoted CR by <VT> within quoted cells (as it does now)
  - replace quoted TABs by <GS> within quoted cells

Any comments or suggestions ?

Thanks
Alex.

On 17/10/2015 02:34, Mike Kerner wrote:
> It's safe as long as you remember to remove it at the end of the function
>
> On Fri, Oct 16, 2015 at 7:12 PM, Alex Tweedly <alex at tweedly.net> wrote:
>
>> Duh - replying to myself again :-)
>>
>> It looks as though that's exactly what you do mean - it certainly
>> generates the problems you described earlier. And my one-line additional
>> test would (does in my testing) solve it properly - without it, we don't
>> get a chance to flush "theInsideStringSoFar" to tNuData, with the extra
>> line we do. And adding it is always safe (AFAICI).
>>
>> -- Alex.
>>
>>
>> On 17/10/2015 00:03, Alex Tweedly wrote:
>>
>>> Sorry, Mike, but can you describe what you mean by a "naked" line ?
>>> Is it simply one with no line delimiter after it ?
>>> i.e. could only happen on the very last line of a file of input ?
>>>
>>> Could that be solved by a simple test (after the various 'replace'
>>> statements)
>>>      if the last char of pData <> CR then put CR after pData
>>> before the parsing happens ?
>>>
>>> -- Alex.
>>>
>>>
>>> On 16/10/2015 17:19, Mike Kerner wrote:
>>>
>>>> No, the problem isn't that LC uses both LF and CR for ascii(10) and
>>>> ignores ascii(13).  That's just a personal problem.
>>>>
>>>> The problem, here, is that the csv parser handles a naked line and a
>>>> terminated line differently.  If the line is terminated, it parses it one
>>>> way, and if it is not, it parses it (incorrectly) a different way, which
>>>> makes me wonder if this is the latest version.
>>>>
>>>> On Fri, Oct 16, 2015 at 11:28 AM, Bob Sneidar <bobsneidar at iotecdigital.com> wrote:
>>>>
>>>>> But what if the cr or lf or crlf is inside quoted text, meaning it is not
>>>>> a delimiter? Oh, I'm afraid the deflector shield will be quite
>>>>> operational when your friends arrive.
>>>>>
>>>>> Bob S
>>>>>
>>>>>
>>>>> On Oct 16, 2015, at 08:04 , Alex Tweedly <alex at tweedly.net> wrote:
>>>>>> Hi Mike,
>>>>>>
>>>>>> thanks for that additional info.
>>>>>>
>>>>>> I *think* (it's been 3 years) I left them as <GS> (i.e. numtochar(29))
>>>>>> because I had some data including normal TAB characters within the cells
>>>>>> (!!) and thought <GS> was a safer bet - though of course nothing is
>>>>>> completely safe. It's then up to the caller to decide whether to do
>>>>>> "replace numtochar(29) with TAB in ...", or do TAB escaping, or whatever
>>>>>> they want.
>>>>>
>>>>>> As for the other bigger problem .... Oh dear = CR vs LF vs CRLF ....
>>>>>>
>>>>>> Are you on Mac or Windows or Linux ?
>>>>>> How is the LF delimited data getting into your app ?
>>>>>> Maybe we should just add a "replace numtochar(13) with CR in pData" ?
>>>>>>
>>>>>> (I confess to being confused by this - I know that LC does
>>>>>> auto-translation of line delimiters at various places, but I'm not sure
>>>>>> when it is, or isn't, completely safe. Maybe the easiest thing is to
>>>>>> just do all the translations ....
>>>>>>
>>>>>>    replace CRLF with CR in pData
>>>>>>    replace numtochar(10) with CR in pData
>>>>>>    replace numtochar(13) with CR in pData
>>>>>>
>>>>>> -- Alex.
>>>>>>
>>>>> _______________________________________________
>>>>> use-livecode mailing list
>>>>> use-livecode at lists.runrev.com
>>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>>> subscription preferences:
>>>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>>>>
>>>>>
>>>>




