CSV again.

Alex Tweedly alex at tweedly.net
Thu Oct 29 20:50:31 EDT 2015


I did. And I get
test<VT>"

as expected. I'm obviously missing something here - but let's go 
off-list until we figure it out ....

Here's my test script
    on mouseUp

    local tmp, t1
    put quote & "test" & CR & quote & quote & quote &CR into tmp

    put csvToTab3(tmp) into t1
    put t1 & CR after msg
    repeat for each char x in t1
       put chartonum(x) & ":" & X & CR after msg
    end repeat
    replace numtochar(29) with "<GS>" in t1
    replace numtochar(11) with "<VT>" in t1
    replace TAB with "<TAB>" in t1
    put "[" & t1 & "]" & CR & CR  after msg
end mouseUp

and my output is
test
"

116:t
101:e
115:s
116:t
11:

34:"
10:

[test<VT>"
]
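
One difference that may be worth ruling out: the script above appends a CR after the closing quote, while the string as sent ends on the quote itself. A rough sketch that runs both variants through the same display code follows. compareTrailingCR and showInvisibles are made-up names used only for illustration; csvToTab3 is the function under test, as in the script above.

```livecode
on compareTrailingCR
   local tWithCR, tWithoutCR
   -- same string as the test script above: a CR follows the closing quote
   put quote & "test" & CR & quote & quote & quote & CR into tWithCR
   -- the string exactly as sent: the last char is a quote
   put quote & "test" & CR & quote & quote & quote into tWithoutCR
   put "with CR:    [" & showInvisibles(csvToTab3(tWithCR)) & "]" & CR after msg
   put "without CR: [" & showInvisibles(csvToTab3(tWithoutCR)) & "]" & CR after msg
end compareTrailingCR

-- hypothetical helper: make the separator characters visible
function showInvisibles pText
   replace numtochar(29) with "<GS>" in pText
   replace numtochar(11) with "<VT>" in pText
   replace TAB with "<TAB>" in pText
   replace CR with "<CR>" in pText
   return pText
end showInvisibles
```

If the two lines differ, the trailing CR is the variable to look at.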

Do you get something different? If so, can you please send me the output?
Thanks
-- Alex.




On 30/10/2015 00:07, Mike Kerner wrote:
> Try using exactly the string I sent: "test<CR>"""
>
> I get test<CR>", when I think what you intend is test<VT>"
>
> On Thu, Oct 29, 2015 at 7:25 PM, Alex Tweedly <alex at tweedly.net> wrote:
>
>> On 29/10/2015 14:41, Mike Kerner wrote:
>>
>>> Belay that.  Let's do this on the list.
>>>
>> Sure ...
>>> On Thu, Oct 29, 2015 at 10:22 AM, Mike Kerner <mike at mikekerner.com
>>> <mailto:mike at mikekerner.com>> wrote:
>>>
>>>      1) In v3, why did you remove the <HT> substitution?  That just bit me.
>>>
>>>
>> Short answer : A bug.
>> Long answer : 2 bugs, but on the same line of code - so kind of just one
>> bug really :-)
>> Very Long Answer :
>> I had a version (say, 2.9) which I tested properly. Then I added some more
>> parameterization, and while doing that I thought "This line is wrong, it
>> shouldn't be doing "replace TAB with ...", it should be using one of these
>> new parameters". This was just plain wrong, so that's bug number 1.
>>
>> Then I later realized that there was no case where I would need to do the
>> "replace" as written - so I commented out the line (also, wrong - that's
>> bug number 2).
>>
>>
>> Solution:
>> I enclose below a new version, csvToTab4. Only change (in the card script)
>> is that line 37 changed from
>>      -- replace pOldItemDelim with pNewTAB in theInsideStringSoFar
>> to
>>      replace TAB with pNewTAB in theInsideStringSoFar
>>
>> And with that change it does (AFAIK) properly produce <GS> (or whatever
>> you pass in as pNewTAB) for any embedded TAB chars.
>>
>>> 2) I'm not sure we should bore everyone else with the details on the list,
>>> but I'd like to pick your brain about some of the details of what you're
>>> thinking in various parts of this as I intend to do some tweaking and
>>> commenting for future reference.
>>>
>> Yeah, it would be great to improve the comments, and hopefully explain
>> what it's doing.
>>
>> On 29/10/2015 15:01, Mike Kerner wrote:
>>
>>> So beyond the embedded <HT>, I found another issue.  Let's say the string
>>> is
>>> "test<CR>"""
>>>
>>>
>>> The <CR> is not handled.
>>>
>> Hmmm - in my testing it is. I give it the following (the last line is the
>> same as the example you give):
>>
>> INPUT
>>
>> a,"b
>> c"
>> "c<TAB>d"
>> "e<CR>"""
>>
>> and get OUTPUT
>> a<TAB>b<VT>c
>> c<GS>d
>> e<VT>"
>>
>> which I think is correct. Do you have a more complex test case, or do you
>> get different results? Can you send me the case where you see the problem
>> (off-list)? Thanks.
>>
>>> Should you perhaps do your substitutions on the "inside", instead of on the
>>> "passedQuote"?
>>>
>> Hmmm - tempting, but no.
>> Firstly, it would need to do the replace in the current item both for
>> status = 'inside' and 'passedquote' because if you have input like
>>     "one<TAB> two""three""four<TAB>five"
>> the status goes from 'inside' to 'passedquote' to 'inside' to
>> 'passedquote' to etc. and for the latter TAB character it is 'passedquote'.
>>
>> More generally, I want to do these substitutions in as few places as
>> possible (i.e. so that I am passing the longest possible string to the
>> engine to do a speedy 'replace'), so the best time to do that is after
>> 'passedquote'.
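
The status switching described above can be watched directly with a stripped-down trace. This is only a sketch: traceQuoteStatus is a made-up handler that mirrors the way CSVToTab4 splits on quote and moves between statuses, but it performs no substitutions and builds no output data.

```livecode
on traceQuoteStatus pData
   local tStatus
   put "outside" into tStatus
   set the itemdel to quote
   repeat for each item k in pData
      -- show the status in force when this item is processed
      put tStatus & ": [" & k & "]" & CR after msg
      switch tStatus
         case "inside"
            -- a quote ended this item: either a closing or a doubled quote
            put "passedquote" into tStatus
            break
         case "passedquote"
            -- empty item: doubled quote, so we are back inside the quoted text
            -- non-empty item: unquoted text, and the quote ending it opens
            -- the next quoted section
            put "inside" into tStatus
            break
         case "outside"
            put "inside" into tStatus
            break
      end switch
   end repeat
end traceQuoteStatus
```

For the example above it could be called as
    traceQuoteStatus quote & "one" & TAB & " two" & quote & quote & "three" & quote & quote & "four" & TAB & "five" & quote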
>>
>> New version
>> function CSVToTab4 pData, pOldLineDelim, pOldItemDelim, pNewCR, pNewTAB
>>     -- fill in defaults
>>     if pOldLineDelim is empty then put CR into pOldLineDelim
>>     if pOldItemDelim is empty then put COMMA into pOldItemDelim
>>     if pNewCR is empty then put numtochar(11) into pNewCR   -- Use <VT> for quoted CRs
>>     if pNewTAB is empty then put numtochar(29) into pNewTAB   -- Use <GS> (group separator) for quoted TABs
>>
>>     local tNuData                         -- contains tabbed copy of data
>>
>>     local tStatus, theInsideStringSoFar
>>
>>     -- Normalize line endings: REMOVED
>>     -- Will normally be correct already; only binfile: or similar could make this necessary,
>>     -- and that exceptional case should be the caller's responsibility
>>
>>     put "outside" into tStatus
>>     set the itemdel to quote
>>     repeat for each item k in pData
>>        -- put tStatus && k & CR after msg
>>        switch tStatus
>>
>>           case "inside"
>>              put k after theInsideStringSoFar
>>              put "passedquote" into tStatus
>>              next repeat
>>
>>           case "passedquote"
>>              -- decide if it was a duplicated (escaped) quote or a closing quote
>>              if k is empty then   -- it's a duplicated quote
>>                 put quote after theInsideStringSoFar
>>                 put "inside" into tStatus
>>                 next repeat
>>              end if
>>              -- not empty - so we remain inside the cell, though we have left the quoted section
>>              -- NB this allows for quoted sub-strings within the cell content !!
>>              replace pOldLineDelim with pNewCR in theInsideStringSoFar
>>              replace TAB with pNewTAB in theInsideStringSoFar
>>              put theInsideStringSoFar after tNuData
>>              -- no break here: fall through so that k is handled as "outside" text
>>
>>           case "outside"
>>              replace pOldItemDelim with TAB in k
>>              -- and deal with the "empty trailing item" issue in Livecode
>>              replace (pNewTAB & pOldLineDelim) with pNewTAB & pNewTAB & CR in k
>>              put k after tNuData
>>              put "inside" into tStatus
>>              put empty into theInsideStringSoFar
>>              next repeat
>>           default
>>              put "defaulted"
>>              break
>>        end switch
>>     end repeat
>>
>>     -- and finally deal with the trailing item issue in input data
>>     -- i.e. the very last char is a quote, so there is no trigger to flush the last item
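>>     if the last char of pData = quote then
>>        -- NB: unlike the "passedquote" branch above, this flush does not apply
>>        --     the pNewCR / pNewTAB replacements to theInsideStringSoFar
>>        put theInsideStringSoFar after tNuData
>>     end if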
>>
>>     return tNuData
>> end CSVToTab4
>>
>> -- Alex.
>>
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>
>
>




