CSV again.
Mike Kerner
MikeKerner at roadrunner.com
Thu Oct 29 20:07:47 EDT 2015
Try using exactly the string I sent: "test<CR>"""
I get test<CR>", when I think what you intend is test<VT>"
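
A possible cause, sketched from reading the CSVToTab4 script quoted below rather than from a confirmed fix: when the very last char of pData is a quote, the final flush appends theInsideStringSoFar to tNuData without running the two replace commands, so a quoted CR at the very end of the data is never converted. A test input that ends with a trailing CR would instead send the last cell through the "passedquote" branch, which may be why the two results differ. One way to keep both paths in step is to route them through a single helper. flushQuotedCell is a hypothetical name, not part of the posted script:

```livecode
-- Hypothetical helper (not in the posted script): apply the quoted-cell
-- substitutions in one place so that every flush path uses them.
function flushQuotedCell pCell, pOldLineDelim, pNewCR, pNewTAB
   replace pOldLineDelim with pNewCR in pCell
   replace TAB with pNewTAB in pCell
   return pCell
end flushQuotedCell
```

Under that assumption, both the "passedquote" branch and the final "if the last char of pData = quote" block would append flushQuotedCell(theInsideStringSoFar, pOldLineDelim, pNewCR, pNewTAB) to tNuData instead of the raw string.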
On Thu, Oct 29, 2015 at 7:25 PM, Alex Tweedly <alex at tweedly.net> wrote:
>
> On 29/10/2015 14:41, Mike Kerner wrote:
>
>> Belay that. Let's do this on the list.
>>
> Sure ...
>
>> On Thu, Oct 29, 2015 at 10:22 AM, Mike Kerner <mike at mikekerner.com> wrote:
>>
>> 1) In v3, why did you remove the <HT> substitution? That just bit me.
>>
>>
> Short answer : A bug.
> Long answer : 2 bugs, but on the same line of code - so kind of just one
> bug really :-)
> Very Long Answer :
> I had a version (say, 2.9) which I tested properly. Then I added some more
> parameterization, and while doing that I thought "This line is wrong, it
> shouldn't be doing "replace TAB with ...", it should be using one of these
> new parameters". This was just plain wrong, so that's bug number 1.
>
> Then I later realized that there was no case where I would need to do the
> "replace" as written - so I commented out the line (also, wrong - that's
> bug number 2).
>
>
> Solution:
> I enclose below a new version, csvToTab4. Only change (in the card script)
> is that line 37 changed from
> -- replace pOldItemDelim with pNewTAB in theInsideStringSoFar
> to
> replace TAB with pNewTAB in theInsideStringSoFar
>
> And with that change it does (AFAIK) properly produce <GS> (or whatever
> you pass in as pNewTAB) for any embedded TAB chars.
>
>> 2) I'm not sure we should bore everyone else with the details on the list,
>> but I'd like to pick your brain about some of the details of what you're
>> thinking in various parts of this as I intend to do some tweaking and
>> commenting for future reference.
>>
> Yeah, it would be great to improve the comments, and hopefully explain
> what it's doing.
>
> On 29/10/2015 15:01, Mike Kerner wrote:
>
>> So beyond the embedded <HT>, I found another issue. Let's say the string
>> is
>> "test<CR>"""
>>
>>
>> The <CR> is not handled.
>>
> Hmmm - in my testing it is. I give it (the last line is the same as the
> example you give)
>
> INPUT
>
> a,"b
> c"
> "c<TAB>d"
> "e<CR>"""
>
> and get OUTPUT
> a<TAB>b<VT>c
> c<GS>d
> e<VT>"
>
> which I think is correct. Do you have a more complex test case, or do you
> get different results? Can you send me the case where you see the problem
> (off-list)? Thanks.
>
>> Should you perhaps do your substitutions on the "inside", instead of on the
>> "passedQuote"?
>>
> Hmmm - tempting, but no.
>
> Firstly, it would need to do the replace in the current item both for
> status = 'inside' and 'passedquote' because if you have input like
> "one<TAB> two""three""four<TAB>five"
> the status goes from 'inside' to 'passedquote' to 'inside' to
> 'passedquote' and so on, and for the latter TAB character it is
> 'passedquote'.
>
> More generally, I want to do these substitutions in as few places as
> possible (i.e. so that I am passing the longest possible string to the
> engine to do a speedy 'replace'), so the best time to do that is after
> 'passedquote'.
>
> New version
> function CSVToTab4 pData, pOldLineDelim, pOldItemDelim, pNewCR, pNewTAB
>    -- fill in defaults
>    if pOldLineDelim is empty then put CR into pOldLineDelim
>    if pOldItemDelim is empty then put COMMA into pOldItemDelim
>    -- Use <VT> for quoted CRs
>    if pNewCR is empty then put numtochar(11) into pNewCR
>    -- Use <GS> (group separator) for quoted TABs
>    if pNewTAB is empty then put numtochar(29) into pNewTAB
>
>    local tNuData -- contains tabbed copy of data
>
>    local tStatus, theInsideStringSoFar
>
>    -- Normalize line endings: REMOVED
>    -- Will normally be correct already; only binfile: or similar could
>    -- make this necessary,
>    -- and that exceptional case should be the caller's responsibility
>
>    put "outside" into tStatus
>    set the itemdel to quote
>    repeat for each item k in pData
>       -- put tStatus && k & CR after msg
>       switch tStatus
>
>          case "inside"
>             put k after theInsideStringSoFar
>             put "passedquote" into tStatus
>             next repeat
>
>          case "passedquote"
>             -- decide if it was a duplicated escapedQuote or a closing quote
>             if k is empty then -- it's a duplicated quote
>                put quote after theInsideStringSoFar
>                put "inside" into tStatus
>                next repeat
>             end if
>             -- not empty - so we remain inside the cell, though we have
>             -- left the quoted section
>             -- NB this allows for quoted sub-strings within the cell content !!
>             replace pOldLineDelim with pNewCR in theInsideStringSoFar
>             replace TAB with pNewTAB in theInsideStringSoFar
>             put theInsideStringSoFar after tNuData
>             -- no "break" here: falls through to "outside" to handle k
>
>          case "outside"
>             replace pOldItemDelim with TAB in k
>             -- and deal with the "empty trailing item" issue in Livecode
>             replace (pNewTAB & pOldLineDelim) with pNewTAB & pNewTAB & CR in k
>             put k after tNuData
>             put "inside" into tStatus
>             put empty into theInsideStringSoFar
>             next repeat
>
>          default
>             put "defaulted"
>             break
>       end switch
>    end repeat
>
>    -- and finally deal with the trailing item issue in input data
>    -- i.e. the very last char is a quote, so there is no trigger to flush
>    -- the last item
>    if the last char of pData = quote then
>       put theInsideStringSoFar after tNuData
>    end if
>
>    return tNuData
> end CSVToTab4
>
> -- Alex.
>
--
On the first day, God created the heavens and the Earth
On the second day, God created the oceans.
On the third day, God put the animals on hold for a few hours,
and did a little diving.
And God said, "This is good."