CSV again.
Alex Tweedly
alex at tweedly.net
Thu Oct 29 20:50:31 EDT 2015
I did. And I get
test<VT>"
as expected. I'm obviously missing something here - but let's go
off-list until we figure it out ....
Here's my test script
on mouseUp
   local tmp, t1
   put quote & "test" & CR & quote & quote & quote & CR into tmp
   put csvToTab3(tmp) into t1
   put t1 & CR after msg
   repeat for each char x in t1
      put chartonum(x) & ":" & x & CR after msg
   end repeat
   replace numtochar(29) with "<GS>" in t1
   replace numtochar(11) with "<VT>" in t1
   replace TAB with "<TAB>" in t1
   put "[" & t1 & "]" & CR & CR after msg
end mouseUp
and my output is
test
"
116:t
101:e
115:s
116:t
11:
34:"
10:
[test<VT>"
]
Do you get something different? Can you please send me the output?
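One difference worth ruling out: my test script appends a CR after the closing quote, whereas the string as sent ends at the quote. Here is a minimal sketch that runs both variants side by side, assuming csvToTab3 is in the message path; showResult is a hypothetical helper made up for this test, not part of the csvToTab code.

```livecode
on mouseUp
   local tAsSent, tWithCR
   -- the string exactly as sent: "test<CR>""" with nothing after the last quote
   put quote & "test" & CR & quote & quote & quote into tAsSent
   -- the variant used in the test script above: same string plus a trailing CR
   put tAsSent & CR into tWithCR
   put showResult("as sent", csvToTab3(tAsSent)) after msg
   put showResult("with trailing CR", csvToTab3(tWithCR)) after msg
end mouseUp

-- hypothetical helper: make <VT> and <CR> visible and label the result
function showResult pLabel, pText
   replace numtochar(11) with "<VT>" in pText
   replace CR with "<CR>" in pText
   return pLabel & ": [" & pText & "]" & CR
end showResult
```

If the two lines differ, the place to look would be the final flush for input whose very last char is a quote: as far as I can tell from reading it, that path puts theInsideStringSoFar into the result without going through the two "replace" calls.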
Thanks
-- Alex.
On 30/10/2015 00:07, Mike Kerner wrote:
> Try using exactly the string I sent: "test<CR>"""
>
> I get test<CR>", when I think what you intend is test<VT>"
>
> On Thu, Oct 29, 2015 at 7:25 PM, Alex Tweedly <alex at tweedly.net> wrote:
>
>> On 29/10/2015 14:41, Mike Kerner wrote:
>>
>>> Belay that. Let's do this on the list.
>>>
>> Sure ...
>>> On Thu, Oct 29, 2015 at 10:22 AM, Mike Kerner <mike at mikekerner.com
>>> <mailto:mike at mikekerner.com>> wrote:
>>>
>>> 1) In v3, why did you remove the <HT> substitution? That just bit me.
>>>
>>>
>> Short answer : A bug.
>> Long answer : 2 bugs, but on the same line of code - so kind of just one
>> bug really :-)
>> Very Long Answer :
>> I had a version (say, 2.9) which I tested properly. Then I added some more
>> parameterization, and while doing that I thought "This line is wrong, it
>> shouldn't be doing "replace TAB with ...", it should be using one of these
>> new parameters". This was just plain wrong, so that's bug number 1.
>>
>> Then I later realized that there was no case where I would need to do the
>> "replace" as written - so I commented out the line (also, wrong - that's
>> bug number 2).
>>
>>
>> Solution:
>> I enclose below a new version, csvToTab4. The only change (in the card script)
>> is that line 37 changed from
>> -- replace pOldItemDelim with pNewTAB in theInsideStringSoFar
>> to
>> replace TAB with pNewTAB in theInsideStringSoFar
>>
>> And with that change it does (AFAIK) properly produce <GS> (or whatever
>> you pass in as pNewTAB) for any embedded TAB chars.
>>
>>> 2) I'm not sure we should bore everyone else with the details on the list,
>>> but I'd like to pick your brain about some of the details of what you're
>>> thinking in various parts of this as I intend to do some tweaking and
>>> commenting for future reference.
>>>
>> Yeah, it would be great to improve the comments, and hopefully explain
>> what it's doing.
>>
>> On 29/10/2015 15:01, Mike Kerner wrote:
>>
>>> So beyond the embedded <HT>, I found another issue. Let's say the string
>>> is
>>> "test<CR>"""
>>>
>>>
>>> The <CR> is not handled.
>>>
>> Hmmm - in my testing it is. I give it (the last line is the same as the
>> example you give)
>>
>> INPUT
>>
>> a,"b
>> c"
>> "c<TAB>d"
>> "e<CR>"""
>>
>> and get OUTPUT
>> a<TAB>b<VT>c
>> c<GS>d
>> e<VT>"
>>
>> which I think is correct. Do you have a more complex test case, or do you
>> get different results? Can you send me the case where you see the problem
>> (off-list)? Thanks.
>>
>>> Should you perhaps do your substitutions on the "inside", instead of on the
>>> "passedQuote"?
>>>
>> Hmmm - tempting, but no.
>> Firstly, it would need to do the replace in the current item both for
>> status = 'inside' and 'passedquote' because if you have input like
>> "one<TAB> two""three""four<TAB>five"
>> the status goes from 'inside' to 'passedquote' to 'inside' to
>> 'passedquote' to etc. and for the latter TAB character it is 'passedquote'.
>>
>> More generally, I want to do these substitutions in as few places as
>> possible (i.e. so that I am passing the longest possible string to the
>> engine to do a speedy 'replace'), so the best time to do that is after
>> 'passedquote'.
>>
>> New version
>> function CSVToTab4 pData, pOldLineDelim, pOldItemDelim, pNewCR, pNewTAB
>>    -- fill in defaults
>>    if pOldLineDelim is empty then put CR into pOldLineDelim
>>    if pOldItemDelim is empty then put COMMA into pOldItemDelim
>>    if pNewCR is empty then put numtochar(11) into pNewCR -- Use <VT> for quoted CRs
>>    if pNewTAB is empty then put numtochar(29) into pNewTAB -- Use <GS> (group separator) for quoted TABs
>>
>>    local tNuData -- contains tabbed copy of data
>>
>>    local tStatus, theInsideStringSoFar
>>
>>    -- Normalize line endings: REMOVED
>>    -- Will normally be correct already, only binfile: or similar could make this necessary
>>    -- and that exceptional case should be the caller's responsibility
>>
>>    put "outside" into tStatus
>>    set the itemdel to quote
>>    repeat for each item k in pData
>>       -- put tStatus && k & CR after msg
>>       switch tStatus
>>
>>          case "inside"
>>             put k after theInsideStringSoFar
>>             put "passedquote" into tStatus
>>             next repeat
>>
>>          case "passedquote"
>>             -- decide if it was a duplicated escapedQuote or a closing quote
>>             if k is empty then -- it's a duplicated quote
>>                put quote after theInsideStringSoFar
>>                put "inside" into tStatus
>>                next repeat
>>             end if
>>             -- not empty - so we remain inside the cell, though we have left the quoted section
>>             -- NB this allows for quoted sub-strings within the cell content !!
>>             replace pOldLineDelim with pNewCR in theInsideStringSoFar
>>             replace TAB with pNewTAB in theInsideStringSoFar
>>             put theInsideStringSoFar after tNuData
>>             -- no break here: falls through to "outside" to handle k itself
>>
>>          case "outside"
>>             replace pOldItemDelim with TAB in k
>>             -- and deal with the "empty trailing item" issue in Livecode
>>             replace (pNewTAB & pOldLineDelim) with pNewTAB & pNewTAB & CR in k
>>             put k after tNuData
>>             put "inside" into tStatus
>>             put empty into theInsideStringSoFar
>>             next repeat
>>          default
>>             put "defaulted"
>>             break
>>       end switch
>>    end repeat
>>
>>    -- and finally deal with the trailing item issue in input data
>>    -- i.e. the very last char is a quote, so there is no trigger to flush the
>>    -- last item
>>    if the last char of pData = quote then
>>       put theInsideStringSoFar after tNuData
>>    end if
>>
>>    return tNuData
>> end CSVToTab4
>>
>> -- Alex.
>>