CSV-parsing post?
Ken Ray
kray at sonsothunder.com
Sat Dec 4 01:59:21 EST 2004
On 12/3/04 10:52 AM, "Richard Gaskin" <ambassador at fourthworld.com> wrote:
> MisterX wrote:
>
>>> Richard Gaskin wrote:
>>> I could have sworn I'd seen a post here on parsing CSV files
>>> recently, but now that I'm in a position to test it I can't
>>> turn it up.
>>>
>>> Would the poster kindly re-post? Thanks in advance -
>>
>> c for commas or c for concurrent ;)
>
> In "CSV" the "C" stands for "comma" ("comma-separated values"). "C"
> stands for "concurrent" in "CVS".
>
> You and I both currently use a method that walks through the data
> character by character, but the method I'm looking for uses the split
> command in a novel way with a ten-fold speed improvement.
Here's what I have from Alex Tweedly that uses split:
function CSV2Tab3 pData
   local tNuData -- contains tabbed copy of data
   local tReturnPlaceholder -- replaces cr in field data to avoid line
   -- breaks which would be misread as records;
   -- replaced later during display
   local tEscapedQuotePlaceholder -- used for keeping track of quotes
   -- in data
   local tInsideQuoted -- flag set while reading data between quotes
   --
   put numtochar(11) into tReturnPlaceholder -- vertical tab as
   -- placeholder
   put numtochar(2) into tEscapedQuotePlaceholder -- used to simplify
   -- distinction between quotes in data and those
   -- used in delimiters
   --
   -- Normalize line endings:
   replace crlf with cr in pData -- Win to UNIX
   replace numtochar(13) with cr in pData -- Mac to UNIX
   --
   -- Put placeholder in escaped quote (non-delimiter) chars:
   replace ("\" & quote) with tEscapedQuotePlaceholder in pData
   replace (quote & quote) with tEscapedQuotePlaceholder in pData --<NEW
   --
   put space before pData -- to avoid ambiguity of starting context
   split pData by quote
   put false into tInsideQuoted
   -- Elements alternate between text outside quotes and text inside them
   repeat for each element k in pData
      if (tInsideQuoted) then
         -- quoted text: protect embedded line breaks, keep commas as data
         replace cr with tReturnPlaceholder in k
         put k after tNuData
         put false into tInsideQuoted
      else
         -- unquoted text: commas here are field delimiters
         replace comma with tab in k
         put k after tNuData
         put true into tInsideQuoted
      end if
   end repeat
   --
   delete char 1 of tNuData -- remove the leading space
   replace tEscapedQuotePlaceholder with quote in tNuData
   return tNuData
end CSV2Tab3
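
For anyone who wants to try it, calling the function might look roughly like the sketch below. This is an illustration only and not part of Alex's code: the mouseUp handler, the file dialog, and the field named "Output" are all assumptions. It reads a file, converts it, and then turns the vertical-tab placeholder back into a return for each item before showing it.

```livecode
on mouseUp
   local tCSV, tTabbed, tRecord, tValue
   -- Ask for a file; the prompt text is arbitrary
   answer file "Select a CSV file:"
   if it is empty then exit mouseUp
   put url ("file:" & it) into tCSV
   put CSV2Tab3(tCSV) into tTabbed
   -- Fields are now tab-delimited, one record per line
   set the itemDelimiter to tab
   repeat for each line tRecord in tTabbed
      put item 1 of tRecord into tValue
      -- Restore line breaks that CSV2Tab3 stored as vertical tabs
      replace numtochar(11) with cr in tValue
      put tValue & cr after field "Output" -- hypothetical field name
   end repeat
end mouseUp
```

Restoring the placeholder per item rather than on the whole result keeps embedded returns from being mistaken for record breaks while looping over lines.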
Ken Ray
Sons of Thunder Software
Web site: http://www.sonsothunder.com/
Email: kray at sonsothunder.com