CSV-parsing post?

Ken Ray kray at sonsothunder.com
Sat Dec 4 01:59:21 EST 2004


On 12/3/04 10:52 AM, "Richard Gaskin" <ambassador at fourthworld.com> wrote:

> MisterX wrote:
> 
>>> Richard Gaskin wrote:
>>> I could have sworn I'd seen a post here on parsing CSV files
>>> recently, but now that I'm in a position to test it I can't
>>> turn it up.
>>> 
>>> Would the poster kindly re-post?  Thanks in advance -
>> 
>> c for commas or c for concurrent ;)
> 
> In "CSV" the "C" stands for "comma" ("comma-separated values").  "C"
> stands for "concurrent" in "CVS".
> 
> You and I both currently use a method that walks through the data
> character by character, but the method I'm looking for uses the split
> command in a novel way with a ten-fold speed improvement.

Here's what I have from Alex Tweedly that uses split:

function CSV2Tab3 pData
   local tNuData -- contains tabbed copy of data
   local tReturnPlaceholder -- replaces cr in field data to avoid line
   --                       breaks which would be misread as records;
   --                       replaced later during display
   local tEscapedQuotePlaceholder -- used for keeping track of quotes
   --                       in data
   local tInsideQuoted -- flag set while reading data between quotes
   --
   put numtochar(11) into tReturnPlaceholder -- vertical tab as
   --                       placeholder
   put numtochar(2)  into tEscapedQuotePlaceholder -- used to simplify
   --                       distinction between quotes in data and those
   --                       used in delimiters
   --
   -- Normalize line endings:
   replace crlf with cr in pData          -- Win to UNIX
   replace numtochar(13) with cr in pData -- Mac to UNIX
   --
   -- Put placeholder in escaped quote (non-delimiter) chars:
   replace ("\"&quote) with tEscapedQuotePlaceholder in pData
   replace quote&quote with tEscapedQuotePlaceholder in pData --<NEW
   -- (caution: an empty quoted field, "", also matches the line above)
   --
   put space before pData   -- to avoid ambiguity of starting context
   split pData by quote
   put False into tInsideQuoted
   repeat for each element k in pData
     if (tInsideQuoted) then
       replace cr with tReturnPlaceholder in k
       put k after tNuData
       put False into tInsideQuoted
     else
       replace comma with tab in k
       put k after tNuData
       put true into tInsideQuoted
     end if
   end repeat
   --
   delete char 1 of tNuData -- remove the leading space
   replace tEscapedQuotePlaceholder with quote in tNuData
   return tNuData
end CSV2Tab3
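
A minimal usage sketch, for anyone who wants to try it. The mouseUp
handler, the variable names, and the sample data are illustrative
assumptions, not part of Alex's function, and it assumes CSV2Tab3 is
somewhere in the message path:

on mouseUp
   local tCSV, tTabbed, tField
   -- two records; the second starts with a quoted field containing a comma
   put "name,comment" & cr & quote & "Smith, J" & quote & ",ok" into tCSV
   put CSV2Tab3(tCSV) into tTabbed
   --
   set the itemDelimiter to tab
   put item 1 of line 2 of tTabbed into tField -- should be: Smith, J
   -- restore any line breaks that were inside the quoted field:
   replace numtochar(11) with cr in tField
   answer tField
end mouseUp

Restoring the vertical-tab placeholder one field at a time, rather than
across the whole result, keeps embedded line breaks from being misread
as record boundaries.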


Ken Ray
Sons of Thunder Software
Web site: http://www.sonsothunder.com/
Email: kray at sonsothunder.com



