CSV again.

Peter Haworth pete at lcsql.com
Mon May 7 20:23:42 EDT 2012


Thanks for this, Alex!

For list members, I am indebted to Alex for his original csv parsing code
which I used, with his permission, in my SQLiteAdmin application.

I will check out this code and see how it compares to the code currently
embedded in SQLiteAdmin.

Pete
lcSQL Software <http://www.lcsql.com>



On Mon, May 7, 2012 at 4:30 PM, Alex Tweedly <alex at tweedly.net> wrote:

> Some years ago, this list discussed the difficulties of parsing
> comma-separated-value file format; Richard Gaskin has a great article about
> it at http://www.fourthworld.com/embassy/articles/csv-must-die.html
>
> Following that discussion, I came up with some code to parse CSV in
> Livecode which was significantly faster than the straightforward methods
> (quoted in the above article). At the time, I put that speed gain down to
> two factors:
>
> 1. a way of looking at the problem "sideways" that enables a different
> approach
> 2. a 'clever' use of split + array access
>
> Recently the topic came up again, and I looked at the code again; I now
> realize that in fact the speed gain came entirely from the first of those
> two factors, and using split + arrays was not helpful. Livecode's chunk
> handling is (in this case) faster than using arrays (my only excuse is that
> I was new to Livecode, and so I was using techniques I was familiar with
> from other languages). So I revised the code to use chunk handling rather
> than split+arrays, and the resulting code runs about 40% faster, with the
> added benefit of being slightly easier to read and understand.  The only
> slightly mind-bending feature of the new code is the use of
>
>    set the lineDelimiter to quote
>    repeat for each line k in pData ....
>
> I find it hard to think about "lines" that aren't actually lines :-)
>
> So - for anyone who needs or wants more speed, here's the code
>
>  function CSV3Tab pData,pcoldelim
>>  local tNuData -- converted copy of data; columns split by numtochar(29)
>>  local tReturnPlaceholder -- replaces cr in field data to avoid line
>>  --                       breaks which would be misread as records;
>>  --                       replaced later during display
>>  local tEscapedQuotePlaceholder -- used for keeping track of quotes
>>  --                       in data
>>  local tInsideQuoted -- flag set while reading data between quotes
>>  local k -- current quote-delimited segment of pData
>>  --
>>  put numtochar(11) into tReturnPlaceholder -- vertical tab as
>>  --                       placeholder
>>  put numtochar(2)  into tEscapedQuotePlaceholder -- used to simplify
>>  --                       distinction between quotes in data and those
>>  --                       used in delimiters
>>  --
>>  if pcoldelim is empty then put comma into pcoldelim
>>  -- Normalize line endings:
>>  replace crlf with cr in pData          -- Win to UNIX
>>  replace numtochar(13) with cr in pData -- Mac to UNIX
>>  --
>>  -- Put placeholder in escaped quote (non-delimiter) chars:
>>  replace ("\"&quote) with tEscapedQuotePlaceholder in pData
>>  replace quote&quote with tEscapedQuotePlaceholder in pData
>>  --
>>  put space before pData   -- to avoid ambiguity of starting context
>>  put False into tInsideQuoted
>>  set the linedel to quote
>>  repeat for each line k in pData
>>    if (tInsideQuoted) then
>>      replace cr with tReturnPlaceholder in k
>>      put k after tNuData
>>      put False into tInsideQuoted
>>    else
>>      replace pcoldelim with numtochar(29) in k
>>      put k after tNuData
>>      put true into tInsideQuoted
>>    end if
>>  end repeat
>>  --
>>  delete char 1 of tNuData -- remove the leading space
>>  replace tEscapedQuotePlaceholder with quote in tNuData
>>  return tNuData
>> end CSV3Tab
>>
>>
> -- Alex.
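
For anyone puzzling over the lineDelimiter trick, here is a minimal sketch of
the idea in isolation. It is an illustration only, not part of Alex's code:
the handler name demoQuoteSplit and the sample string are made up for the
example. The point is that once escaped quotes have been swapped for a
placeholder, every remaining quote either opens or closes a quoted field, so
splitting the data on quote gives segments that alternate between "outside
quotes" and "inside quotes".

```livecode
command demoQuoteSplit
  local tData, tSegment, tInsideQuoted, tReport
  -- hypothetical sample record: the second comma-containing field is quoted
  put "a,b," & quote & "c,d" & quote & ",e" into tData
  -- same guard as in CSV3Tab: the first segment is always outside quotes,
  -- even if the data starts with a quote
  put space before tData
  put false into tInsideQuoted
  -- treat each run of text between quote characters as a "line"
  set the lineDelimiter to quote
  repeat for each line tSegment in tData
    put tInsideQuoted & ": [" & tSegment & "]" & return after tReport
    -- each quote crossed flips us between outside and inside
    put not tInsideQuoted into tInsideQuoted
  end repeat
  put tReport -- show the result in the message box
end demoQuoteSplit
```

This reports three segments flagged false, true, false. Only the middle one
(c,d) was inside quotes, so its comma is data and must be left alone, while
the commas in the other two segments are real column delimiters. To display
the output of CSV3Tab itself, the caller would presumably replace
numtochar(29) with tab and numtochar(11) with cr at the appropriate point.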