Sample code for reading a CSV file

Paul Dupuis paul at researchware.com
Thu Feb 17 15:01:53 EST 2011


First, thanks to everyone who replied, but especially to Nosanity. Your 
code reminded me that you can effectively tell when you are inside an 
encapsulated bit of data by an odd/even count of the encapsulation 
character. So, for anyone who wants it, here is a generalized function 
that I just wrote to parse a CSV file, regardless of the field or record 
delimiters (commas, tabs or whatever) and to deal with encapsulation 
appropriately.
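
To illustrate the odd/even idea on its own, here is a minimal sketch. The 
helper name encapsulatedItems is hypothetical and is not part of 
csvToArray; it only lists the pieces of a string that fall inside the 
encapsulation character.

```livecode
# Hypothetical helper, for illustration only (not part of csvToArray).
# Returns the encapsulated pieces of pLine, one per line.
function encapsulatedItems pLine, pEncapsulationDelimiter
   local tResult
   set the itemDelimiter to pEncapsulationDelimiter
   repeat with i = 1 to the number of items in pLine
      # An even-numbered item follows an odd number of encapsulation
      # characters, so it lies inside an encapsulated field.
      if i mod 2 = 0 then
         put item i of pLine & return after tResult
      end if
   end repeat
   delete the last char of tResult
   return tResult
end encapsulatedItems
```

For example, splitting a,"b,c",d on the quote character gives three 
items (a, then b,c then ,d), and only item 2 is inside the quotes.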

This assumes you have read the entire CSV file into a variable, which you 
pass as pData, so a call would look like:

put csvToArray(myEntireCSVData,return,comma,quote) into myDataAsArray
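
A fuller sketch of that call, assuming the data lives in a text file: the 
file name example.csv and the mouseUp handler are placeholders, so adjust 
them to your own stack.

```livecode
on mouseUp
   local tPath, tData, tArray
   # Placeholder path: change this to point at your own CSV file.
   put specialFolderPath("documents") & "/example.csv" into tPath
   # Read the whole file into a variable in one go.
   put URL ("file:" & tPath) into tData
   put csvToArray(tData, return, comma, quote) into tArray
   # The array is indexed as tArray[record][field].
   answer tArray[2][3]
end mouseUp
```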

I have tested it a bit in the last 30 minutes and it is working in the 
cases I tried, but I did not test exhaustively and have not checked 
performance on large datasets. If anyone uses this and runs into an 
issue, please let me know.

function csvToArray pData, pRecordDelimiter, pFieldDelimiter, pEncapsulationDelimiter
   local tReservedRecordDelimiter, tReservedFieldDelimiter, tArray

   # Initialize the temporary record and field delimiters. Change these if your CSV file may contain them.
   put numToChar(1) into tReservedRecordDelimiter
   put numToChar(2) into tReservedFieldDelimiter

   # Step 1: Replace any Record or Field delimiters that are encapsulated with temporary characters
   set itemdel to pEncapsulationDelimiter
   repeat with i = 1 to the number of items in pData
     if trunc(i/2) = (i/2) then
       replace pFieldDelimiter with tReservedFieldDelimiter in item i of pData
       replace pRecordDelimiter with tReservedRecordDelimiter in item i of pData
     end if
   end repeat

   # Step 2: Remove all occurrences of the encapsulation delimiter
   replace pEncapsulationDelimiter with empty in pData

   # Step 3: Parse records and fields into the array, restoring any reserved record and field delimiters in each element
   set itemdel to pFieldDelimiter
   set lineDel to pRecordDelimiter
   repeat with i = 1 to the number of lines in pData
      repeat with j = 1 to the number of items in line i of pData
         get item j of line i of pData
         replace tReservedRecordDelimiter with pRecordDelimiter in it
         replace tReservedFieldDelimiter with pFieldDelimiter in it
         put it into tArray[i][j]
      end repeat
   end repeat

   # Step 4: return the array
   return tArray
end csvToArray
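
As a quick check, here is a small sketch that feeds the function two 
records, the second containing an encapsulated comma and an encapsulated 
return. The handler name testCsvToArray and the sample data are made up 
for illustration.

```livecode
on testCsvToArray
   local tData, tArray
   # Record 1: name,notes
   put "name,notes" & return into tData
   # Record 2: "Smith, J.","line one<return>line two"
   put quote & "Smith, J." & quote & comma after tData
   put quote & "line one" & return & "line two" & quote after tData
   put csvToArray(tData, return, comma, quote) into tArray
   # Intended result:
   #   tArray[1][1] = name         tArray[1][2] = notes
   #   tArray[2][1] = Smith, J.    (comma kept inside the field)
   #   tArray[2][2] = line one, a return, then line two
   put tArray[2][1] & return & tArray[2][2]
end testCsvToArray
```

With no destination given, the final put writes to the message box, so 
you can compare what you see against the comments.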


-- 
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/
