Sample code for reading a CSV file
Paul Dupuis
paul at researchware.com
Thu Feb 17 15:01:53 EST 2011
First, thanks to everyone who replied, but especially to Nosanity. Your
code reminded me that you can effectively tell when you are inside an
encapsulated bit of data by an odd/even count of the encapsulation
character. So, for anyone who wants it, here is a generalized function
that I just wrote to parse a CSV file, regardless of the field or record
delimiters (commas, tabs or whatever) and to deal with encapsulation
appropriately.
This assumes you read the entire CSV file into a variable you pass into
pData, so a call would look like:
put csvToArray(myEntireCSVData,return,comma,quote) into myDataAsArray
I have tested it a bit in the last 30 minutes and it working in the
cases I tried, but did not test exhaustively and have not checked
performance on large datasets. If any one uses this and run into an
issue, please let me know.
function csvToArray pData, pRecordDelimiter, pFieldDelimiter,
pEncapsulationDelimiter
local tReservedRecordDelimiter, tReservedFieldDelimiter, tArray
# Initialize the temporary record and field delimiters. Change these
if your CSV file may contain them.
put charToNum(1) into tReservedRecordDelimiter; put charToNum(2) into
tReservedFieldDelimiter;
# Step 1: Replace any Record or Field delimiters that are
encapsulated with temporary characters
set itemdel to pEncapsulationDelimiter
repeat with i = 1 to the number of items in pData
if trunc(i/2) = (i/2) then
replace pFieldDelimiter with tReservedFieldDelimiter in item i of
pData
replace pRecordDelimiter with tReservedRecordDelimiter in item i
of pData
end if
end repeat
# Step 2: Replace all occurances of the encapsulation delimiter
replace pEncapsulationDelimiter with empty in pData
# Step 3: Parse records and fields into the array, replace any
occurances of the reserved record and field delimiters for each element
set itemdel to pFieldDelimiter
set lineDel to pRecordDelimiter
repeat with i = 1 to the number of lines in pData
repeat with j = 1 to the number of items in line i of pData
get item j of line i of pData
replace tReservedRecordDelimiter with pRecordDelimiter in it
replace tReservedFieldDelimiter with pFieldDelimiter in it
put it into tArray[i][j]
end repeat
end repeat
# Step 4: return the array
return tArray
end csvToArray
--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/
More information about the use-livecode
mailing list