Stupid CSV tricks

Richard Gaskin ambassador at FourthWorld.com
Wed Jun 12 17:28:01 EDT 2002


Some implementations of the CSV format do not consistently use
comma-separated values.  Microsoft products and others use a bare comma only
between numeric values; all other values are designated as text by enclosing
them in quotes, so the delimiter between two text fields is effectively
quote-comma-quote.

To make parsing such files even more of a challenge, a quoted string can
contain any character, including commas and returns.
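
To make the problem concrete, here is a small illustration in Python
rather than MetaTalk. The sample record is made up for the example; it only
shows how a plain split on commas cuts a quoted field in two.

```python
# A made-up record: a number, a quoted text field containing a comma, a number.
line = '1,"Smith, John",42'

# Splitting on every comma ignores the quotes, so the text field is broken
# into two pieces and the quote characters are left in the data.
naive = line.split(",")
print(naive)  # ['1', '"Smith', ' John"', '42']
```

Three fields go in and four pieces come out, which is why the delimiter
cannot be found without knowing whether the parser is inside quotes.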

I've tried a number of algorithms for parsing these files efficiently, and
even explored the issue with Ken Ray and others, and the only robust
algorithm we've come up with so far is one which walks through each of the
characters to determine what is a delimiter and what is part of the data.
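
A minimal sketch of that character-walking approach, written in Python
rather than MetaTalk, might look like the following. It is not the code
discussed above. The function name parse_csv is hypothetical, and the
treatment of a doubled quote inside a quoted field as one literal quote is
an assumption based on common CSV practice, not something stated in this
post.

```python
def parse_csv(text, delimiter=",", quote='"'):
    """Walk the text one character at a time and return a list of records,
    each record being a list of field strings."""
    records, record, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == quote:
                # Assumption: a doubled quote inside quotes is a literal quote.
                if i + 1 < n and text[i + 1] == quote:
                    field.append(quote)
                    i += 1
                else:
                    in_quotes = False
            else:
                # Commas and returns inside quotes are ordinary data.
                field.append(ch)
        elif ch == quote:
            in_quotes = True
        elif ch == delimiter:
            record.append("".join(field))
            field = []
        elif ch in "\r\n":
            # Treat CR, LF, and CRLF as a single record separator.
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1
            record.append("".join(field))
            records.append(record)
            record, field = [], []
        else:
            field.append(ch)
        i += 1
    # Flush a final record that has no trailing line ending.
    if field or record:
        record.append("".join(field))
        records.append(record)
    return records


# Made-up sample data: quoted commas, a quoted return, and doubled quotes.
sample = '1,"Smith, John","line one\nline two"\r\n2,"say ""hi""",3.5\n'
for row in parse_csv(sample):
    print(row)
```

The single in_quotes flag is what decides whether a comma or return is a
delimiter or data, and it is also why every character has to be visited.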

I'd like to find a faster method, but thus far all attempts at using the
replace command and replacetext function have fallen short in one way or
another.

Considering the ubiquity of this format, I would imagine I'm not the first
MetaTalker who has needed to parse CSV.  Has anyone found an algorithm
faster than walking through the chars?

-- 
 Richard Gaskin 
 Fourth World Media Corporation
 Custom Software and Web Development for All Major Platforms
 Developer of WebMerge 2.0: Publish any Database on Any Site
 ___________________________________________________________
 Ambassador at FourthWorld.com       http://www.FourthWorld.com
 Tel: 323-225-3717                       AIM: FourthWorldInc



