Stupid CSV tricks
Richard Gaskin
ambassador at FourthWorld.com
Wed Jun 12 17:28:01 EDT 2002
Some implementations of the CSV format do not consistently use
comma-separated values. Microsoft products and others use a comma only for
numeric values, with all others designated as text by enclosing them in
quotes, effectively using a quote-comma-quote delimiter.
To make parsing such files even more of a challenge, a quoted string can
contain any character, including commas and returns.
I've tried a number of algorithms for parsing these files efficiently, and
even explored the issue with Ken Ray and others, and the only robust
algorithm we've come up with yet is one which walks through each of the
characters to determine what is a delimiter and what is part of the data.
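For anyone who wants to see the shape of that character-walking approach, here is a minimal sketch. It is in Python rather than MetaTalk, purely for illustration, and it is not the code discussed with Ken Ray. The function name parse_csv is hypothetical, and it assumes that a doubled quote ("") inside a quoted string stands for a literal quote, which is common but not something every CSV producer guarantees.

```python
def parse_csv(text):
    """Walk the text one character at a time and return a list of rows,
    each row being a list of field strings."""
    rows, row, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    # Assumption: "" inside quotes is an escaped quote.
                    field.append('"')
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                # Inside quotes, commas and returns are plain data.
                field.append(ch)
        elif ch == '"':
            in_quotes = True  # opening quote
        elif ch == ',':
            # An unquoted comma ends the current field.
            row.append(''.join(field))
            field = []
        elif ch == '\n' or ch == '\r':
            # An unquoted return ends the record; treat CRLF as one ending.
            if ch == '\r' and i + 1 < n and text[i + 1] == '\n':
                i += 1
            row.append(''.join(field))
            rows.append(row)
            row, field = [], []
        else:
            field.append(ch)
        i += 1
    # Flush the last record when the text has no trailing return.
    if field or row:
        row.append(''.join(field))
        rows.append(row)
    return rows


sample = '1,"Smith, John","line one\nline two"\n2,"say ""hi""",3\n'
records = parse_csv(sample)
```

The single flag in_quotes is what decides whether a comma or return is a delimiter or data, which is exactly the context that a blanket replace on commas or returns cannot see.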
I'd like to find a faster method, but thus far all attempts at using the
replace command and replacetext function have fallen short in one way or
another.
Considering the ubiquity of this format, I would imagine I'm not the first
MetaTalker who has needed to parse CSV. Has anyone found an algorithm faster
than walking through the chars?
--
Richard Gaskin
Fourth World Media Corporation
Custom Software and Web Development for All Major Platforms
Developer of WebMerge 2.0: Publish any Database on Any Site
___________________________________________________________
Ambassador at FourthWorld.com http://www.FourthWorld.com
Tel: 323-225-3717 AIM: FourthWorldInc