Read a text file

Alex Tweedly alex at tweedly.net
Mon Jan 17 20:36:48 EST 2005


Thomas Gutzmann wrote:

> I haven't tested the regular expressions in Rev, but in Perl, it
> would take some scratching of the head only to cope with commas
> embedded in quotes. Or some browsing in the Internet. But it depends
> on the quality of the RE parser.

Rev's RE library is based on PCRE, so should be adequately capable.

However, I don't think it's as easy to parse the realistic version of 
CSV with REs as you might think. Six months ago (when the earlier csv 
thread on this list started), I looked into it; took about 5 mins to 
convince me I couldn't do it myself without spending more time than I 
wanted in learning the obscure corners of regex. So I spent an hour or 
so searching the Internet, but didn't find anything even approaching the 
real cases you encounter in csv files.  Since there was an alternative 
that did everything being discussed then, and which was adequately fast, 
I didn't look any further.
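
For readers wondering what a non-regex approach can look like, here is a minimal sketch in Python of a character-by-character parser. It is not necessarily the alternative referred to above (the post does not say what that was); the function name `parse_csv` and its parameters are made up for illustration. It assumes quotes are escaped by doubling and that rows end in LF or CRLF.

```python
def parse_csv(text, delimiter=",", quote='"'):
    """Parse CSV text into a list of rows, each a list of strings.

    Hypothetical helper: handles delimiters and newlines inside quoted
    fields, and doubled quotes ("") as an escaped quote character.
    """
    rows, row, field = [], [], []
    in_quotes = False
    pending = False  # True once the current row has any content
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < n and text[i + 1] == quote:
                    field.append(quote)  # doubled quote is a literal quote
                    i += 1               # skip the second quote character
                else:
                    in_quotes = False    # closing quote
            else:
                field.append(ch)         # commas and newlines are data here
        elif ch == quote:
            in_quotes = True
            pending = True
        elif ch == delimiter:
            row.append("".join(field))   # end of field
            field = []
            pending = True
        elif ch == "\n":
            row.append("".join(field))   # end of field and of row
            rows.append(row)
            row, field = [], []
            pending = False
        elif ch != "\r":                 # ignore the CR of a CRLF pair
            field.append(ch)
            pending = True
        i += 1
    if pending:                          # last row had no trailing newline
        row.append("".join(field))
        rows.append(row)
    return rows
```

Because the parser carries the "inside quotes" state across characters, a quoted field that spans several physical lines needs no special handling.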

Since I got your mail, I've spent another hour or two idly browsing the 
net. I followed each of the top 20 hits from a couple of different Google 
searches. I found a lot of articles that claim to have a regex that 
handles csv files - but in fact their "coverage" ranges between 10% and 
70% of the cases I think I'd need to handle in real apps.

There is one very credible-looking article on a .NET regex that sounds 
like it might do more than that - but the regex used is clearly not 
going to succeed in PCRE - and indeed didn't in Python or Rev - so 
either it uses a feature of .NET that isn't in Perl/PCRE, or there's a 
typo, or something.

I believe that if there were a complete solution in regex, it would show 
up pretty high on Google, so I am now, still, of the opinion that the 
complete csv problem is beyond regex, even though the simpler cases can 
be done fairly easily.  I'd be delighted if someone could change that 
opinion.
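
To illustrate the "simpler cases" only, here is a rough sketch in Python of a regex that splits a single line, coping with commas inside quotes and doubled quotes. The pattern and the name `split_simple_csv_line` are assumptions made up for this example, not taken from any of the articles mentioned, and it makes no attempt at the harder cases.

```python
import re

# Hypothetical pattern: a comma, then either a quoted field (with "" as an
# escaped quote) or a bare field containing no commas.
FIELD = re.compile(r',(?:"((?:[^"]|"")*)"|([^,]*))')

def split_simple_csv_line(line):
    """Split one physical line of simple CSV into a list of fields."""
    text = "," + line  # leading comma so every field starts the same way
    fields = []
    pos = 0
    for m in FIELD.finditer(text):
        if m.start() != pos:
            # The previous field was followed by something other than a comma.
            raise ValueError("unparseable text near offset %d" % max(pos - 1, 0))
        quoted, bare = m.group(1), m.group(2)
        if quoted is not None:
            fields.append(quoted.replace('""', '"'))  # unescape doubled quotes
        else:
            fields.append(bare)
        pos = m.end()
    if pos != len(text):
        raise ValueError("unparseable text near offset %d" % max(pos - 1, 0))
    return fields

print(split_simple_csv_line('a,"b,c",d'))
print(split_simple_csv_line('"say ""hi""",x'))
```

One place this breaks down: if a quoted field contains a line break and the file is read line by line, the first physical line (for example `a,"first`) has no closing quote, so the pattern falls back to treating `"first` as a bare field and returns wrong data without complaint.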

-- Alex.




