Regex (MatchText) speed

Richard Gaskin ambassador at fourthworld.com
Wed Jun 23 21:32:52 EDT 2004


Troy Rollins wrote:

> I took a look through the archives, and didn't see anything definitive 
> about speed advantages in Rev of using matchText with regEx, compared to 
> more basic "chunking" techniques - including "contains". I find Rev is a 
> string handling monster with Transcript alone, but don't know just what 
> it is actually doing with Regex... for instance is it "a Transcript 
> regex engine" or some kind of compiled external? Or a compiled internal?
> 
> I noted that Tuviah says that the regex engine caches the last 20 
> patterns, but...
> 
> Anyone have a real-world sense of the speed difference? I have a parsing 
> routine which I put together hastily, knowing that it would need to be 
> later optimized. I'm edging in on that optimization phase, and I'm 
> wondering what angle I might want to approach it. Speed is definitely a 
> concern.

Results can vary, depending on what you're doing.  The best method 
(though admittedly tedious) is to implement both and time them.
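
As a rough illustration of "implement both and time them", here is a minimal sketch. It is written in Python rather than Transcript, purely as a stand-in. The names time_it, contains_plain, contains_regex and the test data are made up for this example, and whatever numbers it prints say nothing about Rev's own engine.

```python
import re
import time

def time_it(func, arg, runs=10000):
    """Return the seconds taken to call func(arg) `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        func(arg)
    return time.perf_counter() - start

# Hypothetical test data and the two candidates being compared.
TEXT = "lorem ipsum " * 200 + "needle" + " dolor sit amet" * 200
PATTERN = re.compile(r"needle")

def contains_plain(text):
    # Plain substring test, loosely analogous to "contains".
    return "needle" in text

def contains_regex(text):
    # Regex test, loosely analogous to matchText.
    return PATTERN.search(text) is not None

if __name__ == "__main__":
    # Check that both give the same answer before trusting any timing.
    assert contains_plain(TEXT) == contains_regex(TEXT)
    print("plain:", time_it(contains_plain, TEXT))
    print("regex:", time_it(contains_regex, TEXT))
```

The same shape should carry over to any language: run each candidate many times on realistic data, confirm they agree, and only then compare the elapsed times.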

In one specific case I needed to parse HTML attributes. I tried both 
regex and a combination of offset and replace, and the more generalized 
regex took about twice as long.  I've seen similar results with parsing 
HTML tags, but I have done little benchmarking with regex, on the 
assumption that its generalized conveniences will usually perform 
slower than a custom algorithm for the job at hand.
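
To make the two styles concrete, here is a hedged sketch of pulling an attribute value out of an HTML tag both ways. It is Python, not Transcript, and it is not the code from the case described above: attr_regex and attr_offset are hypothetical names, and both assume the attribute is preceded by a single space and its value is wrapped in double quotes.

```python
import re

def attr_regex(tag, name):
    """Return the double-quoted value of attribute `name`, or None (regex version)."""
    match = re.search(" " + re.escape(name) + '="([^"]*)"', tag)
    return match.group(1) if match else None

def attr_offset(tag, name):
    """Same result using plain substring searches instead of a regex."""
    marker = " " + name + '="'
    start = tag.find(marker)
    if start == -1:
        return None          # attribute not present
    start += len(marker)     # first character of the value
    end = tag.find('"', start)
    if end == -1:
        return None          # no closing quote
    return tag[start:end]

if __name__ == "__main__":
    # Hypothetical sample tag for illustration only.
    tag = '<a href="http://example.com/" title="Home">'
    for name in ("href", "title", "class"):
        print(name, attr_regex(tag, name), attr_offset(tag, name))
```

The regex version is shorter and easier to generalize, while the offset version spells out each step by hand. Which one runs faster depends on the engine and the data, so it still has to be timed.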

I would love to be proven wrong, however; crafting custom algorithms for 
every little text parsing task is indeed tedious. :)

-- 
  Richard Gaskin
  Fourth World Media Corporation
  ___________________________________________________
  Rev tools and more:  http://www.fourthworld.com/rev

