How trim: Bug in RegExp engine

Marielle Lange mlange at lexicall.org
Mon Oct 24 18:02:33 EDT 2005


In every program I know that implements regular expressions, the
equivalent of replaceText("A C","^ *","") returns "A C", not "C". I
tested it in BBEdit, and "A C" is returned. The greediness of the "*"
cannot be the cause, because greediness does not explain why the ^
seems to eat up the A. Perhaps the cause lies in the way the routine
is coded within Revolution, as it seems that the ^ absorbs the first
letter.

put replaceText("A C","^p","")   -> A C
put replaceText("A C","^","")     -> empty.
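
As a point of comparison, here is a minimal sketch of the same three
replacements in another engine. It assumes Python's re module, which is
not one of the programs tested above, and the variable names are
illustrative only.

```python
import re

subject = "A C"

# "^ *" is greedy, but there are no leading spaces to take, so it
# matches the empty string at position 0 and nothing is removed.
trimmed = re.sub(r"^ *", "", subject)
span = re.match(r"^ *", subject).span()

# "^p" cannot match: the subject does not start with "p".
no_p = re.sub(r"^p", "", subject)

# A bare anchor matches the empty string at position 0.
anchor_only = re.sub(r"^", "", subject)

# With real leading spaces, the same pattern trims them as intended.
really_trimmed = re.sub(r"^ *", "", "  A C")

print(repr(trimmed), span, repr(no_p), repr(anchor_only), repr(really_trimmed))
```

In that engine all three replacements leave "A C" untouched, and the
"^ *" match is the empty span at position 0, so the anchor itself
consumes no character.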

This looks like a bug.

Marielle

>
> Mark Greenberg wrote:
> Though it's academic now since Bob has his solutions, this isn't a  
> Rev bug; it's the way Regular Expressions work (or fail to in this  
> case).  The problem is in the greediness of the * quantifier.

> Though I can't say that I totally understand why, in cases where  
> the RegEx reduces to nothing after the optional parts are removed,  
> matching with either the ? or the * quantifiers causes unexpected  
> results, regardless of whether the RegEx is in Perl, egrep, or  
> wherever.

> This is because the RegEx engine continues to try to find a match  
> (to nothingness, I guess), consumes the entire string, and then  
> backtracks giving up one character at a time.  Why "C" instead of  
> "A C"?  I don't know, but my RegEx reference book (Mastering  
> Regular Expressions by Jeffrey E. F. Friedl) does warn against such  
> constructions as "^ *" with a lengthy explanation about greediness  
> of the * and ? quantifiers.


--------------------------------------------------------------------------------
Marielle Lange (PhD),  Psycholinguist

Alternative emails: mlange at blueyonder.co.uk, M.Lange at ed.ac.uk
Homepage                              http://homepages.lexicall.org/mlange/
Easy access to lexical databases      http://lexicall.org
Supporting Education Technologists    http://revolution.lexicall.org




