perl regex modifiers

Dar Scott dsc at swcp.com
Sat Jul 26 11:48:01 EDT 2003


On Friday, July 25, 2003, at 07:21 PM, Mark Brownell wrote:

> on mouseUp
>   put "Do a <perl>web search for Perl regular expressions</perl> tutorials," into myVar
>   put "<perl>(.*)(</perl>)" into regEx
>
>   -- perlRegEx
>   put the milliseconds into tStartTime
>   repeat with x = 1 to 500
>     put matchText(myVar, regEx, tElement) into bbYes
>   end repeat
>   put (the milliseconds - tStartTime) into ptTime
>   answer tElement
>
>   -- PNLP
>   put the milliseconds into tStartTime
>   repeat with i = 1 to 500
>     put offset("<perl>", myVar) into tNumA
>     put offset("</perl>", myVar) into tNumB
>     put char (tNumA + 6) to (tNumB - 1) of myVar into tElement
>   end repeat
>   put (the milliseconds - tStartTime) into otTime
>   answer tElement
>
>   -- show results
>   answer "perlRegEx = "  & ptTime   & ", PNLP = "  & otTime
> end mouseUp

(Weird.  'myVar' is red in my script editor.)

Part of the timing difference comes from comparing apples and oranges:  
the two methods are not doing quite the same work.

The regex matches the text between the first <perl> and the last  
</perl>.  If <perl></perl> can occur more than once, then that is not  
what you need.
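
To make the greedy behaviour concrete, here is a small sketch.  It is in 
Python rather than Transcript, only because that is easy to run and 
check; I am assuming the two regex engines agree on patterns this 
simple.  The sample string with two pairs is made up for the 
illustration.

```python
import re

# Hypothetical sample with two <perl> pairs (not from the original post).
text = "a <perl>one</perl> b <perl>two</perl> c"

# Greedy: (.*) grabs as much as it can, then backs off to the LAST </perl>.
greedy = re.search(r"<perl>(.*)</perl>", text)

# Lazy: (.*?) grabs as little as it can, so it stops at the FIRST </perl>.
lazy = re.search(r"<perl>(.*?)</perl>", text)

greedy_element = greedy.group(1)
lazy_element = lazy.group(1)

print(repr(greedy_element))
print(repr(lazy_element))
```

With a single pair in the string, as in the example script, both forms 
capture the same text; they only diverge once a second pair shows up.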

The offset method finds the first occurrence of <perl> and the first  
occurrence of </perl>, in whatever order they appear.  It takes the  
text between them, which is empty if a </perl> comes before the <perl>.
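
A sketch of that behaviour, again in Python for the sake of something 
runnable.  The helpers offset() and chars() are hypothetical stand-ins 
I wrote to mimic Transcript's 1-based offset function and "char a to b" 
chunk; they are not part of any library, and the sample strings are 
invented.

```python
def offset(needle, haystack, skip=0):
    """Stand-in for Transcript's offset(): 1-based position of needle,
    counted from just after the skipped chars; 0 if not found."""
    pos = haystack.find(needle, skip)
    return 0 if pos < 0 else pos - skip + 1


def chars(s, first, last):
    """Stand-in for 'char first to last of s' (1-based, inclusive).
    Gives empty when last is before first."""
    if last < first:
        return ""
    return s[max(first, 1) - 1:last]


def original_offset_method(s):
    # Both searches start at the beginning of the string, so the two
    # tags are located independently of each other.
    a = offset("<perl>", s)
    b = offset("</perl>", s)
    return chars(s, a + 6, b - 1)


print(repr(original_offset_method("Do a <perl>web search</perl> tutorials")))
print(repr(original_offset_method("x </perl> y <perl>z")))
```

The second call is the case where a closing tag comes first: the range 
runs backwards, so the result is empty even though both tags exist.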

Also, the matchText method sets bbYes, which you will need, I assume.   
The offset method doesn't.

Of course, some of these differences may not matter for what you need.

To compare them fairly, though, I would make both match only the first  
<perl></perl> pair (ignoring embedded pairs) and have the offset method  
set bbYes as well.  Both then take longer, but the difference in time  
between them is smaller.

Here is my try:

on mouseUp
   put "Do a <perl>web search for Perl regular expressions</perl> tutorials," into myVar
   put "<perl>(.*?)</perl>" into regEx  -- Added ?

   -- perlRegEx
   put the long milliseconds into tStartTime
   repeat with x = 1 to 500
     put matchText(myVar, regEx, tElement) into bbYes
   end repeat
   put (the long milliseconds - tStartTime) into ptTime
   answer tElement

   -- PNLP
   put the long milliseconds into tStartTime
   repeat with i = 1 to 500
     -- Added code to set bbYes
     put offset("<perl>", myVar) into tNumA
     if tNumA is 0 then
       put false into bbYes
     else
       put offset("</perl>", myVar, tNumA + 5) into tNumB -- Added chars to skip
       if tNumB is 0 then
         put false into bbYes
         put empty into tElement
       else
         put char (tNumA + 6) to (tNumA + tNumB + 4) of myVar into tElement
         put true into bbYes
       end if
     end if
   end repeat
   put (the long milliseconds - tStartTime) into otTime
   answer tElement

   -- show results
   answer "perlRegEx = "  & ptTime   & ", PNLP = "  & otTime
end mouseUp
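
The index arithmetic in the offset branch is easy to get wrong, so here 
is a sketch that checks it against the lazy regex.  It is Python, not 
Transcript; offset() and chars() are hypothetical helpers that imitate 
the Transcript function and chunk expression, and the test strings are 
made up.  One small difference from the script above: the sketch 
returns an empty element on every failure path.

```python
import re


def offset(needle, haystack, skip=0):
    """Stand-in for Transcript's offset(): result is 1-based and is
    counted from just after the skipped chars; 0 if not found."""
    pos = haystack.find(needle, skip)
    return 0 if pos < 0 else pos - skip + 1


def chars(s, first, last):
    """Stand-in for 'char first to last of s' (1-based, inclusive)."""
    if last < first:
        return ""
    return s[max(first, 1) - 1:last]


def first_element_by_offset(s):
    a = offset("<perl>", s)
    if a == 0:
        return False, ""
    # Skip everything up to and including the opening tag (a + 5 chars),
    # so b is counted from the first char of the element.
    b = offset("</perl>", s, a + 5)
    if b == 0:
        return False, ""
    # Closing tag starts at absolute position a + 5 + b, so the element
    # ends one char earlier, at a + b + 4.
    return True, chars(s, a + 6, a + b + 4)


def first_element_by_regex(s):
    m = re.search(r"<perl>(.*?)</perl>", s)
    return (True, m.group(1)) if m else (False, "")


samples = [
    "Do a <perl>web search</perl> tutorials,",
    "</perl> <perl>x</perl>",
    "<perl></perl>",
    "<perl>unterminated",
    "no tags here",
]
for s in samples:
    print(first_element_by_offset(s), first_element_by_regex(s))
```

On single-line strings like these the two functions should agree; text 
containing line breaks is a separate question, since "." in a regex 
does not match a newline by default.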

(I added long to the milliseconds so I get nontrivial times on my  
computer.)

On my computer matchText takes 30% longer than the offset method (using  
your timing code above) to match your example string, but it takes  
twice as long to fail when I add a space in the last element.  It may  
need some tweaking.
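
For anyone who wants to try the match and fail cases side by side, here 
is a rough harness.  It is Python, so its numbers say nothing about the 
Revolution engine; it only shows the shape of the measurement.  I am 
assuming "a space in the last element" means a space inside the closing 
tag, and by_regex and by_find are names invented for the sketch.

```python
import re
import timeit

PATTERN = re.compile(r"<perl>(.*?)</perl>")

ok_text = "Do a <perl>web search for Perl regular expressions</perl> tutorials,"
# Assumed failure case: break the closing tag so nothing can match.
bad_text = ok_text.replace("</perl>", "</perl >")


def by_regex(s):
    # On failure the lazy group is extended to the end of the string
    # before the engine gives up on this starting point.
    return PATTERN.search(s) is not None


def by_find(s):
    a = s.find("<perl>")
    if a < 0:
        return False
    # Look for the closing tag only after the opening tag.
    return s.find("</perl>", a + 6) >= 0


for label, s in (("match", ok_text), ("fail", bad_text)):
    t_regex = timeit.timeit(lambda: by_regex(s), number=500)
    t_find = timeit.timeit(lambda: by_find(s), number=500)
    print(label, "regex:", t_regex, "find:", t_find)
```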

(My timing shows less of a difference.)

Dar Scott

****************************************************************************
   Dar Scott Consulting    http://www.swcp.com/dsc/    Programming Services
****************************************************************************