perl regex modifiers
Dar Scott
dsc at swcp.com
Sat Jul 26 07:48:01 EDT 2003
On Friday, July 25, 2003, at 07:21 PM, Mark Brownell wrote:
> on mouseUp
> put "Do a <perl>web search for Perl regular expressions</perl> tutorials," into myVar
> put "<perl>(.*)(</perl>)" into regEx
>
> -- perlRegEx
> put the milliseconds into tStartTime
> repeat with x = 1 to 500
> put matchText(myVar, regEx, tElement) into bbYes
> end repeat
> put (the milliseconds - tStartTime) into ptTime
> answer tElement
>
> -- PNLP
> put the milliseconds into tStartTime
> repeat with i = 1 to 500
> put offset("<perl>", myVar) into tNumA
> put offset("</perl>", myVar) into tNumB
> put char (tNumA + 6) to (tNumB - 1) of myVar into tElement
> end repeat
> put (the milliseconds - tStartTime) into otTime
> answer tElement
>
> -- show results
> answer "perlRegEx = " & ptTime & ", PNLP = " & otTime
> end mouseUp
(Weird. 'myVar' is red in my script editor.)
Part of the timing difference arises because the two methods do not do
quite the same thing, so this is a bit of an apples-to-oranges comparison.
The regex matches the text between the first <perl> and the last
</perl>. If <perl></perl> can occur more than once, then that is not
what you need.
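
To illustrate the greedy behaviour described above, here is a minimal sketch in Python, whose `re` module follows the same greedy and non-greedy rules as Perl-style regexes. The sample string is made up for the illustration, and the second capture group from the quoted pattern is left out for brevity.

```python
import re

# Hypothetical sample text containing two <perl>...</perl> pairs.
text = "a <perl>one</perl> b <perl>two</perl> c"

# Greedy: .* grabs as much as it can, so the match ends at the LAST </perl>.
greedy = re.search(r"<perl>(.*)</perl>", text)

# Non-greedy: .*? grabs as little as it can, so the match ends at the FIRST </perl>.
lazy = re.search(r"<perl>(.*?)</perl>", text)

greedy_capture = greedy.group(1)
lazy_capture = lazy.group(1)

print(greedy_capture)  # one</perl> b <perl>two
print(lazy_capture)    # one
```

With only one pair in the string, as in the quoted example, both patterns capture the same text.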
The offset method finds the first occurrence of <perl> and the first
occurrence of </perl>, regardless of their order. It takes the text
between them, which is empty if a </perl> comes before the <perl>.
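
As a rough illustration of that order problem, the sketch below mimics the quoted offset approach in Python. It uses 0-based `str.find` in place of the 1-based offset function, the helper name `between_first_tags` is invented for this example, and both tags are assumed to be present in the input.

```python
def between_first_tags(s, open_tag="<perl>", close_tag="</perl>"):
    """Text between the first open tag and the first close tag.

    Both tags are searched for from the start of the string, independently
    of each other, as in the quoted script. Assumes both tags are present.
    """
    a = s.find(open_tag)   # index of the first <perl>
    b = s.find(close_tag)  # index of the first </perl>, wherever it is
    return s[a + len(open_tag):b]

# Tags in the expected order: the element text is returned.
print(repr(between_first_tags("Do a <perl>web</perl> now")))   # 'web'

# A stray </perl> before the first <perl>: the range is reversed, so the
# result is empty even though a complete pair follows.
print(repr(between_first_tags("x </perl> y <perl>z</perl>")))  # ''
```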
Also, the matchText method sets bbYes, which you will need, I assume.
The offset method doesn't.
Of course, some of these differences may not matter for what you need.
However, to compare these fairly, I would make both match only the first
<perl></perl> pair (ignoring embedded pairs). The offset method should
also set bbYes. This makes both take longer, but the resulting
difference in time is smaller.
Here is my try:
on mouseUp
put "Do a <perl>web search for Perl regular expressions</perl> tutorials," into myVar
put "<perl>(.*?)</perl>" into regEx -- Added ?
-- perlRegEx
put the long milliseconds into tStartTime
repeat with x = 1 to 500
put matchText(myVar, regEx, tElement) into bbYes
end repeat
put (the long milliseconds - tStartTime) into ptTime
answer tElement
-- PNLP
put the long milliseconds into tStartTime
repeat with i = 1 to 500
-- Added code to set bbYes
put offset("<perl>", myVar) into tNumA
if tNumA is 0 then
put false into bbYes
else
put offset("</perl>", myVar, tNumA+5) into tNumB -- Added chars to skip
if tNumB is 0 then
put false into bbYes
put empty into tElement
else
put char (tNumA + 6) to (tNumA + tNumB +4) of myVar into tElement
put true into bbYes
end if
end if
end repeat
put (the long milliseconds - tStartTime) into otTime
answer tElement
-- show results
answer "perlRegEx = " & ptTime & ", PNLP = " & otTime
end mouseUp
(I added long to the milliseconds so I get nontrivial times on my
computer.)
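
The index arithmetic in the offset branch is easy to get wrong, so here is a small Python sketch that checks it. It assumes that offset with a skip count returns a 1-based position relative to the end of the skipped characters, which is what the script above relies on. The helpers `offset`, `chars` and `first_element` are stand-ins written for this illustration, not LiveCode code, and unlike the script this sketch returns an empty element whenever either tag is missing.

```python
def offset(needle, haystack, skip=0):
    """1-based position of needle after skipping `skip` chars; 0 if absent.

    The position is counted from the end of the skipped characters
    (an assumption about the offset function that the script depends on).
    """
    idx = haystack.find(needle, skip)
    return 0 if idx < 0 else idx - skip + 1

def chars(s, first, last):
    """1-based inclusive range, like 'char first to last of s'."""
    return s[first - 1:last]

def first_element(s):
    """Return (found, text) for the first <perl>...</perl> pair in s."""
    a = offset("<perl>", s)
    if a == 0:
        return False, ""
    # <perl> occupies chars a..a+5, so skip a+5 chars to start after it.
    b = offset("</perl>", s, a + 5)
    if b == 0:
        return False, ""
    # </perl> starts at absolute position a+5+b, so the element ends one
    # character earlier, at a+b+4.
    return True, chars(s, a + 6, a + b + 4)

print(first_element("Do a <perl>web</perl> x"))       # (True, 'web')
print(first_element("<perl>a</perl><perl>b</perl>"))  # (True, 'a')
print(first_element("</perl> <perl>x</perl>"))        # (True, 'x')
print(first_element("<perl>no close tag"))            # (False, '')
```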
On my computer, matchText takes 30% longer than the offset method to
match your example string (using your timing code above), but it takes
twice as long to fail when I add a space in the last element. It may
take tweaking. (My timing shows less of a difference.)
Dar Scott
****************************************************************************
Dar Scott Consulting    http://www.swcp.com/dsc/    Programming Services
****************************************************************************