Regex (MatchText) speed

Mark Brownell gizmotron at earthlink.net
Wed Jun 23 21:37:03 EDT 2004


On Wednesday, June 23, 2004, at 06:25  PM, Troy Rollins wrote:

> Anyone have a real-world sense of the speed difference? I have a 
> parsing routine which I put together hastily, knowing that it would 
> need to be later optimized. I'm edging in on that optimization phase, 
> and I'm wondering what angle I might want to approach it. Speed is 
> definitely a concern.

I tried fooling around with a few things to pull-parse non-SGML 
well-formed text. I got good results from offset() in some cases.

Here's this pull-parser stuff again:

-- put getElement("<record>", "</record>", tZap) into theElement
function getElement tStTag, tEdTag, stngToSch
   put empty into zapped
   put the number of chars in tStTag into dChars
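   put offset(tStTag,stngToSch) into tNum1
   if tNum1 < 1 then
     return "error"
   end if
   -- tSkip is the position of the last char of the start tag
   put tNum1 + dChars - 1 into tSkip
   -- look for the end tag only after the start tag; offset() then
   -- returns a position relative to the skip point
   put offset(tEdTag,stngToSch,tSkip) into tNum2
   if tNum2 < 1 then
     return "error"
   end if
   put char (tSkip + 1) to (tSkip + tNum2 - 1) of stngToSch into zapped
   return zapped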
end getElement

-- put getAttribute("name", tZap) into theAttribute
function getAttribute tAttribute, strngToSearch
   put empty into zapA
   put quote into Qx
   if char 1 of tAttribute = space then
     put tAttribute & "=" & Qx into tAttributeX
   else
     put space & tAttribute & "=" & Qx into tAttributeX
   end if
   put the number of chars in tAttributeX into dChars
   put offset(tAttributeX,strngToSearch) into tNum1
   if tNum1 < 1 then
     return "error"
   end if
   -- tNumX is the position of the first char of the attribute value
   put tNum1 + dChars into tNumX
   -- skip tNumX - 1 chars so the closing quote of an empty value is
   -- still found; offset() returns a position relative to the skip point
   put offset(Qx,strngToSearch,tNumX - 1) into tNumZ
   if tNumZ < 1 then
     return "error"
   end if
   put char tNumX to (tNumX + tNumZ - 2) of strngToSearch into zapA
   return zapA
end getAttribute

-- put getElementsArray("<record>", "</record>", tZap) into theArray
function getElementsArray tStartTag, tEndTag, StringToSearch
   put empty into tArray
   put 0 into tStart1
   put 0 into tStart2
   put 1 into tElementNum
   put the number of chars in tStartTag into dChars
   repeat
     put offset(tStartTag,StringToSearch,tStart1) into tNum1
     put (tNum1 + tStart1) into tStart1
     if tNum1 < 1 then exit repeat
     put offset(tEndTag,StringToSearch,tStart2) into tNum2
     put (tNum2 + tStart2) into tStart2
     if tNum2 < 1 then exit repeat
     --if tNum2 < tNum1 then exit repeat
     put char (tStart1 + dChars) to (tStart2 - 1) of StringToSearch into zapped
     put zapped into tArray[tElementNum]
     add 1 to tElementNum
   end repeat
   return tArray
end getElementsArray
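
For the speed question itself, one rough approach is to time both techniques on your own data with the milliseconds. The following is only a sketch: it assumes getElement above is in the same script, and the sample text, the loop count of 10000, the regex pattern, and the mouseUp handler are all made up for illustration. Results will depend heavily on the size and shape of the text, so no numbers are claimed here.

```livecode
-- Rough timing harness (illustrative assumptions throughout).
on mouseUp
   local tZap, tStart, tOffsetTime, tRegexTime, tFound, tMatch, tReport
   -- hypothetical single-line sample; substitute real data
   put "<record>some sample text</record>" into tZap

   -- time the offset()-based function
   put the milliseconds into tStart
   repeat 10000 times
      put getElement("<record>", "</record>", tZap) into tFound
   end repeat
   put the milliseconds - tStart into tOffsetTime

   -- time a comparable matchText() call; tMatch receives the capture
   put the milliseconds into tStart
   repeat 10000 times
      get matchText(tZap, "<record>(.*?)</record>", tMatch)
   end repeat
   put the milliseconds - tStart into tRegexTime

   -- show both timings in the message box
   put "offset:" && tOffsetTime && "ms" into tReport
   put return & "matchText:" && tRegexTime && "ms" after tReport
   put tReport
end mouseUp
```

Note that "." in the pattern does not match line breaks by default, so multi-line records would need a different pattern before the comparison is fair.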

Mark



