Once upon a time?

Jim Ault JimAultWins at yahoo.com
Sat Feb 3 13:54:21 EST 2007


On 2/3/07 5:24 AM, "David Bovill" <david at openpartnership.net> wrote:
>Now I wonder which will be fastest:
>get wordOffset(word1, tText)
>if (it > 0) AND (it = (wordOffset(word2, tText) - 1)) then
>or
>
> put space into P
> put replaceText("once   upon a  time","\s+",P)  into cleanVar
> put P&"once upon"&P is in P&cleanVar&P = true

RegEx is a slower technique almost every time.
The larger the text block, the more hits, and the larger the number of text
blocks, the greater the demand.

Regex is an engine that actually scans back and forth through a text block
and follows rules.  The simpler the rules you give it, the shorter the
processing time.

Using Rev's chunking ability will always be about 10-100 times faster.
However, on a field of 100 lines the difference will not be noticeable.  I use
some heavy regEx to parse web pages every day, every minute, because I need
pinpoint accuracy and data mining rather than the fastest execution.  Lots of
rules, lots of steps.  Chunking just won't do it without a lot of 'IF'
statements.
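
If you want to check the difference on your own data, a rough timing sketch
might look like the one below.  The handler, the variable names, and the
1000-line sample text are all made up for illustration, and the numbers will
vary with the machine and the engine version, so treat it as a starting point
rather than a benchmark.

```livecode
on mouseUp
   -- build a sample text block (hypothetical test data)
   repeat 1000 times
      put "once   upon a  time" & cr after tText
   end repeat

   -- regex version: collapse runs of spaces and tabs into one space
   put the milliseconds into tStart
   put replaceText(tText, "[ \t]+", space) into tRegexResult
   put the milliseconds - tStart into tRegexTime

   -- chunk version: rebuild each line word by word
   put the milliseconds into tStart
   repeat for each line tLine in tText
      put empty into tNewLine
      repeat for each word tWord in tLine
         put tWord & space after tNewLine
      end repeat
      delete last char of tNewLine
      put tNewLine & cr after tChunkResult
   end repeat
   put the milliseconds - tStart into tChunkTime

   -- report both timings in the message box
   put "regex:" && tRegexTime && "ms, chunks:" && tChunkTime && "ms"
end mouseUp
```

The pattern here is "[ \t]+" rather than "\s+" because "\s" also matches the
return character and would join the lines together.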

In Rev, this is actually very fast.  You can extract only the words on the
lines where they live by:

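repeat for each line LNN in textBlock
   -- build each line separately so an empty line cannot eat the previous cr
   put empty into newLine
   repeat for each word WRD in LNN
      put WRD & space after newLine
   end repeat
   delete last char of newLine  -- drop the trailing space
   put newLine & cr after newTextBlock
end repeat
delete last char of newTextBlock  -- drop the final cr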

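Of course, this example collapses tabs and runs of spaces into single spaces.
Punctuation stays attached to the words, since Rev's word chunks are delimited
by whitespace only.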

Jim Ault
Las Vegas




