Matchtext script results
J. Landman Gay
jacque at hyperactivesw.com
Thu Nov 30 18:07:08 EST 2006
I was going to post a compilation of all the scripts I received in
response to my matchtext question, but there were so many that the post
would likely be too long for the listserve. I very much enjoyed looking
at all the solutions, and was impressed all over again at how many
different ways there are to solve a problem in Revolution.
What I did was copy each contribution to a test stack, and alter it
slightly as necessary to fit into this structure:
on mouseUp -- in a button
  findWords "list house dog"
end mouseUp
on findwords pWords
  put "/Users/jgay/Comm Data/Humor/" into tDir
  set the directory to tDir
  put the files into tFiles
  put the ticks into tStart
  repeat for each line l in tFiles
    put url ("file:" & l) into tText
    -- unique handler contents inserted here
    if tMatch then put l & cr after tList
  end repeat
  put the ticks - tStart
  put tList into fld 1
end findwords
Where needed, I did any text replacements before starting the timer
(replacing spaces with commas, for example) and also ran any setup code
before the timer began (putting a string into a "goodchars" variable,
for instance). Only the actual file processing was timed. I tested with
a folder containing 118 amusing text files of varying lengths. Two files
in the folder contained all three of the test words. I measured in ticks
(1/60 of a second each) because a general idea of speed was good enough
for my purposes.
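For anyone who wants finer resolution than ticks, the same timing pattern can be sketched with "the milliseconds" instead. This is only an illustration and not what I used for the numbers in this post; the empty loop stands in for whatever work is being timed, and tElapsed is just a name picked for the example.

```livecode
on mouseUp
  put the milliseconds into tStart
  repeat with i = 1 to 100000
    -- the work being timed would go here
  end repeat
  put the milliseconds - tStart into tElapsed
  -- 60 ticks per second, so ms * 60 / 1000 gives approximate ticks
  put tElapsed && "ms, about" && round(tElapsed * 60 / 1000) && "ticks"
end mouseUp
```

As with the ticks version, the result goes to the message box.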
Here is a summary of the contributions (in the reverse order they were
posted) and their timings and results on my G5-Intel Mac:
Mark Smith -- native syntax: 18 ticks; found 2 matches
Mark Smith -- filter: 18 ticks; found 2 matches
Mark Smith -- array: 15 ticks; found 2 matches
Dick Kriesel -- array: 8 ticks; found 1 match
John Craig -- regex: 242 ticks; found 2 matches
Dick Kriesel -- array: 8 ticks; found 2 matches
Brian Yennie -- arrays: 7 ticks; found 2 matches
Jim Ault -- filter: 5 ticks; found 5 matches (strings, not words)
Ken Ray -- regex: 15 ticks (first run), 8 ticks (subsequent runs); found 2 matches
Jacque Gay -- original Rev script: 4 ticks; found 2 matches
The last one in the list is the one I thought I had to replace with
something faster. It is simply this:
repeat for each line l in tFiles
  put url ("file:" & l) into tText
  repeat for each word w in pWords
    put w is among the words of tText into tMatch
    if tMatch = false then exit repeat
  end repeat
  if tMatch then put l & cr after tList
end repeat
And that is what surprised me -- that no tinkering with arrays, or
matchtext, or anything else was faster than the most straightforward
Revolution syntax. I had assumed this would take a long time, but in
fact it is the fastest way to do it (that I've seen so far, anyway).
We've mentioned this on the list before, but I guess I need to be hit on
the head with the facts occasionally, just to remind me how good we've
got it.
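The "strings, not words" note in the results comes down to the difference between a whole-word test and a substring test. Here is a minimal sketch of that difference; the sample sentence and the variable names are made up for the illustration and are not taken from any of the submitted scripts.

```livecode
on mouseUp
  put "The doghouse is listed." into tText
  -- whole-word test: "dog" is not one of the words here
  put "dog" is among the words of tText into tWordMatch -- false
  -- substring test: "dog" does appear inside "doghouse"
  put tText contains "dog" into tStringMatch -- true
  put "word match:" && tWordMatch & cr & "string match:" && tStringMatch
end mouseUp
```

Note that words are delimited by whitespace, so trailing punctuation counts as part of the word: "listed" is not among the words of that sentence either, because the word is "listed." with the period attached.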
Surprise. Rev wins out again.
--
Jacqueline Landman Gay | jacque at hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com