SC, Rev, and RB speed test
Brian Yennie
briany at qldlearning.com
Sat Apr 17 07:47:12 EDT 2004
Dave,
Good catch. Yeah, it could probably use a sweep that replaces everything
non-alphanumeric with spaces before it begins. Rev includes commas and
other punctuation as part of "words", e.g.:
word 1 of "this,that and those" is "this,that"
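
To make the sweep idea concrete, here is a rough sketch in Python rather than Transcript. It only approximates Rev's behavior: rev_style_words() is a made-up helper that splits on whitespace, so punctuation stays attached to a word, but it does not model Rev treating quoted text as a single word. normalize() is likewise a hypothetical name for the "replace everything non-alphanumeric with spaces" pass.

```python
import re

def rev_style_words(text):
    # Rough stand-in for whitespace-delimited word chunks:
    # punctuation stays glued to the neighboring word.
    return text.split()

def normalize(text):
    # Replace each run of non-alphanumeric characters with one space,
    # then trim the ends.
    return re.sub(r"[^A-Za-z0-9]+", " ", text).strip()

sample = "this,that and those"
print(rev_style_words(sample)[0])           # this,that
print(rev_style_words(normalize(sample)))   # ['this', 'that', 'and', 'those']
```

The same pass would have to run over both the target text and the search strings so they agree on what a word is. Note that it also splits contractions ("can't" becomes "can t"), which may or may not be what you want.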
> I like this. It cuts the speed by half on my machine. But it has a
> problem with word definition between the target text and the search
> strings. Quoted text in the target text is treated as a single word.
> That can be fixed by just replacing quotes with empty. But even with
> that it was missing a couple of matches. I assume this is either where
> a single character word is followed by punctuation in the target text.
> (e.g. "I,") or where there are quotes in the search string. I'm not
> sure if this can be dealt with easily without making big assumptions
> about the search strings.
Agreed. And on the flip side, offset(), "contains", or "is in" won't
give you good word matches either: "I think I can" will happily match
inside "I think I cannot", and the only way to detect that is to solve
the same word boundary problem...
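
Here is the substring problem as a small Python sketch (again not Transcript, and not the script under discussion). It assumes words are runs of letters and digits and that matching is case-insensitive; words() and contains_phrase() are hypothetical helper names.

```python
import re

def words(text):
    # Assumption: a word is a run of letters/digits; everything else separates.
    return re.sub(r"[^A-Za-z0-9]+", " ", text).lower().split()

def contains_phrase(target, phrase):
    # True only if the phrase's words appear as a consecutive run of
    # whole words in the target.
    t, p = words(target), words(phrase)
    if not p:
        return False
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))

target = "I think I cannot"
print("I think I can" in target)                    # True  (plain substring test)
print(contains_phrase(target, "I think I can"))     # False (whole words only)
print(contains_phrase(target, "think I cannot"))    # True
```

The plain substring test accepts the partial word "can" inside "cannot"; comparing word lists avoids that, at the cost of having to decide up front what counts as a word.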
I guess despite all of its wonderful word chunking abilities, Rev
still isn't a full-text parser and indexer!
- Brian