wordOffset, repeat loop, speed?

J. Landman Gay jacque at hyperactivesw.com
Mon Jan 6 00:32:01 EST 2003


Mark Brownell <gizmotron at earthlink.net> wrote:

> Would using "wordOffset(wordToFind,stringToSearch[,wordsToSkip])" in a
> repeat loop, to create a list of numerical values for reoccurring instances
> of the same word found from within the whole chunk of text, be the fastest
> way to get a list of these words?

That would be one way, but I don't think it would be the fastest. 
This same word-counting algorithm is one of the examples that 
ship with MetaCard (on which Revolution is based). Here's 
the page from the documentation:

****
This example demonstrates the use of associative arrays in the MetaTalk 
language.  The script parses a text file, counts the occurrences of each 
word, and displays the result in a field.

on mouseUp
   put empty into field "result"
   answer file "Select a text file for input:"
   if it is empty then exit mouseUp
# let user know we're working on it
   set the cursor to watch
   put it into inputFile
   open file inputFile for read
   read from file inputFile until eof
   put it into fileContent
   close file inputFile
# wordCount is an associative array, its indexes are words
# with the contents of each element being number of times
# that word appears
   repeat for each word w in fileContent
     add 1 to wordCount[w]
   end repeat
# copy all the indexes that are in the wordCount associative array
   put keys(wordCount) into keyWords
# sort the indexes -- keyWords holds one index (word) per line
   sort keyWords
   repeat for each line l in keyWords
     put l & tab & wordCount[l] & return after displayResult
   end repeat
   put displayResult into field "result"
end mouseUp

This is lightning fast. I know, because I once had a contest with Scott 
Raney to see who could write a faster script to solve this problem, and 
his won. It can do huge files in a couple of ticks.

-- 
Jacqueline Landman Gay         |     jacque at hyperactivesw.com
HyperActive Software           |     http://www.hyperactivesw.com



