Words Indexing strategies

Richard Gaskin ambassador at fourthworld.com
Fri Feb 12 09:21:31 EST 2010


Bernard Devlin wrote:

> However, it looks to me like the existing indexes don't contain enough
> information for you to calculate frequency of occurrence (a measure of
> relevance).

Once again, MetaCard to the rescue! :)

Raney included this little gem in MC's Examples stack. It makes a single 
pass over the text with "repeat for each" and tallies the words in an 
array, so it's blazing fast, able to make a frequency table for even 
large files in almost no time at all:

on mouseUp
   put empty into field "result"
   answer file "Select a text file for input:"
   if it is empty then exit mouseUp
   # let the user know we're working on it
   set the cursor to watch
   put it into inputFile
   open file inputFile for read
   read from file inputFile until eof
   put it into fileContent
   close file inputFile
   # wordCount is an associative array: its keys are words, and
   # each element holds the number of times that word appears
   repeat for each word w in fileContent
     add 1 to wordCount[w]
   end repeat
   # copy all the keys of the wordCount array, one per line
   put keys(wordCount) into keyWords
   # sort the keys alphabetically -- keyWords lists every word in the array
   sort keyWords
   repeat for each line l in keyWords
     put l & tab & wordCount[l] & return after displayResult
   end repeat
   put displayResult into field "result"
end mouseUp
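
Since the question was about relevance, the same table can be ordered by 
count instead of alphabetically. What follows is only a minimal sketch, 
not something from the Examples stack: the function name rankedWords and 
the variables pText, tCount and tResult are made up for illustration.

```livecode
# hypothetical helper: returns "word <tab> count" lines,
# most frequent words first
function rankedWords pText
   # same tallying loop as in the handler above
   repeat for each word w in pText
      add 1 to tCount[w]
   end repeat
   # build one "word <tab> count" line per key
   repeat for each key k in tCount
      put k & tab & tCount[k] & return after tResult
   end repeat
   delete last char of tResult  -- drop the trailing return
   # sort numerically on the count, largest first
   set the itemDelimiter to tab
   sort lines of tResult descending numeric by last item of each
   return tResult
end rankedWords
```

Assuming fileContent holds the text as in the handler above, it could be 
called with: put rankedWords(fileContent) into field "result"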


--
  Richard Gaskin
  Fourth World
  Rev training and consulting: http://www.fourthworld.com
  Webzine for Rev developers: http://www.revjournal.com
  revJournal blog: http://revjournal.com/blog.irv



