How to get word offset all instances of a string in a chunk of text?

Keith Clarke keith.clarke at me.com
Fri Aug 31 10:43:00 EDT 2018


Thanks Alex, HH & Jim for all the help & ideas.

Just to close out the thread with a solution for future reference: the code below extracts a list of unique words from a text source, cleaned up against a noise-word list. Each output line holds the word frequency, the word, and a comma-delimited string of the word's numbers (positions) within the original source.


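# Build unique words array: key = word, value = comma-prefixed list of word numbers
repeat for each trueWord W in tSource
   add 1 to tWordNum   # count every word so positions refer to the original source
   if tANoise[W] then next repeat   # skip noise words
   put comma & tWordNum after tAWords[W]
end repeat

# Convert unique words array to list, one line per word
repeat for each key K in tAWords
   put K && tAWords[K] & CR after tTemp
end repeat

# Prefix each line with the word frequency.
# Item 1 of tLine is the word itself, so the frequency is the item count minus 1.
repeat for each line tLine in tTemp
   put ((the number of items in tLine) - 1) & comma & tLine & cr after tWords
end repeat

# Most frequent words first
sort lines of tWords descending numeric by item 1 of each
put tWords into field "Words"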
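
For anyone reusing this: the code above assumes tANoise already exists as an array keyed by noise word, with a true value for each key. A minimal sketch of one way to build it, assuming the noise words sit one per line in a variable tNoiseList (a hypothetical name, not part of the code above):

```livecode
# Assumption: tNoiseList is a hypothetical variable holding one noise word per line
put empty into tANoise
repeat for each line tNoiseWord in tNoiseList
   if tNoiseWord is empty then next repeat   # ignore blank lines
   put true into tANoise[tNoiseWord]
end repeat
```

Array keys are matched case-insensitively unless the caseSensitive property is set to true, so "The" and "the" would hit the same noise entry by default.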


Thanks & regards,
Keith
