Trying to code the fastest algorithm...
Alex Tweedly
alex at tweedly.net
Sun Jun 5 16:14:31 EDT 2005
jbv wrote:
>
>yes, you're right. That was a typo.
>Actually my script is slightly more complex, and I had to strip
>off a couple of things to make the problem easier to explain...
>hence the typo...
>
>So here's a clean version of my code :
>
>put "" into newList
>repeat for each line j in mySentences
> put j into a
> replace " " with itemdelimiter in a
> put "" into b
>
> repeat for each item i in a
> get itemoffset(i,myReference)
> if it>0 then
> put it & itemdelimiter after b
> end if
> end repeat
> delete last char in b
>
> if b contains itemdelimiter then
> put b & cr after newList
> end if
> end repeat
>
>
>
>>I don't see anything here which "keeps only those sentences containing
>>more than 1 word" ?
>>
>>
>
>it's the test :
> if b contains itemdelimiter then
>which runs about 25% to 30% faster than
> if number of items of b > 1 then
>
>
cute !! very nice - I'll remember that one ....
>>Since you're asking for "fastest" algorithm, I assume at least one of
>>these numbers is large - and it's unlikely the same algorithm would
>>excel at both ends of the spectrum .....
>>
>>I have 2 or 3 approaches in mind ... any clues you have on the
>>characteristics of the data would help decide which ones to pursue .....
>>e.g. build an assoc array of the words which occur in myReferences,
>>where the content of the array element is the word number within the
>>references, and lookup each word in the sentence with that ... should be
>>faster than "itemOffset" but probably not enough to justify it if the
>>number of words in myReferences is very small.
>>
>>
>>
>
>The "array" approach crossed my mind, but I'm afraid it's not feasable :
>the set of reference words is unpredictable. It is actually derived from a
>sentence entered by the end user, and after some processing / selection
>a set words is extracted from that sentence and then compared with other
>sentences in a data base...
>
>
>
Still feasible ....( typing straight into email - typos possible )
put empty into myArray
put 0 into count
repeat for each item W in myReference
add 1 to count
put W && count & cr after myArray
end repeat
split myArray by CR and space
I'll try some simple experiments based on the figures you gave.... more
later.
--
Alex Tweedly http://www.tweedly.net
-------------- next part --------------
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.323 / Virus Database: 267.6.2 - Release Date: 04/06/2005
More information about the use-livecode
mailing list