deleteDups() -- split is faster

Geoff Canyon gcanyon at inspiredlogic.com
Wed Nov 16 16:17:40 EST 2005


'doh! -- I made a mistake -- see below:

On Nov 16, 2005, at 10:03 AM, HyperChris at aol.com wrote:

> Thanks Geoff. Actually, I am getting split as four times faster
> than 'for each' (under v2.6.0 & 2.6.1)


On Nov 16, 2005, at 10:51 AM, Mark Smith wrote:
> I'm getting the split version as about 10% faster than 'repeat for  
> each'

If you're using Eric's version, this line is hurting your performance:

      if tLine is not among the lines of tStrippedList

This line is faster, and doesn't degrade as quickly with larger lists,
since an array lookup doesn't have to scan the whole stripped list on
every pass:

      put 1 into y[L]

That said, I re-did the test with the actual functions:

function deleteDups pList -- does _not_ retain input order
   split pList by cr and numToChar(3)
   return keys(pList)
end deleteDups

function deleteDupes pList -- does _not_ retain input order
   repeat for each line L in pList
     put 1 into x[L]
   end repeat
   return the keys of x
end deleteDupes

I found that variable initialization is a big factor, and one that a
simple "do this 100 times" repeat loop doesn't take into account.
Putting both of the above functions to the test, I get much the same
result as Mark Smith: the split version was about 20% faster for me,
at 9 ticks vs. 11 ticks.
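
For anyone who wants to repeat the comparison, here is a rough sketch
of a timing handler. It is not the exact test I ran: the list size,
the range of random values, and the 100 iterations are made-up
numbers, and it assumes deleteDups and deleteDupes are in the same
script. Calling the functions themselves inside the loop means each
pass pays for setting up its own variables.

```livecode
on mouseUp
   -- build a hypothetical test list: 10000 lines with many duplicates
   repeat 10000 times
      put random(500) & cr after tList
   end repeat
   delete the last char of tList -- drop the trailing return

   -- time the split version
   put the ticks into tStart
   repeat 100 times
      get deleteDups(tList)
   end repeat
   put the ticks - tStart into tSplitTicks

   -- time the 'repeat for each' version
   put the ticks into tStart
   repeat 100 times
      get deleteDupes(tList)
   end repeat
   put the ticks - tStart into tLoopTicks

   -- report both timings in the message box
   put "split:" && tSplitTicks & cr & "repeat for each:" && tLoopTicks
end mouseUp
```

The absolute numbers will depend on the machine and the data, so only
the ratio between the two results is worth comparing.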

If you want to retain the input order, this took about 14 ticks:

function deDupe pList -- retains input order
   repeat for each line L in pList
     if x[L] is empty then put L & cr after tReturn
     put 1 into x[L]
   end repeat
   return char 1 to -2 of tReturn
end deDupe

If you are certain that the duplicates are sequential, this takes  
only 5 ticks:

function deDupe2 pList -- retains input order, assumes dupes are sequential
   put empty into tLast
   repeat for each line L in pList
     if L is tLast then next repeat
     put L & cr after tReturn
     put L into tLast
   end repeat
   return char 1 to -2 of tReturn
end deDupe2
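
To show why the "sequential" assumption matters, here is a small
sketch that runs both order-preserving functions on the same list.
The four-line sample list is made up for illustration, and the
handler assumes deDupe and deDupe2 are in the same script.

```livecode
on mouseUp
   -- hypothetical sample: the third "a" is not next to the first two
   put "a" & cr & "a" & cr & "b" & cr & "a" into tList

   -- deDupe remembers every line it has seen, so it returns: a, b
   put deDupe(tList) into tAll

   -- deDupe2 only compares with the previous line, so it returns: a, b, a
   put deDupe2(tList) into tSequential

   -- show both results in the message box, separated by a marker line
   put tAll & cr & "--" & cr & tSequential
end mouseUp
```

If the duplicates in your data can turn up anywhere in the list, use
deDupe and accept the extra ticks.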

regards,

Geoff




