deleteDups() -- split is faster
Geoff Canyon
gcanyon at inspiredlogic.com
Wed Nov 16 16:17:40 EST 2005
'doh! -- I made a mistake -- see below:
On Nov 16, 2005, at 10:03 AM, HyperChris at aol.com wrote:
> Thanks Geoff. Actually, I am getting split as four times faster
> than 'for each' (under v2.6.0 & 2.6.1)
On Nov 16, 2005, at 10:51 AM, Mark Smith wrote:
> I'm getting the split version as about 10% faster than 'repeat for
> each'
If you're using Eric's version, this line is hurting your performance:
if tLine is not among the lines of tStrippedList
This line is faster, and doesn't degrade as quickly with larger lists:
put 1 into y[L]
That said, I re-did the test with the actual functions:
function deleteDups pList -- does _not_ retain input order
  split pList by cr and numToChar(3)
  return keys(pList)
end deleteDups
function deleteDupes pList -- does _not_ retain input order
  repeat for each line L in pList
    put 1 into x[L]
  end repeat
  return the keys of x
end deleteDupes
I found that variable initialization is a big factor, and it isn't
taken into account in a simple "do this 100 times" repeat loop.
Putting both of the above functions to the test, I get much the same
result as Mark Smith: the split command was about 20% faster for me,
9 ticks vs. 11 ticks.
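For anyone who wants to repeat the comparison, here is a minimal sketch of a timing handler. It assumes deleteDups and deleteDupes are in the same script (a button script, say). The test data, the variable names, and the repeat count of 100 are made up for illustration; they are not the values behind the tick counts quoted here. Because each pass calls the function itself, every call starts with fresh variables, so initialization cost is included in the measurement.

```livecode
on mouseUp
  -- hypothetical test data: 10000 lines drawn from 1000 distinct values
  repeat with i = 1 to 10000
    put random(1000) & cr after tList
  end repeat
  delete the last char of tList

  -- time the split version
  put the ticks into tStart
  repeat 100 times
    get deleteDups(tList)
  end repeat
  put the ticks - tStart into tSplitTicks

  -- time the 'repeat for each' version
  put the ticks into tStart
  repeat 100 times
    get deleteDupes(tList)
  end repeat
  put the ticks - tStart into tRepeatTicks

  answer "split:" && tSplitTicks && "ticks" & cr & \
      "repeat for each:" && tRepeatTicks && "ticks"
end mouseUp
```

Results will vary with the machine, the engine version, and how many duplicates the list contains, so treat any numbers from this as relative only.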
If you want to retain the input order, this took about 14 ticks:
function deDupe pList -- retains input order
  repeat for each line L in pList
    if x[L] is empty then put L & cr after tReturn
    put 1 into x[L]
  end repeat
  return char 1 to -2 of tReturn
end deDupe
If you are certain that the duplicates are sequential, this takes
only 5 ticks:
function deDupe2 pList -- retains input order, assumes dupes are sequential
  put empty into tLast
  repeat for each line L in pList
    if L is tLast then next repeat
    put L & cr after tReturn
    put L into tLast
  end repeat
  return char 1 to -2 of tReturn
end deDupe2
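A small usage sketch may make the difference between the two order-preserving functions clearer. It assumes deDupe and deDupe2 are in the same script; the sample list and the handler name testDeDupe are invented for illustration.

```livecode
on testDeDupe
  -- hypothetical sample: "a" is repeated, but not only on adjacent lines
  put "a" & cr & "a" & cr & "b" & cr & "a" into tList

  -- deDupe remembers every line it has seen, so the later "a" is dropped
  put deDupe(tList) into tAll      -- lines: a, b

  -- deDupe2 only compares each line with the one before it,
  -- so the non-adjacent "a" survives
  put deDupe2(tList) into tAdjacent   -- lines: a, b, a

  answer tAll & cr & "---" & cr & tAdjacent
end testDeDupe
```

So deDupe2 is only a safe substitute when the list is already sorted or otherwise grouped so that equal lines are adjacent.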
regards,
Geoff