deleteDups()

Alex Tweedly alex at tweedly.net
Wed Nov 16 12:44:03 EST 2005


Geoff Canyon wrote:

>
> Bother -- it happened again. First we had repeat for each turning up  
> faster than the filter command. Now I've done a test on the  
> following, and it looks like the split command takes not quite twice  
> as long as repeat for each (when repeat for each is handled the  
> following way).
>
> function deleteDupes pList -- does _not_ retain sort order
>   repeat for each line L in pList
>     put 1 into temp[L]
>   end repeat
>   return the keys of temp
> end deleteDupes
>
>
I wonder if this is because the "split" method uses a secondary 
delimiter (numToChar(3)) which doesn't exist in the data. (That's not a 
fault in the choice of secondaryDelimiter: you have to pick one that 
isn't in the data.)

So the effort involved in the "split" is

   search for the next lineDelimiter
   search within that line for the secondaryDelimiter
            (which isn't there, so the search has to scan all the way
            through each line)
   create the array element

(i.e. every character is scanned twice).

In your "repeat for each ..." version, there is only a single search 
(for the lineDelimiter), so you're doing basically half as much 
searching, plus a little bit of loop overhead.
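
For anyone who wants to try the comparison, here is a rough sketch of the two approaches side by side. The split-based handler is not quoted in this message, so deleteDupesSplit is my guess at what it looked like; the handler name, the test data (100000 short numeric lines) and the mouseUp harness are all assumptions made for illustration, and the timings will depend heavily on line length and on the engine version.

```livecode
-- The loop version, as quoted above.
function deleteDupes pList -- does _not_ retain sort order
   repeat for each line L in pList
      put 1 into temp[L]
   end repeat
   return the keys of temp
end deleteDupes

-- ASSUMED reconstruction of the split version (hypothetical name).
-- numToChar(3) is assumed not to occur anywhere in pList, so each
-- whole line becomes a key with an empty value.
function deleteDupesSplit pList
   split pList by return and numToChar(3)
   return the keys of pList
end deleteDupesSplit

-- Hypothetical timing harness, e.g. in a button script.
on mouseUp
   local tData, tStart, tLoopTime, tSplitTime
   -- build test data with plenty of duplicate lines
   repeat with i = 1 to 100000
      put random(1000) & return after tData
   end repeat
   delete the last char of tData

   put the milliseconds into tStart
   get deleteDupes(tData)
   put the milliseconds - tStart into tLoopTime

   put the milliseconds into tStart
   get deleteDupesSplit(tData)
   put the milliseconds - tStart into tSplitTime

   put "repeat for each:" && tLoopTime && "ms" & return & \
         "split:" && tSplitTime && "ms"
end mouseUp
```

If the double-scan explanation is right, the gap between the two should widen as the lines get longer, so it is probably worth repeating the test with longer lines than the ones generated here.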

That fits the data (given huge dollops of hindsight :-), though it is 
still surprising and maybe disappointing.


-- 
Alex Tweedly       http://www.tweedly.net






