Finding duplicates in a list

Bill Marriott wjm at wjm.org
Wed Jan 9 06:33:34 EST 2008


on mouseUp

  -- Load the checksum list (one entry per line); adjust the path as needed
  put url "file:D:/desktop/dupeslist.txt" into tList
  set the itemDelimiter to tab
  put the milliseconds into tt

  -- Pass 1: group line numbers by entry, using the entry itself as the array key
  put 0 into n
  repeat for each line tCheck in tList
    add 1 to n
    put n & tab & tCheck & return after tCheckArray[tCheck]
  end repeat

  put empty into tListResult

  -- Pass 2: any key holding more than one line is a duplicate; for each extra
  -- occurrence, output first line number, duplicate line number, and the entry
  repeat for each key theKey in tCheckArray
    if the number of lines in tCheckArray[theKey] > 1 then
      repeat with i = 2 to the number of lines in tCheckArray[theKey]
        put item 1 of line 1 of tCheckArray[theKey] & tab & \
            item 1 of line i of tCheckArray[theKey] & tab & \
            theKey & return after tListResult
      end repeat
    end if
  end repeat

  -- Report elapsed time, the total line count, and the duplicates found
  put the milliseconds - tt & return & "number of files:" && \
      the number of lines in tList & return & return & tListResult
end mouseUp


This takes 64 milliseconds on my computer, versus 5023 for your handler.
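
For anyone without the sample file to hand, here is a minimal sketch of the same idea as a self-contained function. It is a variant, not the handler above: it remembers only the first line number seen for each entry, so it reports duplicates in a single pass and in input order. The name listDuplicates and the sample values are made up for illustration.

```livecode
-- Sketch only: listDuplicates is a hypothetical name, not part of the handler above.
-- Returns one line per repeated entry: first line number, tab,
-- duplicate line number, tab, the entry itself.
function listDuplicates pList
  local tSeen, tResult, tLineNum
  put 0 into tLineNum
  repeat for each line tEntry in pList
    add 1 to tLineNum
    -- skip blank lines, but keep counting so numbers match the input
    if tEntry is empty then next repeat
    if tSeen[tEntry] is empty then
      -- first occurrence: remember where it was seen
      put tLineNum into tSeen[tEntry]
    else
      -- later occurrence: report it against the first one
      put tSeen[tEntry] & tab & tLineNum & tab & tEntry & return after tResult
    end if
  end repeat
  delete the last char of tResult -- drop the trailing return
  return tResult
end listDuplicates

on mouseUp
  local tSample
  -- made-up sample data: "aaa" appears three times, "bbb" twice
  put "aaa" & return & "bbb" & return & "aaa" & return & \
      "ccc" & return & "bbb" & return & "aaa" into tSample
  put listDuplicates(tSample)
end mouseUp
```

With the sample above, the message box should show three tab-separated lines: 1, 3, aaa; then 2, 5, bbb; then 1, 6, aaa. Because results are appended while walking the input, the order does not depend on the order in which array keys are returned.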



"Ian Wood" <revlist at azurevision.co.uk> wrote 
in message news:5461718F-FCCB-45BA-804C-4D4F4DB3A089 at azurevision.co.uk...
> Hi Bill,
>
> <http://azurevision.co.uk/rev/dupeslist.txt>
>
> Yes, I wish to end up with a list of duplicates, not a list of uniques.
>
> Ian
>
> On 9 Jan 2008, at 08:57, Bill Marriott wrote:
>
>> Hi Ian,
>>
>> I have a couple ideas... Would you be able to upload a sample list
>> somewhere -- the result of ijwAPLIB_getAllChecksums() -- so we could try
>> it out and measure? Also, just want to double-check that you intend to
>> produce a list of duplicate items, not a list of uniques.
>>
>> - Bill
>>
>> "Ian Wood" <revlist at azurevision.co.uk> wrote
>> in message news:3E1F0603-98FC-4F54-8259-E56C048FF3C9 at azurevision.co.uk...
>>> The problem - trying to find duplicate files in a database (Apple
>>> Aperture), and have found a checksum column for all the image files.
>>>
>>> I've had a go at writing a handler to find the dupes and it does OK,
>>> but wondered if the bright sparks on the list have any advice on
>>> speeding it up...
>>>
>>> [snip]





