Counting and numbering duplicates in a list

Jan Schenkel janschenkel at yahoo.com
Thu Sep 29 00:55:33 EDT 2011


Ah, I like a challenge while I'm eating corn flakes :-)
This one works pretty well on my old iMac G4 (1 millisecond for the small sample you gave us)

##
on mouseUp
   local tInputList, tOutputList
   put the text of field "Input List" into tInputList
   local tStartTime, tEndTime
   put the milliseconds into tStartTime
   Dupend tInputList, tOutputList
   put the milliseconds into tEndTime
   put tOutputList into field "Output List"
   answer "Dupending took" && (tEndTime - tStartTime) && "milliseconds"
end mouseUp

command Dupend @pInput, @pOutput
   local tLineDA, tLine
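   -- first pass: count how often each line occurs
   repeat for each line tLine in pInput
      add 1 to tLineDA[tLine]["found"]
   end repeat
   put empty into pOutput
   -- second pass: number only the lines that occur more than once
   repeat for each line tLine in pInput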
      if tLineDA[tLine]["found"] = 1 then
         put tLine & return after pOutput
      else
         add 1 to tLineDA[tLine]["current"]
         put tLine & "-" & tLineDA[tLine]["current"] & return after pOutput
      end if
   end repeat
   delete char -1 of pOutput
end Dupend
##
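
To see how this scales beyond the small sample, a larger test list is handy. Here is a minimal sketch, assuming a hypothetical helper named makeTestList (my own name, not part of the scripts in this thread). It builds pCount lines of random integers between 1 and pRange, so a smaller pRange gives more duplicates.

##
-- hypothetical helper for timing tests only
function makeTestList pCount, pRange
   local tList
   repeat pCount times
      -- random(n) returns an integer between 1 and n
      put random(pRange) & return after tList
   end repeat
   -- remove the trailing return
   delete char -1 of tList
   return tList
end makeTestList
##

For example, swapping the field read in mouseUp for "put makeTestList(100000, 50000) into tInputList" would time Dupend on a 10^5 line list. Each pass does one array lookup per line, so the time should grow roughly in step with the length of the list, though I have not measured that here.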

HTH,

Jan Schenkel.
=====
Quartam Reports & PDF Library for LiveCode
www.quartam.com


=====
"As we grow older, we grow both wiser and more foolish at the same time."  (La Rochefoucauld)


----- Original Message -----
From: "Peter M. Brigham, MD" <pmbrig at gmail.com>
To: How to use LiveCode <use-livecode at lists.runrev.com>
Cc: 
Sent: Thursday, September 29, 2011 5:55 AM
Subject: Re: Counting and numbering duplicates in a list

Just timed the script below using a 10000 (10^4) line list -- 1.964 seconds. Not great if you're dealing with >= 10^5 items. Can someone do better?

-- Peter

Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig

> function flagDupes tList
>   put tList into scratchList
>   repeat for each line t in tList
>      -- drop the current line first so only the lines after it are searched
>      delete line 1 of scratchList
>      if t is among the lines of scratchList or \
>             freqArray[t] > 0 then
>         add 1 to freqArray[t]
>         put cr & t & "-" & freqArray[t] after outputList
>      else
>         put cr & t after outputList
>      end if
>   end repeat
>   delete char 1 of outputList
>   return outputList
> end flagDupes
> 
> I *think* this should be fast with large lists.

On Sep 28, 2011, at 10:52 PM, Roger Eller wrote:

> There are several ways I could approach this, but I'm unsure which way is
> best?  I have a list of numbers that 'may' contain duplicates.  I need to
> sequence ONLY the duplicates without changing the order of the list. If there
> is only one, it does not need to be sequenced.
> 
> Should I just repeat, and keep the content of line x in a variable, then add
> 1 to a sequence variable if the number is encountered again?  Is there a
> better way?  Simple stuff, I know, but these lists can be really long, and I
> want it to process as quickly as possible.
> 
> 12345
> 12345
> 12344
> 12333
> 10112
> 12333
> 
> must become:
> 
> 12345-1
> 12345-2
> 12344
> 12333-1
> 10112
> 12333-2
> 
> ~Roger
