Counting and numbering duplicates in a list
Peter M. Brigham, MD
pmbrig at gmail.com
Wed Sep 28 23:55:08 EDT 2011
I just timed the script below on a 10,000-line (10^4) list: 1.964 seconds. That's not great if you're dealing with >= 10^5 items. Can someone do better?
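One possible way to do better, offered as an untested sketch rather than a measured result: the "is among the lines of" test rescans the remaining list for every item, and "delete line 1" most likely has to shift the rest of the list each time, so the work grows roughly with the square of the list length. Counting every value into an array first, then numbering in a second pass, touches each line only twice. The handler name flagDupesTwoPass and its variable names are my own, hypothetical choices.

```livecode
function flagDupesTwoPass tList
   local tCount, tSeen, tOutput
   -- pass 1: count how many times each value occurs
   repeat for each line t in tList
      add 1 to tCount[t]
   end repeat
   -- pass 2: number only the values that occur more than once
   repeat for each line t in tList
      if tCount[t] > 1 then
         add 1 to tSeen[t]
         put t & "-" & tSeen[t] & cr after tOutput
      else
         put t & cr after tOutput
      end if
   end repeat
   -- remove the trailing cr
   delete char -1 of tOutput
   return tOutput
end flagDupesTwoPass
```

With the sample list from the original question, this should return 12345-1, 12345-2, 12344, 12333-1, 10112, 12333-2, one per line. It assumes the list has no blank lines; I have not checked how an empty array key behaves.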
-- Peter
Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig
> function flagDupes tList
>    -- scratchList holds the lines that have not been visited yet
>    put tList into scratchList
>    repeat for each line t in tList
>       -- drop the current line first so t is only compared with later lines
>       delete line 1 of scratchList
>       if t is among the lines of scratchList or \
>             freqArray[t] > 0 then
>          add 1 to freqArray[t]
>          put cr & t & "-" & freqArray[t] after outputList
>       else
>          put cr & t after outputList
>       end if
>    end repeat
>    -- remove the leading cr
>    delete char 1 of outputList
>    return outputList
> end flagDupes
>
> I *think* this should be fast with large lists.
On Sep 28, 2011, at 10:52 PM, Roger Eller wrote:
> There are several ways I could approach this, but I'm unsure which way is
> best? I have a list of numbers that 'may' contain duplicates. I need to
> sequence ONLY the duplicates without changing the order of the list. If there
> is only one, it does not need to be sequenced.
>
> Should I just repeat, and keep the content of line x in a variable, then add
> 1 to a sequence variable if the number is encountered again? Is there a
> better way? Simple stuff, I know, but these lists can be really long, and I
> want it to process as quickly as possible.
>
> 12345
> 12345
> 12344
> 12333
> 10112
> 12333
>
> must become:
>
> 12345-1
> 12345-2
> 12344
> 12333-1
> 10112
> 12333-2
>
> ~Roger