There should be a "unique" option on sort . . .

Peter Haworth pete at lcsql.com
Mon Jan 6 13:35:57 EST 2014


Hi Richard,
It's not so much the arithmetic operation or putting a value into the array
element; it's the repeat loop itself, which I don't think is needed.  It can
be replaced with "split pData by return and return".  That one line turns
the list into an array keyed by line, which gets rid of the duplicates.  So
the function would look like this:

function UniqueListFromArray2 pData
   local tKeys
   split pData by return and return
   put the keys of pData into tKeys
   sort lines of tKeys
   return tKeys
end UniqueListFromArray2

I came up with that because the OP mentioned that an engine-embedded way to
de-dup would be faster than any user-written handler, which seems like a
reasonable assumption.

Pretty academic really, because it seems like all the solutions work
acceptably fast, but nevertheless interesting to compare.
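
For anyone who wants to compare for themselves, a timing harness along these
lines should do.  This is only a sketch: the handler name, the test data
(100000 lines of random numbers below 1000) and the 10 iterations are
assumptions chosen to resemble the runs quoted below, not the script that was
actually used.

```livecode
on TimeUniqueList
   local tData, tStart, tResult
   -- build hypothetical test data: 100000 short lines with many duplicates
   repeat 100000 times
      put random(999) & return after tData
   end repeat
   delete the last char of tData -- drop the trailing return

   -- time 10 calls of the function under test
   put the milliseconds into tStart
   repeat 10 times
      put UniqueListFromArray2(tData) into tResult
   end repeat

   -- report elapsed time and the size of the de-duplicated list
   put (the milliseconds - tStart) && "ms for 10 iterations," && \
         the number of lines of tResult && "unique lines"
end TimeUniqueList
```

Swapping in a different function name on the timed line lets the same data
be run through each variant.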

Pete
lcSQL Software <http://www.lcsql.com>


On Mon, Jan 6, 2014 at 9:22 AM, Richard Gaskin
<ambassador at fourthworld.com> wrote:

> Peter Haworth wrote:
>
>> Your handler to use an array isn't quite the same as mine. You have a
>> repeat loop adding 1 to the key to create the array whereas I used
>> combine.
>> Be interesting to see if that makes any significant difference.
>>
>
> Good catch.  The arithmetic operation isn't needed there (I just blindly
> copied-and-pasted from some old MetaCard stuff and revised it only to use
> my var names), so conceivably it could be slightly faster to omit the
> arithmetic and just put empty into that array value.
>
> Or so it would seem....
>
> I couldn't resist trying this, so I just ran another test with two array
> functions, one being the original and the other replacing this:
>
>       add 1 to tA[tLine]
>
> ...with this:
>
>       put empty into tA[tLine]
>
> Here are the results:
>
> 10 iterations on 100000 lines of 5 or fewer chars:
> Non-Arithmetic Array: 499 ms (49.9 ms per iteration)
> Original Array: 470 ms (47 ms per iteration)
> Results match - Each list has 996 lines
>
>
> I'm guessing this has to do with a certain amount of overhead associated
> with strings.
>
> So then I tried another variant, in which it puts a value that has been
> coerced to a number into the array slot:
>
> function UniqueListFromArray2 pData
>    put 0+0 into y
>    repeat for each line tLine in pData
>       put y into tA[tLine]
>    end repeat
>    put the keys of tA into tKeys
>    sort lines of tKeys
>    return tKeys
> end UniqueListFromArray2
>
> And the results:
>
> 10 iterations on 100000 lines of 5 or fewer chars:
> "Add 1" Array: 479 ms (47.9 ms per iteration)
> "Put Y" Array: 530 ms (53 ms per iteration)
> Results match - Each list has 993 lines
>
> Hmmm.....
>
> --
>  Richard Gaskin
>  Fourth World
>  LiveCode training and consulting: http://www.fourthworld.com
>  Webzine for LiveCode developers: http://www.LiveCodeJournal.com
>  Follow me on Twitter:  http://twitter.com/FourthWorldSys


