Intersecting data question/challenge

Buster wouter.abraham at scarlet.be
Fri Jul 8 22:57:28 EDT 2005


On 09 Jul 2005, at 02:26, Raymond E. Griffith wrote:

> Dennis,
>
> I have a suggestion. It isn't perfect, but it does appear to be  
> relatively
> fast.
>
> I start by creating return-delimited lists. The lists have 5000 and  
> 2000 elements, respectively, although due to repeats there are  
> significantly fewer customkeys.
>
> One problem is that you cannot set the keys of a variable directly.  
> You can,
> however, set the customkeys of an object directly, then put those
> customproperties into a variable.
>
> Then use intersect.
>
> As I said, this appears to me to be relatively fast.
>
> on mouseUp
>   put 5000 into n1
>   put 2000 into n2
>   repeat with i = 1 to n1
>     put random(10000) & cr after A
>   end repeat
>   repeat with i = 1 to n2
>     put random(10000) & cr after B
>   end repeat
>   put the long milliseconds into ms
>   set the customkeys of fld "LA" to A
>   set the customkeys of fld "LB" to B
>   put the customproperties of fld "LA" into arrA
>   put the customproperties of fld "LB" into arrB
>   intersect arrA with arrB
>   answer keys(arrA) & return & "___" & the long milliseconds - ms
> end mouseUp
>
> Perhaps someone can try comparing this idea with others for time  
> trials?
>
> I hope this helps.
>
> Raymond E. Griffith

> Raymond,
>
> Good idea.  It does get around the need to iterate to get the  
> keys.  Unfortunately, that operation seems to be very slow in Rev.   
> If I use the data from the previous tests, and do everything  
> starting with the lists, it is 10 times slower than my first  
> example.  If I save the constant array first, it is 5 times  
> slower.  If I save both arrays and only get the  
> customProperties to the array in the timing loop, it is still twice  
> as slow, which is 17 times slower than the fastest way.
>
> So you met the challenge of no loops --good job.  But the Rev  
> setting customKeys and getting customProperties seems to be much  
> slower than any other operations tested.  They must be using the  
> crawl method for those operations.
>
> Dennis




Hi Raymond, Dennis and everybody else,


The way proposed by Dennis is indeed the fastest on data sets that are  
not too large.
So it is only fair to test the other way around too, and try Dennis's  
proposal on the same amount of data that Raymond used for his handler.
Raymond's handler is a neat trick, though it is at least 2 times  
slower than a replace + split method.

I adapted Raymond's method slightly to produce a "one-button  
copy-paste script" test for comparison:

on mouseUp
   ### filling the vars
   repeat 5000
     put random(10000) & cr after A
   end repeat
   repeat 2000
     put random(10000) & cr after B
   end repeat
   put A into x
   put B into y
   ### custom prop method
   put the long millisecs into zap
   set the customkeys of me to A
   put the customproperties of me into arrA
   set the customkeys of me to B
   put the customproperties of me into arrB
   intersect arrA with arrB
   put keys(arrA) into tKeys1
   put the long millisecs - zap into time1
   set the customkeys of me to ""
   ### replace split method
   put the long millisecs into zap
   replace cr with tab & cr in A
   split A with cr and tab
   replace cr with tab & cr in B
   split B with cr and tab
   intersect A with B
   put keys(A) into tKeys2
   put the long millisecs - zap into time2
   ### repeat for each + is not among method
   replace cr with comma in x
   replace cr with comma in y
   put the long millisecs into zap
   repeat for each item i in x
     if i is not among the items of y then put i & cr after tList
   end repeat
   put the long millisecs - zap into time3
   put time1 &cr& time2 & cr & time3
end mouseUp
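
For use outside a timing test, the replace + split method could be  
wrapped in a function along these lines. This is only a sketch: the  
name commonLines and its parameter names are made up for illustration,  
and it assumes both lists are return-delimited, end with a return (as  
the test data above does), and contain no tabs. Note that it returns  
the values common to both lists, whereas the third loop above, as  
written, collects the items of x that are not in y.

```
function commonLines pListA, pListB
   -- hypothetical helper, not part of the test handler above
   -- append a tab to each line so split sees a key with an empty value
   replace cr with tab & cr in pListA
   split pListA with cr and tab
   replace cr with tab & cr in pListB
   split pListB with cr and tab
   -- keep only the keys of pListA that also exist in pListB
   intersect pListA with pListB
   return keys(pListA)
end commonLines
```

It would be called as "put commonLines(A, B) into tCommon". The keys  
come back in no particular order, so sort the result if order matters.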


Greetings,
Wouter



