Chunks vs Arrays - surprising benchmarking results

Paul Looney support at ahsomme.com
Fri Aug 7 13:51:51 EDT 2009


Richard,
Very true, especially regarding the limitations of the filter command.
The multiple search works best with hardwired searches and searches
where the user selects from a series of pop-up menus or radio buttons.
It works well when you are generating canned lists.
PL
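
A minimal sketch of the double selection described in the quoted message below, for the hardwired case. It is written in Python only as a stand-in for Rev, and the sample records and the function names (compound_select, two_pass_select) are hypothetical. It shows that both approaches return the same rows; whether the two-pass version is actually faster depends on the engine and the data, so it should be timed on real data.

```python
# Hypothetical sample data: one (name, sex) record per person.
records = [
    ("Jan", "F"), ("Bob", "M"), ("Jan", "M"), ("Ann", "F"),
    ("Sue", "F"), ("Jan", "F"), ("Tom", "M"), ("Eve", "F"),
]

def compound_select(rows, name, sex):
    """Single pass: test both criteria on every row."""
    return [r for r in rows if r[0] == name and r[1] == sex]

def two_pass_select(rows, name, sex):
    """Two passes: the selective test first, then the broad test."""
    by_name = [r for r in rows if r[0] == name]  # small intermediate list
    return [r for r in by_name if r[1] == sex]   # scans only the name matches

one_pass_hits = compound_select(records, "Jan", "F")
two_pass_hits = two_pass_select(records, "Jan", "F")
```

The order of the passes matters: the name test runs first because it is assumed to eliminate most of the records, which only holds when the data set is known in advance.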

On Aug 7, 2009, at 10:22 AM, Richard Gaskin wrote:

> Paul Looney wrote:
>> I have nothing to add directly to the chunk vs array discussion
>> (Trevor's reply was very good), but I have often been able to
>> increase the speed of compound selections by breaking them
>> into individual ones.
>> For instance, suppose you have a large database of names and sexes
>> and you want to select every female named "Jan" ("Jan" could be
>> male or female).
>> Select all of the Jans first (this will run much faster than the
>> compound selection).
>> Then select all of the females from the result of the first
>> selection (this will run faster because it is searching only
>> "Jan"s - a very small list).
>> This double selection will run faster than a single compound
>> selection.
>> Obviously this requires a known data set where one filter will
>> eliminate a lot of records (selecting "female", then selecting
>> "Jan" would be much slower in our example because, presumably,
>> half of the list is female and only a small portion is Jan).
>> On many lists this can create a much bigger speed difference than
>> the chunk vs array variance you noted.
>
> One of the tough challenges with this sort of benchmarking is that  
> different methods will favor different test cases.
>
> But with delimited rows and columns, I haven't found a way to make  
> a two-pass search run faster than one pass, except in very  
> specialized cases as you noted.
>
> There's a temptation to use the filter command for the first pass,  
> but filter is only faster when testing the first few items;  
> filtering on the 10th item is much slower, and attempting to test  
> the 50th item in a sample data set caused Rev to hang.  RegEx is a  
> harsh mistress.
>
> In my case, I don't often know in advance which item will be  
> searched. The queries I'm running usually come from a Search dialog  
> in which the user can specify criteria.  I could make the search  
> function smart enough to special-case certain types of searches to  
> use a two-pass method in which the first pass is the filter command  
> where practical, but the overhead of analyzing both the query and  
> the data to make such determinations may detract from the benefits  
> of doing so, especially since my continued testing on this is
> increasingly nudging me toward multi-dimensional arrays anyway.
> Even with the data bloat and the surprising overhead of moving
> arrays in and out of storage, with a little extra work to deal with
> those, the performance of arrays seems unbeatable in the broadest
> range of use cases I've run thus far.
>
> --
>  Richard Gaskin
>  Fourth World
>  Revolution training and consulting: http://www.fourthworld.com
>  Webzine for Rev developers: http://www.revjournal.com
> _______________________________________________
> use-revolution mailing list
> use-revolution at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your  
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-revolution
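
To illustrate the point above about testing the Nth item, here is a rough sketch of the two data layouts being compared. It is Python rather than Rev, it does not reproduce Rev's filter command, and the sample data and helper names (nth_item_pattern, search_delimited, search_presplit) are hypothetical. With delimited rows, a pattern has to skip over the first n-1 items of every line before it can test the nth; with records split into a list of lists once up front, the test is a direct index.

```python
import re

# Hypothetical comma-delimited data: name, sex, city.
data = "Jan,F,Boston\nBob,M,Austin\nJan,M,Denver\nAnn,F,Boston"

def nth_item_pattern(n, value, delim=","):
    """Compile a regex matching lines whose n-th (1-based) item equals value."""
    d = re.escape(delim)
    # Skip n-1 "item + delimiter" groups, then require the value to be
    # followed by a delimiter or the end of the line.
    return re.compile(
        r"^(?:[^%s]*%s){%d}%s(?:%s|$)" % (d, d, n - 1, re.escape(value), d)
    )

def search_delimited(text, n, value):
    """Pattern-match every line of the delimited text."""
    pattern = nth_item_pattern(n, value)
    return [line for line in text.split("\n") if pattern.match(line)]

def search_presplit(rows, n, value):
    """Index directly into records that were split once in advance."""
    return [row for row in rows if row[n - 1] == value]

rows = [line.split(",") for line in data.split("\n")]
hits_text = search_delimited(data, 3, "Boston")
hits_rows = search_presplit(rows, 3, "Boston")
```

The sketch makes no timing claim; it only shows why the cost of the pattern approach grows with the item position while the indexed lookup does not depend on it.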



