Chunks vs Arrays - surprising benchmarking results
Paul Looney
support at ahsomme.com
Fri Aug 7 13:51:51 EDT 2009
Richard,
Very true. Especially regarding the limitations of the filter.
The multiple search works best with hardwired searches and searches
where the user selects from a series of pop-up or radio buttons. It
works well when you are generating canned lists.
PL
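
[Editor's sketch] The two-pass selection described in the quoted message below can be illustrated roughly as follows. This is Python rather than Rev, and the record layout (name, sex) and both function names are hypothetical, chosen only for the example. It shows the ordering idea (most selective test first, broad test on the survivors) and that both approaches return the same records. It does not demonstrate the speed difference, which depends on how the engine evaluates compound selections.

```python
# Illustration only: a Python stand-in for the selections discussed in this thread.
# The (name, sex) record layout and the function names are assumptions.

def compound_select(records, name, sex):
    """Single pass: test both criteria against every record."""
    return [r for r in records if r[0] == name and r[1] == sex]

def two_pass_select(records, name, sex):
    """Two passes: run the selective test first, then the broad test
    only on the (much smaller) list that survived the first pass."""
    by_name = [r for r in records if r[0] == name]   # pass 1: few survivors
    return [r for r in by_name if r[1] == sex]       # pass 2: tiny input

records = [("Jan", "F"), ("Jan", "M"), ("Ann", "F"), ("Bob", "M"), ("Jan", "F")]
result = two_pass_select(records, "Jan", "F")
```

Reversing the passes (sex first, then name) returns the same records but leaves roughly half the list for the second pass, which is the slower ordering the quoted message warns about.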
On Aug 7, 2009, at 10:22 AM, Richard Gaskin wrote:
> Paul Looney wrote:
>> I have nothing to add directly to the chunk vs array discussion
>> (Trevor's reply was very good) but I have often found it helpful
>> to increase the speed of compound selections by breaking them
>> into individual ones.
>> For instance, suppose you have a large database of names and sexes
>> and you want to select every female named "Jan" ("Jan" could be
>> male or female).
>> Select all of the Jans first (this will run much faster than the
>> compound selection).
>> Then select all of the females from the result of the first
>> selection (this will run faster because it is searching only
>> "Jan"s - a very small list).
>> This double selection will run faster than a single compound
>> selection.
>> Obviously this requires a known data-set where one filter will
>> eliminate a lot of records (selecting "female", then selecting
>> "Jan" would be much slower in our example because, presumably,
>> half of the list is female and a small portion is Jan).
>> On many lists this can create a much bigger speed difference than
>> the chunk vs array variance you noted.
>
> One of the tough challenges with this sort of benchmarking is that
> different methods will favor different test cases.
>
> But with delimited rows and columns, I haven't found a way to make
> a two-pass search run faster than one pass, except in very
> specialized cases as you noted.
>
> There's a temptation to use the filter command for the first pass,
> but filter is only faster when testing the first few items;
> filtering on the 10th item is much slower, and attempting to test
> the 50th item in a sample data set caused Rev to hang. RegEx is a
> harsh mistress.
>
> In my case, I don't often know in advance which item will be
> searched. The queries I'm running usually come from a Search dialog
> in which the user can specify criteria. I could make the search
> function smart enough to special-case certain types of searches to
> use a two-pass method in which the first pass is the filter command
> where practical, but the overhead of analyzing both the query and
> the data to make such determinations may detract from the benefits
> of doing so, esp. since my continued testing on this is
> increasingly nudging me toward multi-dimensional arrays anyway.
> Even with the data bloat and the surprising overhead of moving
> arrays in and out of storage, with a little extra work to deal with
> those, the performance of arrays seems unbeatable in the broadest
> range of use cases I've run thus far.
>
> --
> Richard Gaskin
> Fourth World
> Revolution training and consulting: http://www.fourthworld.com
> Webzine for Rev developers: http://www.revjournal.com