Optimization can be tricky

Geoff Canyon gcanyon at gmail.com
Mon Jun 18 17:01:47 EDT 2018


Thanks for all the suggestions -- I tried multiple ideas without much
improvement, then decided to rethink the problem. Roughly:

I have a bunch of source data that I categorized by two numbers, the first
from 1-N where N might be anywhere from 100-5,000, and the second from
1-100. For any given first number, there might be about fifty entries, with
varying values for the second number.

Then for a given query I was trying to:

1. Take a list of 20-100 values for the first number, and for each of
those, a value for the second number,
2. Return a set of about 10-20 entries, where each entry matches one of the
first numbers from the list, and then probabilistically selected based on
the proximity of the second number from the list to the second number of
the item.

(The userSeen aspect was an additional wrinkle)

Since there are about 50 entries for each of the first numbers, the entries
that *might* be returned number [20-100] * 50, or about 1,000-5,000
entries. And everything I tried was too slow, since ideally I want to be
able to do this anywhere from 10-100 times per second.

So I reframed at the source. Instead of keeping all the candidates in a
single array keyed by the first number, I kept them in sub-arrays keyed by
the first digit of the second number.

Then, instead of gathering all the possible values based on the first
number matches, I rotate the list of first number matches each time I
retrieve, and then parse through the resulting matches in order, gathering
any that work, until I have enough to return.

This is certainly not identical to the original method, but it's close
enough, and runs in a small fraction of a second.

thanks again,

gc

On Tue, Jun 12, 2018 at 6:04 PM, Tom Glod via use-livecode <
use-livecode at lists.runrev.com> wrote:

> Thanks for the tip Ralph....love the sound of that filer function.
>
> On Tue, Jun 12, 2018 at 7:00 PM, Curry Kenworthy via use-livecode <
> use-livecode at lists.runrev.com> wrote:
>
> >
> > Optimizing scripts in LC is not the same as running reports in other
> > software suites. If you only need 10 results, you probably don't want to
> > handle all items twice.
> >
> > I hate to loop through all items even once. But if I do, I may be done!
> > I'm not making a full report to print out; I'm just getting my 10 items.
> If
> > I use sort or filter, likewise, I will do whatever necessary to keep it
> > short and sweet, even if that means adjusting some of the starting
> > assumptions.
> >
> > If the sort really is taking more time than the loop, something is
> bogging
> > it down. Look at the random() and arithmetic attached to the sort. That's
> > potentially like running a whole lot of LCS. So if you sort, try a plain
> > numeric sort with no fancy stuff added. Then go grab your 10 - the other
> > requirements can be addressed before or after the sort.
> >
> > Best wishes,
> >
> > Curry Kenworthy
> >
> > Custom Software Development
> > LiveCode Training and Consulting
> > http://livecodeconsulting.com/
> >
> > _______________________________________________
> > use-livecode mailing list
> > use-livecode at lists.runrev.com
> > Please visit this url to subscribe, unsubscribe and manage your
> > subscription preferences:
> > http://lists.runrev.com/mailman/listinfo/use-livecode
> >
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode
>



More information about the use-livecode mailing list