Optimization can be tricky
Tom Glod
tom at makeshyft.com
Mon Jun 11 21:08:42 EDT 2018
Please do, Geoff. This subject is very interesting to me; I have some
problems where I need to optimize in a similar way. Which kind of repeat
loop have you found works fastest? Have you tried "repeat for each key
this_key in array"? Is that slower?
I love saving milliseconds. :) It makes a big difference at scale.
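
For comparing the loop forms, a minimal timing sketch along these lines
might help. The test array, its size of 100000 and all variable names are
arbitrary choices for illustration; only the two repeat forms and "the
milliseconds" are standard LiveCode. Results will depend on the data and
the engine version, so treat it as a starting point, not a benchmark.

```livecode
on mouseUp
   local i, tArray, tKey, tStart, tSum, tReport
   -- build a throwaway test array (the size is arbitrary)
   repeat with i = 1 to 100000
      put i into tArray[i]
   end repeat

   -- form 1: iterate over the array's keys directly
   put 0 into tSum
   put the milliseconds into tStart
   repeat for each key tKey in tArray
      add tArray[tKey] to tSum
   end repeat
   put "for each key:" && (the milliseconds - tStart) && "ms" & cr after tReport

   -- form 2: build the list of keys first, then walk it line by line
   put 0 into tSum
   put the milliseconds into tStart
   repeat for each line tKey in the keys of tArray
      add tArray[tKey] to tSum
   end repeat
   put "for each line in the keys:" && (the milliseconds - tStart) && "ms" after tReport

   -- show both timings in the message box
   put tReport
end mouseUp
```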
On Mon, Jun 11, 2018 at 8:21 PM, Geoff Canyon via use-livecode <
use-livecode at lists.runrev.com> wrote:
> I have a routine that takes about a minute to run for the test case I
> created, and who knows how long for the real use case. Given that I want to
> run it several thousand times for actual work, I need it to run (much)
> faster.
>
> Roughly, the routine gathers together a bunch of data from arrays, sorts
> the data based on its relationship to other arrays, and then returns a
> subset of the result.
>
> My first pass at the routine looked roughly like this:
>
> repeat for each line T in the keys of interestArray[uID]
>    repeat for each line S in storyArray[T]
>       if abs(item 2 of S - item 1 of interestArray[uID][T]) < 20 \
>             and userSeenArray[uID][item 1 of S] < 4
>       then put (101 + userSeenArray[uID][item 1 of S] * 30 + 5 * \
>             abs(item 2 of S - item 1 of interestArray[uID][T]) - \
>             item 2 of interestArray[uID][T]),T,S & cr after candidateList
>    end repeat
> end repeat
> sort lines of candidateList numeric by random(item 1 of each)
>
> In simple terms: parse through the individual lines of all the entries that
> possibly work, calculating a relevant value for each and appending that
> value and the line to an interim list, which then gets sorted, randomly
> favoring lower values.
>
> I assumed the problem was all the line-by-line parsing, and I thought of a
> clever way to accomplish the same thing. That led to this:
>
> put storyArray into R
> intersect R with interestArray[uID]
> combine R using cr and comma
> sort lines of R by (101 + userSeenArray[uID][item 2 of each] * 30 + 5 * \
>       abs(item 3 of each - item 1 of interestArray[uID][item 1 of each]) \
>       - item 2 of interestArray[uID][item 1 of each])
>
> Much simpler, albeit that's a heck of a "sort by" -- more complex by far
> than any I had previously created, and a testament to the power and
> flexibility of "each". Alas, despite condensing my code and removing
> parsing and loops, that version took ten seconds more than the previous
> version, I'm guessing because the previous version has that "if" statement
> that weeds out undesirable entries before the sort has to deal with them.
>
> (I'm writing this email as I parse through this figuring it out)
>
> So it turns out that the crazy use of "each" is only part of the problem --
> changing that line to:
>
> sort lines of R by random(10000)
>
> still takes over 20 seconds -- 3x as fast, but still far too slow. It turns
> out that the source data numbers anywhere from 1,000 to 2,000 lines, so
> sorting it in any way to randomly select the several lines I need is a
> really bad choice. Removing the sort step but keeping everything else cuts
> the execution time down to under a second.
>
> Hmm. How to select several lines at weighted random from among a couple
> thousand? I'll think on this and post a follow-up.
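
One possible way to get the same kind of weighted pick without sorting
every line, offered as an untested sketch and not as anything from Geoff's
follow-up: give each line the key random(score), as the first version's
sort did, but keep only a shortlist of the lowest keys in a single pass.
The handler name lowestRandomKeys and its parameters are made up for
illustration, and it assumes item 1 of each line is a positive integer
score (which random() needs).

```livecode
-- Hypothetical helper, not from the original post.
-- pList:  return-delimited lines whose item 1 is a positive integer score
-- pCount: how many lines to return
function lowestRandomKeys pList, pCount
   local tBest, tKept, tKey, tLine, tWorstKey, tResult
   put 0 into tKept
   repeat for each line tLine in pList
      -- same key the original sort used: lower scores tend to get lower keys
      put random(item 1 of tLine) into tKey
      if tKept < pCount then
         -- still filling the shortlist
         add 1 to tKept
         put tKey & comma & tLine into line tKept of tBest
      else if tKey < tWorstKey then
         -- beats the current worst entry, which is kept on the last line
         put tKey & comma & tLine into line -1 of tBest
      else
         next repeat
      end if
      -- re-sort only the shortlist (at most pCount lines)
      sort lines of tBest ascending numeric by item 1 of each
      put item 1 of line -1 of tBest into tWorstKey
   end repeat
   -- strip the temporary key from each kept line
   repeat for each line tLine in tBest
      put item 2 to -1 of tLine & cr after tResult
   end repeat
   delete char -1 of tResult
   return tResult
end lowestRandomKeys
```

With candidateList built as in the first version, something like
"put lowestRandomKeys(candidateList, 5) into tPicks" would replace the
sort line; the 5 is just an example count. Whether this actually beats the
built-in sort on real data would need timing.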