Optimization can be tricky

chipsm at themartinz.com
Tue Jun 12 03:35:49 EDT 2018

I know that this might be a different use case, but I have a CA Lottery Fantasy Five parser that collects lottery data, loads it into a matrix, and provides the total number of hits for each number. It also has 4 optimization buttons to show the differences. This application was optimized by our Pasadena LiveCode users' group. Drop me an email and I will send you a zipped copy of the script with a sample lotto data file with approximately 8000 lines of picks.
Email: chipsm at themartinz.com

-----Original Message-----
From: use-livecode <use-livecode-bounces at lists.runrev.com> On Behalf Of Geoff Canyon via use-livecode
Sent: Monday, June 11, 2018 5:22 PM
To: How to use LiveCode <use-livecode at lists.runrev.com>
Cc: Geoff Canyon <gcanyon at gmail.com>
Subject: Optimization can be tricky

I have a routine that takes about a minute to run for the test case I created, and who knows how long for the real use case. Given that I want to run it several thousand times for actual work, I need it to run (much) faster.

Roughly, the routine gathers together a bunch of data from arrays, sorts the data based on its relationship to other arrays, and then returns a subset of the result.

My first pass at the routine looked roughly like this:

   repeat for each line T in the keys of interestArray[uID]
      repeat for each line S in storyArray[T]
         if abs(item 2 of S - item 1 of interestArray[uID][T]) < 20 \
               and userSeenArray[uID][item 1 of S] < 4
         then put (101 + userSeenArray[uID][item 1 of S] * 30 + 5 * \
               abs(item 2 of S - item 1 of interestArray[uID][T]) - \
               item 2 of interestArray[uID][T]),T,S & cr after candidateList
      end repeat
   end repeat
   sort lines of candidateList numeric by random(item 1 of each)

In simple terms: parse through the individual lines of all the entries that possibly work, calculating a relevant value for each and appending that value and the line to an interim list, which then gets sorted, randomly favoring lower values.
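For anyone curious what "randomly favoring lower values" means in that sort line, here is a minimal Python sketch of the same idea; the scores and payloads below are invented placeholders, not the original data:

```python
import random

# Placeholder candidate list: (score, payload) pairs, where a lower
# score should be more likely to sort toward the top.
candidates = [(30, "story A"), (120, "story B"), (75, "story C")]

# Mimic LiveCode's `sort ... by random(item 1 of each)`: each line is
# keyed by a random integer drawn from 1..score, so low-score lines
# tend to draw small keys and land earlier in the sorted order.
ordered = sorted(candidates, key=lambda c: random.randint(1, c[0]))

print(ordered)
```

The ordering is random but biased: a score-30 line can never draw a key above 30, while a score-120 line usually draws something larger.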

I assumed the problem was all the line-by-line parsing, and I thought of a clever way to accomplish the same thing. That led to this:

   put storyArray into R
   intersect R with interestArray[uID]
   combine R using cr and comma
   sort lines of R by (101 + userSeenArray[uID][item 2 of each] * 30 + 5 * \
         abs(item 3 of each - item 1 of interestArray[uID][item 1 of each]) \
         - item 2 of interestArray[uID][item 1 of each])

Much simpler, albeit that's a heck of a "sort by" -- more complex by far than any I had previously created, and a testament to the power and flexibility of "each". Alas, despite condensing my code and removing parsing and loops, that version took ten seconds more than the previous version, I'm guessing because the previous version has that "if" statement that weeds out undesirable entries before the sort has to deal with them.
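For readers less familiar with LiveCode's array commands, the intersect/combine steps correspond roughly to this Python sketch (all array contents here are invented for illustration):

```python
# `intersect R with interestArray[uID]` keeps only the keys of R that
# also exist in interestArray[uID]; `combine R using cr and comma`
# then flattens the array into one "key,value" line per entry.
story = {"t1": "s1,42", "t2": "s2,17", "t3": "s3,99"}   # like storyArray
interest = {"t1": "40,5", "t3": "90,2"}                  # like interestArray[uID]

r = {k: v for k, v in story.items() if k in interest}    # intersect
combined = "\n".join(f"{k},{v}" for k, v in r.items())   # combine

print(combined)
```

Note that unlike the first version's "if" test, nothing here filters out undesirable entries, so every surviving line still reaches the expensive sort.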

(I'm writing this email as I work through the problem, figuring it out.)

So it turns out that the crazy use of "each" is only part of the problem -- changing that line to:

   sort lines of R by random(10000)

still takes over 20 seconds -- 3x as fast, but still far too slow. It turns out the source data runs anywhere from 1,000 to 2,000 lines, so sorting it in any way just to randomly select the several lines I need is a really bad choice. Removing the sort step but keeping everything else cuts the execution time to under a second.

Hmm. How to select several lines at weighted random from among a couple thousand? I'll think on this and post a follow-up.
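One standard sort-free approach (a sketch of the general technique, not necessarily what the list eventually settled on) is to build a cumulative weight table once and binary-search it per pick, so selecting k lines costs O(n + k log n) instead of an O(n log n) sort. All names and data below are placeholders:

```python
import bisect
import itertools
import random

def weighted_pick(lines, weights, k):
    """Pick k lines at weighted random (with replacement), no sorting.

    weights[i] is the relative chance of lines[i]; to favor *lower*
    scores as the original sort did, pass e.g. 1.0 / score as the weight.
    """
    cum = list(itertools.accumulate(weights))  # running totals
    total = cum[-1]
    picks = []
    for _ in range(k):
        r = random.uniform(0, total)
        # bisect_left finds the first bucket whose running total covers r
        picks.append(lines[bisect.bisect_left(cum, r)])
    return picks

# Placeholder data: 2,000 lines with made-up scores.
lines = [f"line {i}" for i in range(2000)]
scores = [random.randint(1, 200) for _ in lines]
chosen = weighted_pick(lines, [1.0 / s for s in scores], 5)
print(chosen)
```

Python 3.6+ also offers `random.choices(lines, weights=..., k=5)`, which does the same thing in one call; sampling *without* replacement takes a bit more bookkeeping.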