Optimization can be tricky

Alex Tweedly alex at tweedly.net
Tue Jun 12 13:49:58 EDT 2018


I don't know if these are good ideas or BAD ideas .... and without some 
suitable data, not sure if I could find out - so I'll throw the ideas 
over and let you decide if they are worth trying :-)


1. Two ideas combined

  - avoid scanning for the "item 1" of each line
  - use the (hopefully optimised enough) numeric-index array lookups

So, instead of

       put random(101 + s1 * 30 + 5 * s2 - i2),T,S & cr \
       after candidateList

try

       put T,S  into  candidateArray[count]
       put random(101 + s1 * 30 + 5 * s2 - i2) into candidateValue[count]
       add 1 to count

And then later
    put the keys of candidateValue into candidatelist
    sort numeric the lines of candidatelist by candidateValue of each
and the access via the candidateArray.

2. (even wilder)
Assume you have N entries to consider, and that you only need M results
The *IF* M << N (i.e. much less than - you only need a few results from a large input set).

You might try something like this for the inner loop :
(going back to using lines ...)

   repeat for each line S in storyArray[T]
     put userSeenArray[uID][item 1 of S] into s1
     put abs(item 2 of S - i1) into s2
     if s2 < 20 and s1 < 4 then
       put random(101 + s1 * 30 + 5 * s2 - i2),T,S & cr \
           after candidateList
       add 1 to lineCount
       if lineCount > 4 * M then    -- (why 4 ? - maybe 2, maybe 8 ??)
           sort candidateList numeric by item 1 of each
           delete line M+1 to -1 of candidateList
           put M into lineCount
       end if
     end if
   end repeat


3. even wilder still ....
  - create K different candidateLists (where K is some power of 2)
  - sort each one,
  - mergesort in pairs, stopping when you get to M ...

put 8 into KK    -- (why 8  ? - maybe 16, or 32 ...)
repeat for each key T in interestArray[uID]
   put item 1 of interestArray[uID][T] into i1
   put item 2 of interestArray[uID][T] into i2
   repeat for each line S in storyArray[T]
     put userSeenArray[uID][item 1 of S] into s1
     put abs(item 2 of S - i1) into s2
     if s2 < 20 and s1 < 4 then
       add 1 to setCount
       put random(101 + s1 * 30 + 5 * s2 - i2),T,S & cr \
           after candidateList[setCount]
       if setCount > KK then
           put 0 into setCount
       end if
     end if
   end repeat
repeat with i = 1 to KK
    sort numeric lines of candidateList[i] by item 1 of each
end repeat
and then write / use a mergesort that takes two sorted lists, and returns the first N of the merged list.

(or even better, returns the first N, with a limit value; then each mergesort after the first can use the max value from the N'th entry of earlier merges to allow for early exit...)

That is left as an exercise for the reader :-) :-)

As I say - could be totally crazy, could be worthwhile.

I'd be happy to try out benchmarking it - if I had some representative data to play with.

Alex.

On 12/06/2018 12:03, hh via use-livecode wrote:

> You could try the following:
>
> repeat for each key T in interestArray[uID]
>    put item 1 of interestArray[uID][T] into i1
>    put item 2 of interestArray[uID][T] into i2
>    repeat for each line S in storyArray[T]
>      put userSeenArray[uID][item 1 of S] into s1
>      put abs(item 2 of S - i1) into s2
>      if s2 < 20 and s1 < 4 then
>        put random(101 + s1 * 30 + 5 * s2 - i2),T,S & cr \
>        after candidateList
>      end if
>    end repeat
> end repeat
> sort candidateList numeric by item 1 of each
>
>
>> Ali wrote:
>> repeat for each key T in interestArray[uID]
>>
>>> Geoff wrote:
>>> repeat for each line T in the keys of interestArray[uID]
>>>    repeat for each line S in storyArray[T]
>>> 	 if abs(item 2 of S - item 1 of interestArray[uID][T]) < 20 \
>>> 		   and userSeenArray[uID][item 1 of S] < 4
>>> 	 then put (101 + userSeenArray[uID][item 1 of S] * 30 + 5 * \
>>> 		   abs(item 2 of S - item 1 of interestArray[uID][T]) - \
>>> 		   item 2 of interestArray[uID][T]),T,S & cr after candidateList
>>>    end repeat
>>> end repeat
>>> sort lines of candidateList numeric by random(item 1 of each)
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode





More information about the use-livecode mailing list