[TIP] Sorting by ValueList and synchronized sorting

Richard Gaskin ambassador at fourthworld.com
Tue Jun 17 19:46:33 EDT 2008


Richard Gaskin wrote:
> So this tells me two things:
> 
> 1. Using a function for a sortKey expression introduces a "sometimes" 
> rule in terms of understanding the order of expression evaluation in the 
> engine, in which most of the time functions are evaluated first but in 
> this case the function is applied repeatedly for each line of the sort 
> container as the sort command is run.
> 
> 2. Using a function as a sortKey expression evaluates the sort container 
>   as though by effectively adding data to it, rather than anything 
> necessarily in the data itself.
> 
> Both of these are very valuable insights.  Thanks to all who helped 
> explain this.
> 
> Yes indeed, now I can see many areas where this can be useful, but it 
> also leaves me with a question:
> 
> Given #1 above, how does this affect performance?  Unless there's 
> something ultra-tricky going on (wouldn't be the first time the engine 
> surprised me that way <g>), I would imagine that performance is affected 
> at least linearly, in which the overhead of the function call is 
> multiplied by the number of lines of the container to arrive at the 
> additional performance hit relative to a non-custom sort.
> 
> Perhaps I'll do some benchmarking to verify this theory....

FWIW, I ran some benchmarks and it turns out the theory seems to hold 
up, in which the time taken for a custom function sort is roughly the 
same as the time to call the function multiplied by the number of lines 
of the data being sorted, plus the time of the sort itself.

This isn't surprising given how it works, but may be helpful if an 
alternative algorithm for what you want to accomplish is available.

But given the grace of the syntax and the reasonably good performance of 
the engine overall, if you need to use it you'll likely be hard-pressed 
to come up with anything that does that which is faster.

For benchmarking fetishists, the code I tested is below.....

-- 
  Richard Gaskin
  Fourth World Media Corporation
  ___________________________________________________________
  Ambassador at FourthWorld.com       http://www.FourthWorld.com


on mouseUp
   put 1000 into n -- number of iterations to test
   put fld 1 into tData -- data to sort (mine was 535 lines)
   put the number of lines of tData into tLineNum -- for use in results
   --
   -- 1: Test time to call function
   put the millisecs into t
   repeat n
     get foo()
   end repeat
   put the millisecs - t into t1
   --
   -- 2: Test time to do straight sort
   put tData into tmp
   set the itemdel to tab
   put the millisecs into t
   repeat n
     sort lines of tmp by item 2 of each
   end repeat
   put the millisecs - t into t2
   --
   -- 3: Time to do custom sort:
   put tData into tmp
   put the millisecs into t
   repeat n
     sort lines of tmp by foo()
   end repeat
   put the millisecs - t into t3
   --
   -- Results:
   put "For "&n&" iterations - "&\
       cr&"Time to call foo: "&t1  &\
       cr&"Time to perform simple sort: "&t2 &\
       cr&"Time to perform foo sort: "& t3 &\
       cr&"(foo time * number of lines)+ simple sort time = "&\
         (t1* tLineNum)+ t2
end mouseUp



-- I tested with two variants of foo, each taking different approaches
-- with very different performance times:

--- Serious time-waster:
function foo
   repeat 100
     get random(9999)
   end repeat
   return it
end foo

-- Just complex enough to be measurable at 1000 iterations:
function foo
   get random(9999)
   add 44 to it
   subtract 4444 from it
   put 1* it into it
   put it div 5 + 6 into it
   get it + it / 4
   get round(it) + 10 + 1 + 1
   return it
end foo




More information about the use-livecode mailing list