Efficiency question for list modification

Richard Gaskin ambassador at fourthworld.com
Fri Mar 11 12:36:02 EST 2011


FlexibleLearning wrote:

> Proof of how optimized syntax can make an enormous difference to speed (by
> orders of magnitude in this case).
> 
> This is BAD...
>   repeat for each line L in tData
>     add 1 to n
>     put (item 1 of L/div1) into item 1 of line n of stdout
>     put (item 2 of L/div2) into item 2 of line n of stdout
>   end repeat
> 
> This is GOOD...
>   repeat for each line L in tData
>     put (item 1 of L/div1) &","& (item 2 of L/div2) &CR after stdout
>   end repeat
> 
> The Rule is: Less is More.

Sometimes.  Hard to generalize along those lines when you come to things
like RegEx, which is very compact to write but so enormously generalized
that it's often slower than brute-force parsing using simple chunks.
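
As a rough illustration of that trade-off, here are two ways to pull the
first two comma-separated fields out of a line.  The handler names and
the pattern are made up for this example, and which one wins will depend
on your data, so time both before committing to either.

  -- hypothetical helpers, for illustration only
  function fieldsByRegex pLine
     local tA, tB
     -- matchText fills tA and tB from the two capture groups
     if matchText(pLine, "^([^,]*),([^,]*)", tA, tB) then
        return tA & comma & tB
     end if
     return empty
  end fieldsByRegex

  function fieldsByChunk pLine
     -- plain chunk expression, no pattern involved
     return item 1 to 2 of pLine
  end fieldsByChunk

Note the two are not strictly interchangeable: given a line with no
comma, the regex version returns empty while the chunk version returns
the line itself.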

One non-obvious detail worth noting in your second example is that some
years ago Raney optimized the "after" form of "put" so some of the
overhead you might normally expect under the hood with those sorts of
wholesale block moves is somewhat alleviated, delivering a bit perkier
performance.

But the big tell-tale sign with the example above is the presence of the
line number specifier inside the loop.  I know you know this, but for
the benefit of the many newcomers who've joined this list in recent months:

Expressions relying on delimiters (token, word, item, line) are simple
to write and often very efficient, but on large amounts of data they can
be somewhat slow, especially if used repeatedly within a loop.

In the example above, "line n of stdout" requires that the engine walk
through each character in stdout, counting the return chars as it goes,
stopping only after it reaches n returns.

Doing this once may not be so bad, but within a repeat loop it has to
keep going over and over the same set of characters each time through
the loop.
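
To make that cost concrete, here is a hand-written stand-in for what
resolving "line n of stdout" implies.  It is not the engine's actual
code, just a sketch with a made-up function name that counts how many
characters get visited before the nth return is reached.

  -- hypothetical illustration, not how the engine is implemented
  function charsScannedForLine pText, pLineNum
     local tReturnsSeen, tScanned, c
     put 0 into tReturnsSeen
     put 0 into tScanned
     repeat for each char c in pText
        add 1 to tScanned
        if c is return then
           add 1 to tReturnsSeen
           -- stop once we have passed pLineNum line endings
           if tReturnsSeen >= pLineNum then exit repeat
        end if
     end repeat
     return tScanned
  end charsScannedForLine

Call it with n = 1, 2, 3 and so on against the same text and the count
keeps climbing, which is the repeated work the indexed form pays for on
every pass through the loop.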

Fortunately, as the second example demonstrates, the "repeat for each"
form walks through the data only once, keeping track of where it is in
each iteration and handing the loop the current chunk ready for use.

This avoids the redundant traversal caused by specifying the line number
within the loop, and often results in a performance boost of at least an
order of magnitude, sometimes several.
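
If you want to see the difference on your own machine, a quick timing
harness along these lines should do.  The sample data, the divisors
(2 and 4) and the line count are arbitrary choices of mine, not taken
from the original script, so adjust them to resemble your real data.

  on mouseUp
     local tData, tOut1, tOut2, tStart, tSlowMs, tFastMs, n, i, L
     -- build some sample data: 5000 lines of "i,i*2"
     repeat with i = 1 to 5000
        put i & comma & (i * 2) & return after tData
     end repeat

     -- indexed form: "line n of tOut1" is re-resolved on every pass
     put the milliseconds into tStart
     put 0 into n
     repeat for each line L in tData
        add 1 to n
        put (item 1 of L) / 2 into item 1 of line n of tOut1
        put (item 2 of L) / 4 into item 2 of line n of tOut1
     end repeat
     put the milliseconds - tStart into tSlowMs

     -- append form: each result goes straight onto the end of tOut2
     put the milliseconds into tStart
     repeat for each line L in tData
        put ((item 1 of L) / 2) & comma & ((item 2 of L) / 4) & return \
              after tOut2
     end repeat
     put the milliseconds - tStart into tFastMs

     put "indexed:" && tSlowMs && "ms   append:" && tFastMs && "ms"
  end mouseUp

Try doubling the line count a few times; if the explanation above holds,
the indexed timing should grow much faster than the append timing.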

-- 
  Richard Gaskin
  Fourth World
  LiveCode training and consulting: http://www.fourthworld.com
  Webzine for LiveCode developers: http://www.LiveCodeJournal.com
  LiveCode Journal blog: http://LiveCodejournal.com/blog.irv



