Deleting Data Woefully Slow
Richard Gaskin
ambassador at fourthworld.com
Thu Mar 25 08:08:37 EDT 2010
Kay C Lan wrote:
> What I'm after is the fastest way to take a HUGE amount of data and reduce
> it by roughly 5-10%. The repeat for each code I supplied seems to do that,
> if anyone has any other code that is faster PLEASE provide.
First, I must say how delighted I am to find others fixating on
performance nuances. :) This discussion has been an enjoyable read for
me, and hopefully others as well, as we learn the ways various syntax
options affect what the engine's doing under the hood. Fun stuff, and
very useful.
FWIW, my own tests with various data over the years reflect your
experience, and Raney's description: "repeat for each" is much faster
than "repeat with" or anything involving "get line x" because the line
counting and the parsing out of the line value happen only once per
iteration.
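
For contrast, here is a rough sketch of the slower indexed form. The
handler name and the "obsolete" test are hypothetical stand-ins, since
the actual deletion rule isn't given in this thread:

```livecode
-- Slower form (sketch): "line i of pData" makes the engine count
-- lines from the start of pData again on every pass.
function keepLinesSlow pData
   local tOut, tLine, i
   repeat with i = 1 to the number of lines of pData
      put line i of pData into tLine
      -- hypothetical test for a line that should be dropped
      if tLine contains "obsolete" then next repeat
      put tLine & return after tOut
   end repeat
   delete the last char of tOut -- drop the trailing return
   return tOut
end function
```

On a few hundred lines the cost is hard to notice; it grows quickly as
the line count climbs, because each lookup starts over from line 1.
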
Moreover, Raney once noted that he took the time to optimize
"put...after..." specifically for cases like yours, so it could be used
in conjunction with "repeat for each" to build a subset of data.
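
A minimal sketch of that combination, assuming the data is a
return-delimited list. The handler name, the field name, and the
"obsolete" test are all hypothetical placeholders for your own:

```livecode
-- Builds a new list of the lines worth keeping instead of deleting
-- lines from the original in place.
function keepLines pData
   local tOut, tLine
   repeat for each line tLine in pData
      -- hypothetical test for a line that should be dropped
      if tLine contains "obsolete" then next repeat
      put tLine & return after tOut
   end repeat
   delete the last char of tOut -- drop the trailing return
   return tOut
end function
```

It could be called as something like:

```livecode
put keepLines(field "Data") into field "Data"
```

The point of the design is that nothing is ever deleted from the source
list; the keepers are appended to a fresh variable, which is the case
"put...after..." is said to be optimized for.
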
Split/combine require a lot of overhead, since under the hood they're
effectively calling a form of "repeat for each" to set up the array
values, with the additional overhead of creating the hash table as it
goes. Useful as they are, for your purposes I'd be surprised if
translating the list to an array was less than 15% slower.
Using "filter" works well in many cases, but as noted elsewhere RegEx is
a complex and highly generalized subsystem, designed to optimize
programmer efficiency at the expense of runtime efficiency. So while it
can be handy to reduce complex filtering to a single line of code, it
rarely outperforms "repeat for each".
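
For comparison, a one-line "filter" version might look like the sketch
below. It assumes the unwanted lines can be described by a single
pattern (here a simple wildcard), and the handler name, field name, and
"obsolete" text are again hypothetical:

```livecode
on trimWithFilter
   local tData
   put field "Data" into tData
   -- remove every line that matches the pattern
   filter tData without "*obsolete*"
   put tData into field "Data"
end trimWithFilter
```

It is shorter to write, so it is worth timing against the loop on your
own data before choosing.
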
In short, I think you're on the right track. Traversing more than a
million lines will always be time-intensive, but using "repeat for
each" with "put...after..." should cut that down about as far as one
can reasonably expect.
--
Richard Gaskin
Fourth World
Rev training and consulting: http://www.fourthworld.com
Webzine for Rev developers: http://www.revjournal.com
revJournal blog: http://revjournal.com/blog.irv