Another empirical speed test
Cubist at aol.com
Sat Mar 12 02:30:36 EST 2005
What I've got to say here isn't new, but I think it's worth bringing it up
again for the benefit of those who joined the Revolution *after* the last
time it was mentioned here...
In the project I'm working on now, I want to take a generic CSV file and
split it up into its component columns -- for instance, a CSV file whose lines
fit the pattern "a,b,c,d,e" would be five columns. Sadly, there isn't any
built-in function that does this, so I'll have to roll my own. And as you can well
imagine, speed is an issue. So I worked up a couple of different methods, and
clocked them, and I'm going to present the results (which, as I stated above,
will not surprise any of the 'old hands').
Details of the setup: The test data was 5,175 lines of Apple stock-price
data, from 1984 to the present. All tests conducted on a 400-MHz G3 'pismo'
PowerBook running MacOS 9.1, with 320 MB of RAM. Faster machines will of course
yield faster times, but your relative rankings (i.e., "*this* is X% faster than
*that*") should be about the same as mine.
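For anyone who wants to repeat the measurements, here is a minimal timing
sketch. How the timings in this post were actually taken isn't stated, so
wrapping each test in "the milliseconds" is an assumption, and StartTime and
Elapsed are made-up variable names.
# Timing sketch (assumed harness, not necessarily the one used for this post)
put the milliseconds into StartTime
# ... code under test goes here ...
put (the milliseconds - StartTime) / 1000 into Elapsed
put "Elapsed:" && Elapsed && "seconds"
The last line has no destination, so the result shows up in the message box.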
# Test 1 code
repeat with K1 = 1 to the number of lines in DerData
   put item 3 of line K1 of DerData into line K1 of DerData
end repeat
# Time: 18 seconds
# Test 2 code
put "" into Rezult
repeat (the number of lines in DerData)
   # note: "before" leaves Rezult in reverse line order
   put (item 3 of line 1 of DerData) & return before Rezult
   delete line 1 of DerData
end repeat
# Time: 6.7 seconds
# Test 3 code
put "" into Rezult
repeat for each line LL in DerData
   put return & (item 3 of LL) after Rezult
end repeat
# !!! !!! !!! -- Time: < .03 seconds -- !!! !!! !!!
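Yes, Virginia, "repeat for each" is roughly 600 times faster than "repeat with
K1" (18 seconds versus < .03), which is just shy of THREE BLEEDING ORDERS OF
MAGNITUDE! The reason is that "line K1 of DerData" has to count lines from the
top of DerData on every pass, while "repeat for each" walks through the data
just once. It's true that "repeat for each" doesn't give you a counter, as
"repeat with K1" does... but with this kind of speed difference, you can afford
to roll your own and slip an "add 1 to MyCounter" into your loop, right?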
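A minimal sketch of that roll-your-own counter, for illustration only: it
reuses Test 3 and prefixes each extracted value with its line number. The tab
separator and the output layout are assumptions; MyCounter is the name
suggested above.
# Sketch: "repeat for each" with a hand-rolled counter
put 0 into MyCounter
put "" into Rezult
repeat for each line LL in DerData
   add 1 to MyCounter
   put MyCounter & tab & (item 3 of LL) & return after Rezult
end repeat
I haven't timed this variant, so the "< .03 seconds" figure shouldn't be
assumed to carry over exactly.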
One more thing: I quoted the "repeat for each" time as "< .03" because I
got slightly different timings when I tried it with various item-numbers. For
the record:
item 2 of LL, .024 seconds
item 5 of LL, .027 seconds
item 7 of LL, .029 seconds
My test-data had only 7 items per line, so I wondered if it would make any
difference whether I used "item 7" or "item -1". It did, like so:
item -1 of LL, .036 seconds
In other words, "item -1" took about 24% longer than "item 7" (.036 versus
.029 seconds). It would appear that counting backwards carries a nontrivial
overhead.
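If the last column is wanted and every line has the same number of items, one
possible way around the backwards count is to look up the item count once and
then use a positive index. This is a sketch only: it was not part of the
timings above, LastItem is a made-up name, and it assumes line 1 is
representative of the whole file.
# Sketch: fetch the last item by positive index (untimed)
put the number of items in line 1 of DerData into LastItem
put "" into Rezult
repeat for each line LL in DerData
   put return & (item LastItem of LL) after Rezult
end repeat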
You may now return to your normal programming...