How to use an array to solve the following...

Kay C Lan lan.kc.macmail at gmail.com
Tue Feb 21 00:37:23 EST 2012


On Tue, Feb 21, 2012 at 10:20 AM, Geoff Canyon Rev <gcanyon+rev at gmail.com> wrote:

> For items, lines, and words, using
> "item/line/word of myContainer" gets worse the larger the container is. For
> character and with arrays, it doesn't.
>

Excellent rules of thumb, though there is a caveat to all this.

Unfortunately I haven't seen a further response from Glen. I was going to
wait and ask one more question before offering further suggestions, but
I'll offer them now anyway.

Glen doesn't mention what the final use/access of the data will be. If you
are only ever going to deal with the data as a whole, then repeat for each
line will generally be the fastest. On the other hand, if after preparing
all your lists and merging them, the final purpose is to pick small bits
and pieces out of the result from here, there and anywhere, arrays (or a
db) might be better.

So the caveat is, always test and compare. It might be faster to create and
merge the data using repeat for each, but slower to access it that way. It
might be slower to merge the data using arrays, but faster to access it in
its final format.

Here are some comparisons I carried out.

I created a variable of 1000000 lines, each containing 11 items similar to
Glen's data.
I then created an array of the same data. It took longer to create the
array (see the results below).
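
For anyone who wants to repeat the test, here is a minimal sketch of how
such data could be built. It is not the script I originally used; the
handler name buildTestData and the "data" placeholder values are
assumptions made up for illustration, and only the shape (an id followed
by 10 data items per line) follows the description above.

```livecode
-- Hypothetical handler: fills rStore with pLineCount comma-delimited
-- lines and rArray with the same values as a two-level array.
-- The "data" values are placeholders, not Glen's real data.
command buildTestData pLineCount, @rStore, @rArray
   local tLine, tLineNum, tItemNum
   put empty into rStore
   put empty into rArray
   repeat with tLineNum = 1 to pLineCount
      -- item 1 is the id, items 2 to 11 are the data
      put tLineNum into tLine
      repeat with tItemNum = 1 to 10
         put comma & "data" & tItemNum after tLine
         put "data" & tItemNum into rArray[tLineNum][tItemNum]
      end repeat
      put tLine & return after rStore
   end repeat
   delete the last char of rStore -- drop the trailing return
end buildTestData
```

Called as buildTestData 1000000, tStore, aStore this should leave the list
in tStore and the array in aStore. Timing the two inner put statements
separately would be needed to compare creation times fairly.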

I then accessed the data 10000 times in the following ways.

put item 2 of line 1 of tStore into tStore2
-- because item 1 is an id, not data
put item 11 of line 1000000 of tStore into tStore2
put item -1 of line 1 of tStore into tStore2
put item -1 of line -1 of tStore into tStore2
put aStore[1][1] into tStore2
put aStore[1000000][10] into tStore2
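
Each access was timed over the full number of cycles. A rough sketch of
that kind of timing loop is below. The function names timeChunkAccess and
timeArrayAccess are hypothetical, and this is an illustration of the
approach rather than the original script.

```livecode
-- Hypothetical helper: milliseconds taken by pCycles direct chunk
-- references to the last item of the last line of pStore.
function timeChunkAccess @pStore, pCycles
   local tStart, tStore2
   put the milliseconds into tStart
   repeat pCycles times
      put item -1 of line -1 of pStore into tStore2
   end repeat
   return the milliseconds - tStart
end timeChunkAccess

-- Hypothetical helper: milliseconds taken by pCycles array lookups
-- of the element stored under pArray[pKey][pChild].
function timeArrayAccess @pArray, pKey, pChild, pCycles
   local tStart, tStore2
   put the milliseconds into tStart
   repeat pCycles times
      put pArray[pKey][pChild] into tStore2
   end repeat
   return the milliseconds - tStart
end timeArrayAccess
```

The containers are passed by reference (the @) so that the timing does not
include copying a million lines into the function.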

For the final 2 cases, which use repeat for each, I had to add the overhead
of an if statement to test that I was at the right line, and then simply
put the 1st or last item into tStore2. In both cases I immediately exit
repeat so that it did not waste cycles processing lines it didn't have to.
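
In outline, the repeat for each version looks something like the sketch
below. The function name lastItemOfLine and the counter variable are
assumptions for illustration, not the original script.

```livecode
-- Hypothetical function: walks pStore line by line and stops as soon
-- as line number pLineNum is reached, returning its last item.
function lastItemOfLine @pStore, pLineNum
   local tCount, tLine, tStore2
   put 0 into tCount
   put empty into tStore2
   repeat for each line tLine in pStore
      add 1 to tCount
      if tCount = pLineNum then
         put item -1 of tLine into tStore2
         exit repeat -- don't process lines we don't need
      end if
   end repeat
   return tStore2
end lastItemOfLine
```

Asking for line 1 exits on the first pass, while asking for the last line
has to walk the whole container once, which is why the two timings differ.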

Here are the results:

Created tStore in 4962ms
Created aStore[][] in 30463ms
For 10000 cycles.
Finding the 1st item of the 1st line using direct reference = 3ms
Finding the last item of the last line using direct reference = 440294ms
Finding the -1 item of the 1st line using direct reference = 5ms
Finding the -1 item of the -1 line using direct reference = 733095ms
Finding the first child of the first key = 3ms
Finding the last child of the last key = 3ms
Finding 1st item of 1st line using repeat for each line = 0ms
Finding last item of last line using repeat for each line = 454ms

The results speak for themselves. Repeat for each can be blindingly fast,
but naturally, as Geoff pointed out, it slows the further into the data you
need to go, and will therefore give an inconsistent feel to a user. Arrays
on the other hand might not be the fastest, but they are not slow, and will
always respond the same way no matter where in the data you look.

Whatever you do, avoid using item -1 of line -1 !!!!

One last thing. Normally when I do these speed tests I don't 'do' anything
else on my computer, although other apps might be loaded in the background.
In this case, because it was taking so long, I did read some emails, which
is probably closer to real-life conditions. I mean, who would sit doing
nothing for 12 min waiting for an item -1 of line -1 script to complete?

LC offers many ways to skin the cat. The method you choose should not be
based just on its size; consider the final purpose: whether you want the
innards to eat once, the skin to wear over and over, or both.

My original post included the script I used, but it was rejected for being
too long, so I've removed it.

Anyone want to test the speed of finding data in a 1000000 line variable
using char -1 of word -1 of item -1 of line -1?


