How to use an array to solve the following...
Glen Bojsza
gbojsza at gmail.com
Tue Feb 21 03:29:49 EST 2012
Sorry got caught up on an issue...
So far things look very fast and seem to give the results I need (will
confirm as I use larger data sets that have a known answer).
But this does bring up a question about doing more than planned with the
final lists...
If the final list is kept in sequential order based on the xs column I
thought that either an sqlite database or arrays could be used for a basic
query. I prefer arrays since it is probably easier to use for the desired
result.
The query would be based on the user selecting a starting xs value and an
ending xs value, with the result being the rows between them (including the
starting and ending value rows).
Everything would be based on this query of the final list.
For example, say a user wants a starting xs value of 10 and an ending xs value
of 40. I think that if the keys are sequential (the keys being the first
column, xs) then arrays would be a fast solution, since you should
be able to determine the starting line and ending line of the array without
needing to cycle through every line, and then just produce a new array /
list. Again, is this a good use for an array? A database would work and be
more flexible if the query was going to have multiple requirements, but for
my case it would be overkill... true? Or will Kay prove it to be faster
with just lists :-)
xs wt gt
10 6 32 <--starting value
20 0 0
30 0 0
40 0 67 <--ending value
50 0 0
70 0 0
80 7 0
90 0 0
120 0 0
130 23 55
produces a new list / array
xs wt gt
10 6 32
20 0 0
30 0 0
40 0 67
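
A minimal sketch of the range lookup described above, written in Python purely for illustration (it is not LiveCode and not code from this thread). It assumes the rows are already sorted by xs and uses a binary search on the xs column to find the first and last matching rows without visiting every line. The names `rows`, `xs_keys` and `rows_in_range` are made up for this example.

```python
from bisect import bisect_left, bisect_right

# Rows from the example above as (xs, wt, gt), already sorted by xs.
rows = [
    (10, 6, 32), (20, 0, 0), (30, 0, 0), (40, 0, 67), (50, 0, 0),
    (70, 0, 0), (80, 7, 0), (90, 0, 0), (120, 0, 0), (130, 23, 55),
]

# The xs column on its own; built once and reused for every query.
xs_keys = [row[0] for row in rows]


def rows_in_range(sorted_rows, sorted_keys, xs_start, xs_end):
    """Return the rows whose xs lies in [xs_start, xs_end], inclusive."""
    first = bisect_left(sorted_keys, xs_start)   # index of first xs >= xs_start
    last = bisect_right(sorted_keys, xs_end)     # index just past last xs <= xs_end
    return sorted_rows[first:last]


selected = rows_in_range(rows, xs_keys, 10, 40)
for xs, wt, gt in selected:
    print(xs, wt, gt)
```

Because the search only works on sorted keys, whatever structure holds the final list has to be kept in xs order (or re-sorted after each merge) for this approach to apply.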
On Mon, Feb 20, 2012 at 10:37 PM, Kay C Lan <lan.kc.macmail at gmail.com> wrote:
> On Tue, Feb 21, 2012 at 10:20 AM, Geoff Canyon Rev <gcanyon+rev at gmail.com> wrote:
>
> > For items, lines, and words, using
> > "item/line/word of myContainer" gets worse the larger the container is.
> > For character and with arrays, it doesn't.
> >
>
> Excellent rules of thumb, though there is a caveat to all this.
>
> Unfortunately I haven't seen a further response from Glen, as I was going
> to wait to ask one more question, before offering further suggestions; but
> I'll now offer it anyway.
>
> Glen doesn't mention what the final use/access of the data will be. If you
> are only ever going to deal with the data as a whole, then repeat for each
> line will generally be the fastest. On the other hand, if after preparing
> all your lists and merging them, the final purpose is to pick small bits
> and pieces out of it from here, there and anywhere, arrays (or a db) might
> be better.
>
> So the caveat is, always test and compare. It might be faster to create and
> merge the data using repeat for each, but slower to access it that way. It
> might be slower to merge the data using arrays, but faster to access it in
> its final format.
>
> Here are some comparisons I carried out.
>
> I created a variable of 1000000 lines, each containing 11 items similar to
> Glen's data.
> I created an array of the same data. It took longer to create the array.
>
> I then accessed the data 10000 times in the following ways.
>
> put item 2 of line 1 of tStore into tStore2
> --item 2 because item 1 is an id, not data
> put item 11 of line 1000000 of tStore into tStore2
> put item -1 of line 1 of tStore into tStore2
> put item -1 of line -1 of tStore into tStore2
> put aStore[1][1] into tStore2
> put aStore[1000000][10] into tStore2
>
> For the final 2 cases, using repeat for each, I had to add the overhead of
> an if statement to test that I was at the right line, and then simply put
> the 1st or last item into tStore2. In both cases I immediately exit repeat
> so that it did not waste cycles processing lines it didn't have to.
>
> Here are the results:
>
> Created tStore in 4962ms
> Created aStore[][] in 30463ms
> For 10000 cycles.
> Finding the 1st item of the 1st line using direct reference = 3ms
> Finding the last item of the last line using direct reference = 440294ms
> Finding the -1 item of the 1st line using direct reference = 5ms
> Finding the -1 item of the -1 line using direct reference = 733095ms
> Finding the first child of the first key = 3ms
> Finding the last child of the last key = 3ms
> Finding 1st item of 1st line using repeat for each line = 0ms
> Finding last item of last line using repeat for each line = 454ms
>
> The results speak for themselves. Repeat for each can be blindingly fast,
> but naturally, as Geoff pointed out, slows the further into the data you
> need to go, and will therefore give an inconsistent feel to a user. Arrays
> on the other hand might not be the fastest, but they are not slow, and will
> always respond the same way no matter where in the data you look.
>
> Whatever you do, avoid using item -1 of line -1 !!!!
>
> One last thing. Normally when I do these speed tests I don't 'do' anything
> else on my computer, although other apps might be loaded in the background.
> In this case, because it was taking so long, I did read some emails, which
> is probably closer to real-life conditions. I mean, who would sit doing
> nothing for 12 min waiting for an item -1 of line -1 script to complete?
>
> LC offers many ways to skin the cat. The method you choose should not be
> based just on its size, but should consider the final purpose: whether you
> want the innards to eat once, the skin to wear over and over, or both.
>
> My original post included the script I used, but it was rejected for being
> too long, so I've removed it.
>
> Anyone want to test the speed of finding data in a 1000000 line variable
> using char -1 of word -1 of item -1 of line -1?