Efficiency question for list modification
Nonsanity
form at nonsanity.com
Thu Mar 10 16:25:23 EST 2011
I didn't use that style because he mentioned he tried it without much
success. I tried it on the straight-up 100,000-line pass and it finished in
4 seconds. Hands down the fastest. I should have tried that on my own just
for completeness' sake.
I guess I was too taken with the faster results I got from chunking the
data. This happens to researchers who aren't careful about what they pay
attention to. It can lead to all sorts of horrendous things... Like
reporting positive results for psi experiments or homeopathy. :)
Hidden assumptions taint research results. My bad here.
The stack has been updated to include the fifth button that uses that style.
Result: 1 second. :)
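
For anyone who doesn't want to download the stack, the "repeat for each"
approach can be sketched roughly as below. This is only an illustration
under assumptions, not the script from the test stack: the handler name
divideEachLine, its parameter names, and the field names in the usage line
are all made up for the example.

```livecode
-- Hypothetical handler; name and parameters are assumptions for illustration.
function divideEachLine pData, pDivisor1, pDivisor2
   local tOutput
   -- Walk the input once, line by line. pData is only read inside the
   -- loop, never modified, which is the caveat Bob raised.
   repeat for each line tLine in pData
      -- Build the result in a separate variable, one line per pass.
      put (item 1 of tLine / pDivisor1) & comma & \
            (item 2 of tLine / pDivisor2) & return after tOutput
   end repeat
   -- Remove the trailing return left by the last pass.
   delete char -1 of tOutput
   return tOutput
end divideEachLine
```

It could be called with something like
put divideEachLine(field "Input", 2, 5) into field "Output"
(both field names are placeholders). Appending to a separate output variable
keeps the original line order and avoids touching the source data while the
loop is running.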
~ Chris Innanen
On Thu, Mar 10, 2011 at 3:49 PM, Bob Sneidar <bobs at twft.com> wrote:
> You should also try using the form:
>
> repeat for each line theLineValue in theData
>
> Apparently this creates an internal array of theData and is much faster.
> The big caveat is that you do not alter what theData contains while in the
> repeat loop, as this will really screw things up. That is because this form
> accesses the memory of theData directly, and if you alter it, the system may
> do some house cleaning, and what once was there may be garbage or another
> bit of memory. It really gets ugly.
>
> Bob
>
>
> On Mar 10, 2011, at 11:42 AM, Nonsanity wrote:
>
> > I made a quick test stack to try out a few ideas:
> >
> > http://dl.dropbox.com/u/144280/Divide%20List%20Tests.livecode
> >
> > It generates 100,000 random integer pairs into one field, then has four
> > buttons to do the sample division you gave to the two items in each line.
> >
> > The first is a straight-up "repeat with a = 1 to the number of lines"
> > making sure to copy the data from the field into a variable, and write
> > the new data into a variable before saving it to the output field.
> > (Because working directly with fields is slower than pure variables.)
> >
> > The second cuts the input list into ten 10,000 line chunks, and runs
> > through each of those in turn, tacking it all onto one output variable
> > as it goes.
> >
> > The third does the same thing, but cuts the input into a hundred 1,000
> > line chunks.
> >
> > And the fourth brings it down to a mere 100 lines per chunk. (But a
> > thousand chunks.)
> >
> > Times were as follows on my not-so-hot machine:
> >
> > Generate Randoms: 2 seconds
> > Divide all 100,000: 77 seconds
> > Divide in groups of 10,000: 10 seconds
> > Divide in groups of 1,000: 7 seconds
> > Divide in groups of 100: 7 seconds
> >
> > Those totals are for doing ALL 100,000 regardless of how they are broken
> > down. I get the same output for all of them.
> >
> > So you can see that the slowdown is probably in accessing "line 84,932"
> > etc. of the input string. Whereas limiting the line numbers to less than
> > 10,000 makes a HUGE difference in speed. And even a bit better with a
> > line limit of 1,000. Any further savings of using tiny 100-line chunks
> > is lost by having to cut out 1,000 different chunks.
> >
> > But this should show where the slowdown is, and offer a way to work
> > around it.
> >
> > When the cookie is too big, break off what you can chew.
> >
> >
> > ~ Chris Innanen
> > ~ Nonsanity
> >
> >
> > On Thu, Mar 10, 2011 at 1:51 PM, FlexibleLearning <
> > admin at flexiblelearning.com> wrote:
> >
> >> Problem:
> >> I have a long list of several thousand lines.
> >> Each line contains two comma-separated numbers.
> >> I want to divide the first item of each line by one divisor, and divide
> >> the second item of each line by a different divisor.
> >> The list order must stay the same.
> >>
> >> Example:
> >> Using 2 and 5 as divisors...
> >> 10,10
> >> 12,15
> >> 8,12
> >> would become
> >> 5,2
> >> 6,3
> >> 4,2.4
> >>
> >> Options:
> >> Using "repeat with n=1 to num of lines" takes far too long.
> >> Using "repeat for each line L" either attempts to modify read-only data,
> >> or is only 25% faster using a dumping variable.
> >> Using split/combine will mess up the ordering (numeric array keys are
> >> not sorted numerically with combine).
> >>
> >> Any other ideas?
> >>
> >> Hugh Senior
> >> FLCo
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> use-livecode mailing list
> >> use-livecode at lists.runrev.com
> >> Please visit this url to subscribe, unsubscribe and manage your
> >> subscription preferences:
> >> http://lists.runrev.com/mailman/listinfo/use-livecode
> >>