Manipulating large qtys of text - Wow!

Sarah sarahr at genesearch.com.au
Mon May 20 05:40:23 EDT 2002


I had a routine that stepped through over 3000 lines of data and for 
each line, it had to extract a line number and use that to get data out 
of another list. I was already using a "repeat for each line"  loop and 
working entirely with variables but thanks to this thread I found 2 
other major ways to speed things up.

The first was cutting down on progress updates. Instead of updating my 
progress bar after each line, I now only update every 100 lines. After 
making this change, my routine took 495 ticks to complete - down from 
1249.
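
For anyone wanting to try the same thing, here is a minimal sketch of the throttled update. The handler name, the scrollbar name "Progress" and the variable names are made up for illustration; they are not from my actual script.

```livecode
on processData pData
  local tCount
  -- "Progress" is a hypothetical scrollbar control used as a progress bar
  set the endValue of scrollbar "Progress" to the number of lines of pData
  put 0 into tCount
  repeat for each line tLine in pData
    add 1 to tCount
    -- per-line work goes here
    if tCount mod 100 = 0 then
      -- redraw the bar only once every 100 lines
      set the thumbPosition of scrollbar "Progress" to tCount
    end if
  end repeat
  -- final update so the bar ends up full
  set the thumbPosition of scrollbar "Progress" to tCount
end processData
```

The interval of 100 is just what worked for me; the right figure will depend on how long each line takes to process.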

Then I read this email and thought I would try using an array for the 
first time. I took the list of data that I refer to by line number 
(about 900 lines) and used the split command to change it into an array 
with a single line of script. Then instead of getting line x of theList, 
I get theList[x]. This dropped the total execution time to 61 ticks!

So with two simple changes, my routine time dropped from over 20 seconds 
(1249 ticks, at 60 ticks per second) to just 1 - that's just incredible. 
I had thought that converting a list to an array might take some time, 
but the split command is like lightning.
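
A rough sketch of the array version, in case it helps someone. The function and parameter names are hypothetical, and it assumes the line number is the first item of each data line, which may not match your data.

```livecode
function lookupLines pData, pList
  local tNum, tResult
  -- one statement turns the return-delimited list into an array
  -- keyed 1, 2, 3... by line number
  split pList by return
  repeat for each line tLine in pData
    -- assumption: the line number is item 1 of each data line
    put item 1 of tLine into tNum
    -- direct array lookup instead of "line tNum of pList"
    put pList[tNum] & return after tResult
  end repeat
  delete last char of tResult -- drop the trailing return
  return tResult
end lookupLines
```

The split is done once, before the loop, so the cost of building the array is paid a single time rather than on every pass.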

I have also discovered the timing feature of the script debugger which 
makes it very easy to track down which parts of your script need 
optimising. Everything happens a bit slower in debug mode, but the 
relative values are still useful.

Thanks,
Sarah


> It's not the difference between creating a new variable and changing an 
> existing one. It's that, when you're changing an existing variable, 
> you're doing it by changing a particular line. That means that when 
> tNum is 4999,
>
>   put tNum into item 1 of line tNum of tVar
>
> forces the engine to go through tVar looking for the 4998th carriage 
> return; then when tNum is 5000 the engine has to go through again 
> looking for the 4999th carriage return, etc. The larger the number of 
> lines, the worse the problem gets.
>
> Building up the variable from scratch means that the engine only has to 
> put things at the end, so it's a lot faster, and the speed should go 
> linearly with the size of the task.
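
The approach described in the quoted text can be sketched roughly as follows. This is only an illustration with made-up names (numberLines, pVar, tOut), not code from the thread: it builds a new variable by appending instead of writing into line tNum of the existing one.

```livecode
function numberLines pVar
  local tNum, tNew, tOut
  put 0 into tNum
  repeat for each line tLine in pVar
    add 1 to tNum
    -- work on a copy rather than changing the loop variable
    put tLine into tNew
    put tNum into item 1 of tNew
    -- appending never needs a scan for the Nth carriage return
    put tNew & return after tOut
  end repeat
  delete last char of tOut -- drop the trailing return
  return tOut
end numberLines
```
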
> --
> regards,
>
> Geoff Canyon
> gcanyon at inspiredlogic.com




