Making Revolution faster with really big arrays

Dennis Brown see3d at writeme.com
Tue Apr 26 20:06:22 EDT 2005


On Apr 14, 2005, at 10:09 AM, Dennis Brown wrote:

> Dick,
>
> Thanks for your various tips and help online and offline.  I need to 
> take a week to digest and incorporate the ideas into a real piece of 
> code that I hope will fly like the wind --then I will share my code 
> and a data generator for you and everyone else to see and rip to 
> shreds ;-)
>
> Thanks again,
> Dennis
>
>

I have been studying methods of making things go faster with the 
numeric processing of big arrays during the last couple of weeks (while 
I have been writing a lot of actually useful processing scripts).  
However, I thought I should provide some simplified scripts to make it 
easier to separate out the essential factors that I have learned so 
far.

The first thing I have found is that accessing the array elements with 
the repeat for each is very fast, but does not allow for accessing more 
than one array at a time.  This is a shame and should be corrected by 
adding or modifying some Transcript operators.

The second thing I have found is that if you want to do just a few 
mathematical operations, the times really start adding up quickly.  Do 
a few adds, rounds, etc. and accessing the data is not the worst of the 
bottlenecks.  I used to have a rule of thumb that a P code compiler 
runs 16 times slower than a machine code compiler, and an interpreter 
was 256 times as slow for number crunching.  Transcript seems to be 128 
times as slow.  Not as optimized as I would have hoped for.  All my 
large array stuff runs about 4 times slower than I had hoped, but it is 
within the realm that I can make it work for me --with a bit of doing 
things in more complicated ways than I would like and taking a lot more 
time with work arounds.

Here are the scripts I came up with for adding each element of a 1000 
line by 1000 item text array to each other.  I know there are much 
faster ways of doing this particular operation, but I wanted to use it 
as a simplified example of accessing parallel elements in two (or more) 
arrays at the same time to see how fast different methods would be.

Method1: 1.7 s =1x
This is how fast Rev would be if it had the proper Transcript support 
for doing this along the lines of a repeat for each.
Method2a: 10.8 s = 6.35x
This is using a repeat for each on one array and a character pointer on 
the other array
Method2b: 18 s =10.6x
This is using a repeat for each on one array and simulating the code 
for new Transcript functions/commands for sequential access
Method3:  7.9 s =4.6x
This is using a repeat for each on one array and the split command to 
access the other array as an indexed array
Method4: 12.6 s =7.4x
This is using the split command on both arrays and accessing each as an 
indexed array

Can anyone think of another way to do this that would be faster (for 
accessing parallel arrays, not the trivial adding them together 
operation I am doing here).  Of course the times are for my machine and 
will vary on yours, but the time relative to Method1 should be noted.  
If you can figure out how to make my examples run faster, I will also 
learn more about this problem.

Thanks,
Dennis

THE SCRIPT:

global gTestArrayX,gTestArrayY

--put the following lines into the menu items of a menu button
Init
Method1
Method2a
Method2b
Method3
Method4
--put the following into the script of the button

on menuPick x
   put the long seconds into s
   send x to me
   put space & (the long seconds -s) after msg
end menuPick

on Init --create two test string arrays 1000x1000 numbers
   put empty into gTestArrayX
   put empty into gTestArrayY
   repeat with i=1 to 1000
     repeat with j=1 to 1000
       put j&comma after gTestArrayX
     end repeat
     put cr into last char of gTestArrayX
   end repeat
   put gTestArrayX into gTestArrayY
   put "Init:"
end Init

on Method1 --just repeat for each on the two arrays as best case 
reference timing (can't add them directly this way)
   put 0 into total
   repeat for each line n in gTestArrayX
     repeat for each item x in n
       add x to total
     end repeat
   end repeat
   repeat for each line n in gTestArrayY
     repeat for each item x in n
       add x to total
     end repeat
   end repeat
   put total&&"Reference(Method1):"
end Method1

on Method2a --Add lines in two text arrays using repeat for each and 
sequential processing method
   put 0 into total
   put 1 into i --character index into gTestArrayY
   repeat for each line n in gTestArrayX
     repeat for each item x in n
       --
       put empty into itemY
       repeat forever --get next item in gTestArrayY
         get char i of gTestArrayY
         add 1 to i
         if (it=comma)or(it=cr)or(it=empty) then exit repeat
         put it after itemY
       end repeat
       --
       add itemY+x to total
     end repeat
   end repeat
   put total&&"Sequential(Method2):"
end Method2a

on Method2b --Add lines in two text arrays using repeat for each and 
simulated Rev sequential functions
   put 0 into total
   put 1 into i --character line index into gTestArrayY
   repeat for each line n in gTestArrayX
     --put nextLine(offset,string) This should be a primitive Rev 
command or function
     put empty into lineY
     repeat forever --get next item in gTestArrayY
       get char i of gTestArrayY
       add 1 to i
       if (it = cr)or(it=empty) then exit repeat
       put it after lineY
     end repeat
     --end of nextLine(offset,string)
     put 1 into j --character item index into lineY
     repeat for each item x in n
       --put nextItem(offset,string) This should be a primitive Rev 
command or function
       put empty into itemY
       repeat forever --get next item in gTestArrayY
         get char j of lineY
         add 1 to j
         if (it = comma)or(it = empty) then exit repeat
         put it after itemY
       end repeat
       --end of nextItem(offset,string)
       add itemY+x to total
     end repeat
   end repeat
   put total&&"Sequential(Method2):"
end Method2b

on Method3 --Add lines in two text arrays using repeat for each and 
split array processing method
   put 0 into total
   put gTestArrayY into y --I do not split the global directly, because 
I don't want to destroy it
   split y by cr
   put 0 into lineCount
   repeat for each line n in gTestArrayX
     add 1 to lineCount
     put y[lineCount] into m
     split m by comma
     put 0 into i
     repeat for each item x in n
       add 1 to i
       add x+m[i] to total
     end repeat
   end repeat
   put total&&"Split(Method3):"
end Method3

on Method4 --Add lines in two text arrays using split array only 
processing method
   put 0 into total
   put gTestArrayX into n --I do not split the globals directly, because 
I don't want to destroy them
   put gTestArrayY into m
   split n by cr
   split m by cr
   repeat with i=1 to 1000
     put n[i] into x
     put m[i] into y
     split x by comma
     split y by comma
     repeat with j=1 to 1000
       add x[j]+y[j] to total
     end repeat
   end repeat
   put total&&"Split(Method4):"
end Method4



More information about the use-livecode mailing list