Making Revolution faster using dimensioned arrays

Dennis Brown see3d at writeme.com
Mon Jul 4 22:33:15 EDT 2005


To all the speed freaks,

I know that I have pushed for faster array processing and have even  
proposed an "Array sub-processor" as a possible solution.  However,  
after giving this much thought, I don't believe that a separate array  
sub-processor is needed to solve the problem of speed.  Transcript  
could provide the speed needed for processing arrays in a simpler way.

If Transcript had an array declaration command that allowed the user  
to fix the dimensions and data size/type for an array and limited the  
"keys" to integer indexes, then high speed array processing could be  
built-in.

Normally it would be a pain to have to dim all arrays.  Arrays in  
Transcript are sparsely populated and very general now --and could be  
further extended to make them even better for many general purpose  
solutions.  However, there is a class of problems that need speed and  
generally lend themselves to fixing the type and size.  These  
problems are quite common, but rarely seen implemented in Transcript  
because of the slow array processing speed.

The key is that once an array is fixed in size and type, the runtime  
does not need to do extensive checking of the data to determine the  
operation code, and the keys do not have to be looked up in a hash  
table to find the location of the element --just use the indexes  
directly.  The compiler can then generate the code for a much simpler  
runtime operator.  These declared arrays would run 10 to 100 times  
faster than the type-less ones for the kind of operations that are  
required by image processing, AI, statistics, and many other  
applications.  This is the main reason array processing runs so much  
faster in other languages.

Having fast array structures then opens up the possibility for a  
larger set of array operators or even an efficient way to pass the  
array pointers directly to an external.  The links between the rest  
of the data structures in Transcript and these dimensioned arrays  
become seamless.

Conceptually, the implementation of a dimensioned array would be as  
if it were a single string variable.  In other words, just a  
contiguous chunk of memory that you can index into based on the  
supplied dimensions.  We already have the concept of a pointer in  
Transcript.  When we say "get char (expression) of variable",  
"(expression)"  is a pointer.  That expression can be the index into  
a conceptual n dimensional character array.  However, at present we  
are limited to directly accessing only a single byte.  We could say   
"get char (expression) to (expression+4) of variable", to get 32 bits  
of information, but even if we know that it is supposed to represent  
a 32 bit floating point number, we have no way of telling Transcript  
that, so it will just hand us back 4 characters that could at best be  
translated into a 4 digit (including decimal point) number like  
3.14.  However, if we told the compiler that we want our variable to  
have 1000 rows by 1000 columns of  4 byte words, then the "get  
variable[25,64]" could get a 4 byte string of characters in lightning  
speed.  We could say that If we initialized our array with strings,  
we would get strings back.  If we initialized it with numbers, we  
would get numbers back --which is very Transcript like, or we could  
just specify the type.  The point is that a single dimensioned array,  
like a single variable, can hold only one data type at a time.

The advantage of such an arrangement is not only speed, but also  
compactness.  If we are interested in speed it is likely that we have  
a large quantity of data.  Arrays in Transcript today have a large  
overhead due to the key structure.  Dimensioned arrays could easily  
take half the space or even less.  The syntax could be very simple like:

global myArray[1000,1000]=0 --numeric array
local myArray[1000,1000,5]=false --Boolean array
local myArray[1000]="" --empty character array
local myArray[100,300]="abcd" --4 character word array

Dennis



More information about the use-livecode mailing list