Making Revolution faster with really big arrays

Dennis Brown see3d at writeme.com
Tue Apr 12 17:36:52 EDT 2005


Thanks Pierre,

I considered that also.  A database application would certainly handle 
the amount of data, but databases are really meant for finding and 
sorting fields, not for the kind of processing I am doing.  The disk 
access would slow the process down.

Dennis

On Apr 12, 2005, at 5:27 PM, Pierre Sahores wrote:

> Welcome to the Revolution Dennis,
>
> Why not take advantage of Rev's native ability to manage the process 
> by storing the data inside an ACID database like PostgreSQL or 
> OpenBase?  That is how I would handle such amounts of data. 
> Transcript for the fast in-RAM calculations and SQL for extracting 
> the right data subsets would probably open an interesting data 
> management path for your process, in terms of both calculation speed 
> and safety.
>
> Best,
>
> On Apr 12, 2005, at 10:36 PM, Dennis Brown wrote:
>
>> Hi all,
>>
>> I just joined this list.  What a great resource for sharing ideas and 
>> getting help.
>>
>> I am actively writing a bunch of Transcript code to sequentially 
>> process some very large arrays.  I had to figure out how to handle a 
>> gig of data.  At first I tried to load the file data into a data 
>> array[X,Y,Z], but it takes a while to load and process for random 
>> access, and the structure takes a lot of extra space.  I also could 
>> never get all the data loaded without crashing Revolution and my 
>> whole system (yes, I have plenty of extra RAM).
>>
>> The scheme I ended up with is based on the fact that the only fast 
>> way I could find to process a large amount of data is with the 
>> repeat for each control structure.  I broke my data into a bunch of 
>> 10,000-line by 2,500-item arrays.  Each one holds a single data item 
>> (in this case it relates to stock market data).  That way I can 
>> process a single data item in one sequential pass through the array 
>> (usually building another array in the process).  I was impressed at 
>> how fast it goes for these 40MB files.  However, this technique only 
>> covers a subset of the operations I need to do.  The problem is that 
>> repeat for each only gives you a single item at a time to work with. 
>> In many cases, I need two or more data items available for the 
>> calculations.  I have to pull a few rabbits out of my hat and jump 
>> through a lot of hoops to do this and still go faster than a snail. 
>> That is a crying shame.  I believe (but don't know for sure) that 
>> all the primitive operations are already in the runtime to make this 
>> possible in a simple way, if we could just access them from the 
>> compiler.  So I came up with an idea for a proposed language 
>> extension.  I put the idea in Bugzilla yesterday; today I thought I 
>> should ask others whether they liked the idea, had a better idea, or 
>> could help me work around not having this feature in the meantime, 
>> since I doubt I will see it implemented in my lifetime at the speed 
>> things get addressed in the Bugzilla list.
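>>
>> To make the hoop-jumping concrete, here is a sketch (variable names 
>> are just illustrative).  The fast single-stream pattern is:
>>
>> repeat for each line X in arrayX
>>    put item 1 of X & return after tResult
>> end repeat
>>
>> but as soon as a second stream is needed, the obvious approach falls 
>> back to chunk indexing, which re-scans arrayY from the top on every 
>> pass:
>>
>> put 0 into i
>> repeat for each line X in arrayX
>>    add 1 to i
>>    put X & comma & line i of arrayY & return after tArrayXY
>> end repeat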
>>
>> The idea is to break apart the essential functional elements of the 
>> repeat for each control structure to allow more flexibility.  This 
>> sample is a bit more refined than what I posted yesterday in 
>> Bugzilla.
>>
>> The new keyword would be "access", but it could be something else.
>>
>> An example of the new syntax in use:
>>
>> access each line X in arrayX --initial setup of pointers and X value
>> access each item Y in arrayY --initial setup of pointers and Y value
>> repeat for the number of lines of arrayX times --same as repeat for each
>>    put X & comma & Y & return after ArrayXY --merged array
>>    next line X --puts the next line value in X
>>    next item Y --if arrayY has fewer elements than arrayX, empty is
>>      supplied; could also put "End of String" in the result
>> end repeat
>>
>> Another advantage of this syntax is that it allows more flexible 
>> loop structures.  You could repeat forever, then exit repeat when 
>> you run out of values (based on getting empty back).  The 
>> possibilities for high-speed sequential-access data processing are 
>> much expanded, which opens up more possibilities for Revolution.
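>>
>> In the meantime, one partial workaround sketch (untested) is to 
>> split the second variable into a numbered array first, since array 
>> element access does not re-scan the text the way "line i of" does:
>>
>> put arrayY into tTempY
>> split tTempY by return --tTempY[1], tTempY[2], ... hold the lines
>> put 0 into i
>> repeat for each line X in arrayX
>>    add 1 to i
>>    put X & comma & tTempY[i] & return after tArrayXY
>> end repeat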
>>
>> I would love to get your feedback or other ideas about solving this 
>> problem.
>>
>> Dennis
>>
>> _______________________________________________
>> use-revolution mailing list
>> use-revolution at lists.runrev.com
>> http://lists.runrev.com/mailman/listinfo/use-revolution
>>
>>
> -- 
> Bien cordialement, Pierre Sahores
>
> 100, rue de Paris
> F - 77140 Nemours
>
> psahores+ at +easynet.fr
> sc+ at +sahores-conseil.com
>
> GSM:   +33 6 03 95 77 70
> Pro:      +33 1 64 45 05 33
> Fax:      +33 1 64 45 05 33
>
> <http://www.sahores-conseil.com/>
>
> WEB/VoD/ACID-DB services over IP
> "Mutualiser les deltas de productivité"
>
>
>


