Making Revolution faster with really big arrays
Dennis Brown
see3d at writeme.com
Tue Apr 12 17:36:52 EDT 2005
Thanks Pierre,
I considered that also. A database application would certainly handle
the amount of data, but databases are really meant for finding and
sorting various fields, not for the kind of processing I am doing. The
disk access would slow down the process.
Dennis
On Apr 12, 2005, at 5:27 PM, Pierre Sahores wrote:
> Welcome to the Revolution Dennis,
>
> Why not take advantage of Revolution's native ability to manage the
> process by storing the data inside an ACID database such as PostgreSQL
> or OpenBase? That is how I would handle such amounts of data myself.
> Transcript for the fast in-RAM calculations and SQL for extracting the
> right data subsets would probably open up an interesting approach to
> data management for your process, in terms of both calculation speed
> and safety.
>
> Best,
>
> Le 12 avr. 05, à 22:36, Dennis Brown a écrit :
>
>> Hi all,
>>
>> I just joined this list. What a great resource for sharing ideas and
>> getting help.
>>
>> I am actively writing a bunch of Transcript code to sequentially
>> process some very large arrays. I had to figure out how to handle a
>> gig of data. At first I tried to load the file data into a data
>> array[X,Y,Z], but it takes a while to load and process for random
>> access, and the structure takes a lot of extra space. I also could
>> never get all the data loaded without crashing Revolution and my
>> whole system (yes, I have plenty of extra RAM).
>>
>> The scheme I ended up with is based on the fact that the only fast
>> way I could find to process a large amount of data is with the repeat
>> for each control structure. I broke my data into a bunch of 10,000
>> line by 2500 item arrays. Each one holds a single data item (in this
>> case it relates to stock market data). That way I can process a
>> single data item in one sequential pass through the array (usually
>> building another array in the process). I was impressed at how fast
>> it goes for these 40MB files. However, this technique only covers a
>> subset of the types of operations I need to do. The problem is that
>> you can only specify a single data item at a time to work with in a
>> repeat for each. In many cases, I need to have two or more data items
>> available for the calculations. I have to pull a few rabbits out of
>> my hat and jump through a lot of hoops to do this and still go faster
>> than a snail. That is a crying shame. I believe (but don't know for
>> sure) that all the primitive operations needed to do this in a simple
>> way are already in the runtime, if we could just access them from the
>> compiler. So I came up with an idea for a proposed language
>> extension. I put the idea in Bugzilla yesterday; then today, I
>> thought I should ask others if they liked the idea, had a better
>> idea, or could help me work around not having this feature in the
>> meantime, since I doubt I will see it implemented in my lifetime
>> given the speed at which things get addressed in the Bugzilla
>> list.
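>>
>> The single-pass pattern above could be sketched like this in Python
>> rather than Transcript (a hypothetical illustration; the numbers are
>> made up, not my actual market data):

```python
# One sequential pass over a price series, building a derived array
# (day-to-day change) as we go -- the same shape as a "repeat for each"
# pass that builds another array while it runs.
rows = ["100.5", "101.5", "99.5"]  # stands in for one 10,000-line column

derived = []
prev = None
for value in rows:                 # analogous to "repeat for each line"
    price = float(value)
    derived.append(price - prev if prev is not None else 0.0)
    prev = price

print(derived)                     # [0.0, 1.0, -2.0]
```

>> The key property is that the loop never indexes into the data by line
>> number; it only walks forward, which is what makes repeat for each fast.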
>>
>> The idea is to break apart the essential functional elements of the
>> repeat for each control structure to allow more flexibility. This
>> sample has a bit more refinement than what I posted yesterday in
>> Bugzilla.
>>
>> The new keyword would be "access", but it could be something else.
>>
>> An example of the new keyword syntax in use would be:
>>
>> access each line X in arrayX  -- initial setup of pointers and X value
>> access each item Y in arrayY  -- initial setup of pointers and Y value
>> repeat for the number of lines of arrayX times  -- same as a repeat for each
>>   put X & comma & Y & return after arrayXY  -- merged array
>>   next line X  -- puts the next line value in X
>>   next item Y  -- if arrayY has fewer elements than arrayX, then empty
>>     is supplied; could also put "End of String" in the result
>> end repeat
>>
>> Another advantage of this syntax is that it provides more flexibility
>> in loop structure. You could repeat forever, then exit repeat when you
>> run out of values (based on getting empty back). The possibilities
>> for high-speed sequential-access data processing are much expanded,
>> which opens up more possibilities for Revolution.
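>>
>> For what it's worth, the access/next decomposition maps directly onto
>> explicit iterators in a language like Python (again a hypothetical
>> sketch of the proposed semantics, not Transcript):

```python
# Hypothetical Python equivalent of the proposed access/next keywords:
# "access each" creates an iterator, "next" advances it, and empty is
# supplied once the shorter array runs out of elements.
array_x = ["a1", "a2", "a3"]
array_y = ["b1", "b2"]            # deliberately shorter than array_x

it_x = iter(array_x)              # access each line X in arrayX
it_y = iter(array_y)              # access each item Y in arrayY

merged = []
for _ in range(len(array_x)):     # repeat for the number of lines of arrayX
    x = next(it_x, "")            # next line X
    y = next(it_y, "")            # next item Y (empty when exhausted)
    merged.append(x + "," + y)    # merged array

print(merged)                     # ['a1,b1', 'a2,b2', 'a3,']
```

>> The only difference from the proposal is that this advances both
>> iterators at the top of the loop body instead of the bottom; the
>> sequence of (X, Y) pairs seen by the body is the same.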
>>
>> I would love to get your feedback or other ideas about solving this
>> problem.
>>
>> Dennis
>>
>> _______________________________________________
>> use-revolution mailing list
>> use-revolution at lists.runrev.com
>> http://lists.runrev.com/mailman/listinfo/use-revolution
>>
>>
> --
> Bien cordialement, Pierre Sahores
>
> 100, rue de Paris
> F - 77140 Nemours
>
> psahores+ at +easynet.fr
> sc+ at +sahores-conseil.com
>
> GSM: +33 6 03 95 77 70
> Pro: +33 1 64 45 05 33
> Fax: +33 1 64 45 05 33
>
> <http://www.sahores-conseil.com/>
>
> WEB/VoD/ACID-DB services over IP
> "Mutualiser les deltas de productivité"
>
>