Making Revolution faster with really big arrays

Dennis Brown see3d at writeme.com
Tue Apr 12 22:20:46 EDT 2005


Thanks Brian,

I don't require random access to the data.  I only need sequential 
access.  That is why the repeat for each loop works so fast: less 
than a microsecond per data item.  I'm not going to match that with 
anything other than RAM.
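
To illustrate the difference, here is an untested sketch.  The handler 
names sumSequential and sumIndexed, and the assumption that tData holds 
one number per line, are made up for this example and are not the real 
data layout:

```livecode
-- Sequential pass: the engine walks tData once and hands over each line in turn.
-- Assumes every line of tData holds a single number (illustrative only).
function sumSequential tData
   local tTotal
   put 0 into tTotal
   repeat for each line tLine in tData
      add tLine to tTotal
   end repeat
   return tTotal
end sumSequential

-- Indexed pass, for comparison: "line i of tData" is located by counting
-- lines from the start of the text on every iteration.
function sumIndexed tData
   local tTotal
   put 0 into tTotal
   repeat with i = 1 to the number of lines of tData
      add line i of tData to tTotal
   end repeat
   return tTotal
end sumIndexed
```

Both handlers should return the same total.  The first reads each line 
once; the second recounts lines from the top of tData on every pass, so 
it slows down sharply as the data grows.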

Dennis

On Apr 12, 2005, at 10:06 PM, Brian Yennie wrote:

> Dennis,
>
> I have to agree with Pierre here. If you are looking for random access 
> to many thousands of records taking up gigabytes of memory, a database 
> engine is, IMO, the only logical choice.
>
> A simple MySQL/PostgreSQL/Valentina/etc database indexed by line 
> number (or stock symbol) would be very fast.
>
> Without indexing your data or fitting all of it into random-access 
> in-memory data structures, you're fighting a painful battle. If your 
> algorithm is scaling out linearly, you'll just run too slow, and if 
> your data size is doing the same you'll run out of memory. On the 
> other hand, database engines can potentially handle _terabytes_ of 
> data and give you random access in milliseconds. You simply won't beat 
> that in Transcript.
>
> One thing you could consider, if you don't want a whole database 
> engine to deal with, is indexing the data yourself, which will give 
> you some of the algorithmic benefits of a database engine. That is, 
> make one pass where you store the offsets of each line in an index, 
> and then use that to grab lines. Something like (untested):
>
> ## index the line starts and ends
> put 1 into lineNumber
> put 1 into charNum
> put 1 into lineStarts[1]
> repeat for each char c in tData
>     if (c = return) then
>        put (charNum - 1) into lineEnds[lineNumber]
>        put (charNum + 1) into lineStarts[lineNumber + 1]
>        add 1 to lineNumber
>     end if
>     add 1 to charNum
> end repeat
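> ## close the final line if the data does not end with a return;
> ## charNum is now one past the last char, so step back by one
> if (the last char of tData <> return) then put (charNum - 1) into lineEnds[lineNumber]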
>
> ## get line x via random char access
> put char lineStarts[x] to lineEnds[x] of tData into lineX
>
> - Brian
>
>> Thanks Pierre,
>>
>> I considered that also.  Database applications would certainly 
>> handle the amount of data, but they are really meant for finding and 
>> sorting various fields, not for doing the kind of processing I am 
>> doing.  The disk accessing would slow down the process.
>>
>> Dennis
>>
>> On Apr 12, 2005, at 5:27 PM, Pierre Sahores wrote:
>>


