Making Revolution faster with really big arrays
Dennis Brown
see3d at writeme.com
Tue Apr 12 22:20:46 EDT 2005
Thanks Brian,
I don't require random access to the data. I only need sequential
access. That is why the "repeat for each" construct works so fast: less
than a microsecond per data item. I'm not going to match that with
anything other than RAM.
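
As a rough sketch of the kind of sequential pass I mean (tData, tLine and
tTotal are hypothetical names, and summing a field only stands in for the
real processing; it assumes item 2 of every line is a number):

```livecode
-- Hypothetical example: tData holds one comma-delimited record per line.
put 0 into tTotal
repeat for each line tLine in tData
   -- placeholder work: add up the second item of each record
   add item 2 of tLine to tTotal
end repeat
```

Each pass through the loop picks up where the previous one stopped, whereas
asking for "line i of tData" inside a counted loop makes the engine scan from
the start of tData on every lookup.
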
Dennis
On Apr 12, 2005, at 10:06 PM, Brian Yennie wrote:
> Dennis,
>
> I have to agree with Pierre here. If you are looking for random access
> to many thousands of records taking up gigabytes of memory, a database
> engine is, IMO, the only logical choice.
>
> A simple MySQL/PostgreSQL/Valentina/etc database indexed by line
> number (or stock symbol) would be very fast.
>
> Without indexing your data or fitting all of it into random-access
> in-memory data structures, you're fighting a painful battle. If your
> algorithm is scaling out linearly, you'll just run too slow, and if
> your data size is doing the same you'll run out of memory. On the
> other hand, database engines can potentially handle _terabytes_ of
> data and give you random access in milliseconds. You simply won't beat
> that in Transcript.
>
> One thing you could consider if you don't want a whole database engine
> to deal with, is the feasibility of indexing the data yourself - which
> will give you some of the algorithmic benefits of a database engine.
> That is, make one pass where you store the offsets of each line in an
> index, and then use that to grab lines. Something like (untested):
>
> ## index the line starts and ends
> put 1 into lineNumber
> put 1 into charNum
> put 1 into lineStarts[1]
> repeat for each char c in tData
>    if (c = return) then
>       ## the line ends just before the return; the next starts just after
>       put (charNum - 1) into lineEnds[lineNumber]
>       put (charNum + 1) into lineStarts[lineNumber + 1]
>       add 1 to lineNumber
>    end if
>    add 1 to charNum
> end repeat
> ## charNum is now one past the last char, so close an unterminated last line
> if (c <> return) then put (charNum - 1) into lineEnds[lineNumber]
>
> ## get line x via random char access
> put char lineStarts[x] to lineEnds[x] of tData into lineX
>
> - Brian
>
>> Thanks Pierre,
>>
>> I considered that also. A database application would certainly
>> handle the amount of data, but databases are really meant for finding and
>> sorting various fields, not for doing the kind of processing I am
>> doing. The disk accessing would slow down the process.
>>
>> Dennis
>>
>> On Apr 12, 2005, at 5:27 PM, Pierre Sahores wrote:
>>