large amounts of data and cgi

Andre Garzia andre at andregarzia.com
Tue Jan 15 19:26:31 EST 2008


Bernard,

Have you tried Valentina? It is very fast!!! :-D

cheers
andre

On 1/15/08, Bernard Devlin <bdrunrev at gmail.com> wrote:
> Part of a web application I'm working on requires a large number of
> queries joining several tables, each containing millions of rows.
> Accessing the database (Frontbase) using either the JDBC drivers or
> the Rev drivers provided by Frontbase resulted in single queries that
> took anywhere between 300 and 900ms. As there could be hundreds of
> such queries to produce one response, I had to look at ways to speed
> this up. I don't believe it is a question of writing better SQL - the
> queries have to spider across the data, keeping track of the route
> they take to get to each node in the graph.
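>
> For the curious, the queries were issued through the Rev database
> library along these lines -- a minimal sketch, where the driver name,
> connection details and the SQL itself are hypothetical placeholders:
>
>   put revOpenDatabase("frontbase", "localhost", "mydb", \
>         "user", "pass") into tConnID
>   put the milliseconds into tStart
>   -- a stand-in for one of the graph-spidering queries
>   put revDataFromQuery(tab, return, tConnID, \
>         "SELECT to_node FROM edges WHERE from_node = 42") into tRows
>   put the milliseconds - tStart into tElapsed  -- typically 300-900ms
>   revCloseDatabase tConnID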
>
> I tuned the database, making sure that all the 'where' columns were
> indexed, and checked the query plan to confirm that precisely those
> indexes were being used by the queries. I gave the database 100MB of
> RAM cache, and even tried loading the core tables 100% into RAM. But
> I still could not get much better performance. From my testing, the
> data did not seem to be found in RAM until after the first time it
> was retrieved from disk, despite my setting the tables to pre-load
> into RAM. Perhaps over time RAM-cached data would have shown much
> better performance than at this stage.
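>
> The indexing itself was nothing exotic -- something along these
> lines, issued through revdb (a sketch with hypothetical table and
> column names; Frontbase's RAM pre-loading uses its own syntax, which
> I have left out):
>
>   revExecuteSQL tConnID, "CREATE INDEX idx_edges_from ON edges (from_node)"
>   revExecuteSQL tConnID, "CREATE INDEX idx_edges_to ON edges (to_node)"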
>
> I decided to try HSQLDB - an RDBMS written in Java which can hold all
> its tables in RAM, so I could be sure that the data was in RAM. After
> exporting the core tables to CSV format (and transforming them into
> SQL INSERT statements using Rev), it took about 1 hour to start the
> database, and once started the database took 1.5GB of RAM. But the
> queries were still slow (my guess is that HSQLDB is simply not
> designed to take that much data).
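>
> The CSV-to-INSERT transformation was only a few lines of Rev script --
> roughly this sketch, assuming a hypothetical two-column table and
> clean comma-delimited data with no embedded quotes:
>
>   set the itemDelimiter to comma
>   repeat for each line tLine in tCSVData
>      put "INSERT INTO nodes VALUES ('" & item 1 of tLine & \
>            "', '" & item 2 of tLine & "');" & return after tSQL
>   end repeat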
>
> So, I took the CSV data and imported it into a stack, turning each
> table into a custom property set. I could now search the data in a
> tiny fraction of a second -- literally. The hundreds of queries
> involved in making a single response complete in 200ms. That's
> right -- I can now run the entire series of queries in less time than
> it took to do a single query against the RDBMS.
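>
> In outline, it works like this -- a minimal sketch, assuming a stack
> named "DataStore" and a table whose first item on each line is the
> lookup key (the real keys and property set names differ):
>
>   put URL "file:nodes.csv" into tData
>   split tData by return and comma  -- key = item 1, value = the rest
>   set the customProperties["cNodes"] of stack "DataStore" to tData
>
>   -- each lookup is then a single in-memory property access
>   put the cNodes["someKey"] of stack "DataStore" into tRow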
>
> Obviously, custom properties used in this way are not appropriate if
> one needs referential integrity, locking, concurrent access, etc. But
> for read-only queries, Rev has provided a very satisfactory solution.
> Moreover, whilst HSQLDB took an hour and 1.5GB of RAM to start, Rev
> can load the 100MB of CSV data in 2-4 seconds, making frequent
> updates of the data and subsequent restarts of the application
> feasible.
>
> After following Pierre Sahores' tutorial on creating persistent Rev
> CGI processes, I now have these queries returning a web page in
> 200ms, instead of the 30 seconds they would have taken using Java and
> a database. I know all of this flies in the face of many people's
> expectations, and it is certainly not the position I saw myself in
> 2 weeks ago. The rest of the application is written in Java, but this
> core feature has been made possible by Rev. I expect that associative
> arrays in e.g. PHP might be just as fast in this regard.
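>
> The persistent process itself comes down to Rev's socket commands --
> a bare-bones sketch, not Pierre's actual code, with a hypothetical
> port and a hypothetical response handler:
>
>   on startup
>      accept connections on port 8080 with message "clientConnected"
>   end startup
>
>   on clientConnected pSocket
>      read from socket pSocket until CRLF    -- the request line
>      put buildResponse(it) into tReply      -- hypothetical handler
>      write tReply to socket pSocket
>      close socket pSocket
>   end clientConnected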
>
> Bernard
> _______________________________________________
> use-revolution mailing list
> use-revolution at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-revolution
>


-- 
http://www.andregarzia.com All We Do Is Code.


