UPDATE: filterArray()

Richard Gaskin ambassador at fourthworld.com
Mon May 20 15:36:05 EDT 2024


Bob Sneidar wrote:

> I’m surprised about the time differential.

I was too, but in the end it kinda makes sense.

Both LC and SQLite are complex scripting engines, where code has to be parsed and interpreted into executable machine instructions.  The one advantage of filtering LC arrays in LC Script is that it's with-the-grain, without needing to be reformatted for the expectations of a foreign system designed with no consideration of LC at all.

And never underestimate the internal complexity of RDBMS data structures.  B-trees are flexible but computationally expensive.


> I guess the main advantage for using filterArray() is that using the
> LC parser, for each use case would require it’s own code. filterArray()
> is a one-stop-shop approach. For instance, if you need to compare data
> with another dataset (as in an SQL join) or finding the maximum value
> in a dataset, or limiting your results. 

Somewhat. If I read your code correctly, it create a single table from the array, limiting the depth of the array and thereby obviating the complexities inherent in operations across multiple tables.

For table-equivalent arrays, the query field provided for LC in that example stack lets you explore the possibilities just like the SQL query field there.  The syntax is a bit different, e.g.:

   LC:  eeee contains "arbitrary"
   SQL: eeee like '%arbitrary%'

...but both are readable and about as capable as the range of options each parser provides.

One could further generalize this example on the LC side, but it's so easy to custom-code solutions in LC I try remain mindful of the tradeoffs of generalization.

LC's value function has continually surprised me over the years with how well it works as a general-purpose expression evaluation engine.


> It’s conceivable that you could write a pseudo-database system
> that uses LC arrays instead of sql, but you would have to do
> auto saves to file on that kind of database to prevent data
> loss, and that would.

The example I delivered should already be using in-memory SQLite, with the same benefits and risks as LC's arrays.

Your arrayToSQLite includes:

  IF pDBFile is empty THEN put ":memory:" into pDBFile

...and I passed no file specifier to it.

Might be fun to add saving to both and see how that works out.


> LIKE queries are also one of the slowest kinds of queries there are,
> so any query that does not use an index is going to take a longer
> time. I suppose I could add indexing but that would create more first
> time overhead. 

Substring searches are a challenge in any scripting engine. It's laudible that not only does LC do them well, but also in the worst-case scenario, excuted in the dynamic context of the value function.


> I made the assertion when I first wrote this, that for single queries,
> the sql method would not be very performant, because the overhead of
> creating the database to begin with could be significant, but if
> subsequent queries were made directly on the database, the filterArray
> method would begin to catch up, because the LC parser method cannot
> gain efficiency with iteration. I suppose the real test would be to
> see whether or not direct queries agains a memory database is faster than the LC parser. 

That's an interesting question, how performance would be affected if the translation from the LC-native form to the SQLite-native form were eliminated, to compare evaluation engines only.

Indexing is key to every data store, and I've experimented with hash-batched indices with some success over the years.  Tempting, tho I have to be mindful of my time these days, and lately reserve entertainment time for board game design. :)

But the question of isolating the effects of translating data between two systems with very different expectations seems central to this exercise: choosing storage structures per app needs, and then using the querying methods available for the chosen solution.

When associative arrays are the foundation of an app's operations, filtering them using LC-native syntax seems useful, esp, in those cases where it can deliver two orders of magnitude performance gain in about a dozen lines of code.

But of course for apps built around SQLite, there's no need to bother with LC arrays at all (except maybe for display operations like the DataGrid), since SQLite is also a very feature-rich, performant scripting engine.



>> On May 17, 2024, at 6:55 PM, Richard Gaskin wrote:
>> 
>> Thank you, Bob.
>>
>> It reminded me of a test I started a while back, and prompted me to finish the relevant part of it:
>> 
>> go url "https://fourthworld.net/lc/Array%20vs%20SQLite.livecode"
>>


Richard Gaskin
FourthWorld.com



More information about the use-livecode mailing list