Search Values of Array in "One Go"
Richard Gaskin
ambassador at fourthworld.com
Thu Aug 24 15:52:47 EDT 2017
Bob Sneidar wrote:
> I suppose thinking about it you could arrayencode an array then do a
> search on it
You probably don't want to do that. I've given a lot of thought to
methods of indexing large LSON files on disk, and I can find no method
anywhere near as practical as working with them in their native array
structure, given their linear, non-indexed LSON format.
Besides, in order to serialize an LC array it needs to be small enough
to fit into RAM to begin with, so the one thing we know about all LSON
files is that they fit into RAM nicely. :)
It's been a while since Mark Waddingham generously provided some notes
on the LSON format (for the older format, which has changed since the
introduction of Unicode in v7), but IIRC it was roughly:
0x05 -- one-byte header indicating that what follows is an array (now
0x06 in v7 and later)
<element type op-code, with 0x05/0x06 being an array>
<element name> NULL <4-byte data length indicator><element data>
There is a different op-code for numbers than for strings (IIRC 0x02 for
numbers), allowing numbers to use a more compact binary form.
I'm guessing that in addition to the new op-code indicating an array
type, the former NULL separator between element name and length UINT4
has been replaced with a preceding length byte, since of course NULLs
can be part of the Unicode string of the element name.
Nice tidy format, well suited for disk storage and network transfer, but
looking for things requires a linear search in LSON, whereas the
de-serialized native array form takes advantage of the super-quick
bucket hash to find a given key.
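To make the difference concrete, here is a rough sketch in Python (not LiveCode, and not the real LSON layout). It assumes a toy record format loosely modeled on my recollection above: name, NULL, 4-byte big-endian length, data. The names encode and find_linear are made up for the illustration. Finding a key in the serialized blob means walking every record in front of it, while the in-memory dictionary gets there in a single hashed lookup.

```python
import struct

def encode(records):
    """Serialize a dict of str -> bytes into a toy linear format:
    <name> NUL <4-byte big-endian length> <data>.
    This is an assumed layout for illustration, not actual LSON."""
    out = bytearray()
    for name, data in records.items():
        out += name.encode("ascii") + b"\x00"
        out += struct.pack(">I", len(data))
        out += data
    return bytes(out)

def find_linear(blob, wanted):
    """Walk the blob record by record until the wanted name turns up.
    Returns (data or None, number of records examined)."""
    target = wanted.encode("ascii")
    pos = 0
    steps = 0
    while pos < len(blob):
        steps += 1
        end = blob.index(b"\x00", pos)           # end of the element name
        name = blob[pos:end]
        (length,) = struct.unpack_from(">I", blob, end + 1)
        start = end + 5                          # skip NUL + 4 length bytes
        if name == target:
            return blob[start:start + length], steps
        pos = start + length                     # jump to the next record
    return None, steps

records = {f"key{i}": f"value{i}".encode("ascii") for i in range(1000)}
blob = encode(records)

# Serialized form: every record before the target has to be visited.
value, steps = find_linear(blob, "key999")

# In-memory form: one hashed lookup, no scanning.
direct = records["key999"]
```

With 1000 records and the target stored last, the scan examines all 1000 records before it finds the one the dictionary returns directly.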
Much as XQuery works on an in-memory, already-parsed form of XML,
searching associative arrays in memory will be the way to go.
So looping is both faster to execute and easier to script in array form.
The biggest challenges would be parsing the query expression, and
generalizing evaluation within the loop(s) to handle the range of query
options.
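As a rough sketch of what the looping part might look like, here is a Python illustration (again not LiveCode). It sidesteps the query-parsing problem entirely by assuming the query has already been turned into a predicate function. The name search_values and the sample data are hypothetical.

```python
def search_values(array, predicate, path=()):
    """Yield the key path of every leaf value for which predicate is true.
    Nested dicts stand in for nested arrays."""
    for key, value in array.items():
        here = path + (key,)
        if isinstance(value, dict):
            # Descend into the sub-array, carrying the path so far.
            yield from search_values(value, predicate, here)
        elif predicate(value):
            yield here

people = {
    "1": {"name": "Ann", "city": "Pasadena"},
    "2": {"name": "Bob", "city": "Denver"},
    "3": {"name": "Cy", "city": "Pasadena"},
}

# Find every element whose value is exactly "Pasadena".
matches = list(search_values(people, lambda v: v == "Pasadena"))
```

Each match comes back as a tuple of keys leading to the value, so matches here holds the paths to the two "city" elements containing "Pasadena". Everything specific to a given query lives in the predicate, which is where the generalizing work mentioned above would go.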
--
Richard Gaskin
Fourth World Systems
Software Design and Development for the Desktop, Mobile, and the Web
____________________________________________________________________
Ambassador at FourthWorld.com http://www.FourthWorld.com