Fwd: Speed of control lookup (Was Re: Parent of Target)

Mark Waddingham mark at livecode.com
Fri Aug 11 11:23:41 EDT 2017


On 2017-08-11 14:57, Mark Waddingham via use-livecode wrote:
> The question of course is 'how fast could we get long id parsing to
> be' (as that is the bottleneck here, or at least appears to be).

Okay, so I got stuck on what I am *meant* to be doing, so I spent a little 
time trying to answer this question.

The results are quite encouraging...

So, my idea was this - long ids which the engine generates have a *very* 
strict structure - so write a very fast parser for them which fails as 
soon as it encounters anything it doesn't expect. If that happens, you 
just fall back to the existing (relatively slow) method, which handles 
all the edge cases.

It turns out you can hand-craft a parser for 'the long id' in about 150 
lines of C (char-bashing, one might call it, as opposed to bit-bashing). 
Assuming you restrict yourself to only looking up things on the current 
card of the target stack, and don't check group ownership (oops!), you 
end up with something which can resolve a long id in about the same time 
as if you had hand-coded the syntax. i.e.

   put "button id 1003 of group id 1004 of card id 1002 of stack " & 
quote & "MyStackName" & quote into tRuggedId
   get the id of tRuggedId

   -- has about the same speed (assuming all those objects are in the 
cache) as

   put the id of button id 1003 of group id 1004 of card id 1002 of stack 
"MyStackName"

Now, I do wonder whether the long id parsing method we currently have is 
actually using the stack-id cache everywhere it could (this code is 
somewhat old and knotty, so it is quite hard to see), so I'm not sure 
quite how fair the above comparison is with current long id parsing 
(were the latter as optimal as it potentially could be). Either way, 
there is obviously quite a significant overhead in the way chunks are 
parsed from strings (which I kind of already knew - but perhaps not just 
*how much*!).

Anyway, it is quite pleasing to know that we perhaps don't need the 
complexity of 'caching an object-handle alongside a string from which a 
long id has been parsed' (where getting the cache validation right would 
be hard to do performantly enough), as we can probably make parsing of 
long ids (or variants thereof) so quick that the time taken to do so is 
thoroughly lost in the actual operation they are being used for.

This was just an 'experiment', so I can't really say when this change 
might (if ever) appear - however, it certainly suggests a slightly less 
fiddly approach than the caching method.

It has also suggested (to me, at least) a slightly different approach - 
we make the return value of the 'long id' property of objects what you 
might call a StringyLongId value. Instead of returning a string:

   button id 1003 of group id 1004 of card id 1002 of stack "MyStackName"

It would be a value which acts like a string, but internally stores:

     [ BUTTON, 1003, [ 1004 ], 1002, "MyStackName" ]

i.e. A list of component parts - [ main chunk, main chunk id, [ group 
id, ... ], card id, stack name/filename ]
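
In C-ish terms (all names here are mine, purely to make the shape 
concrete), that stored form might look something like:

   #include <stdint.h>

   typedef enum { kChunkButton, kChunkField, kChunkGraphic /* , ... */ } ChunkType;

   typedef struct
   {
       ChunkType chunk;         /* BUTTON                             */
       uint32_t chunk_id;       /* 1003                               */
       uint32_t *group_ids;     /* [ 1004 ] - innermost first         */
       uint32_t group_count;    /* 1                                  */
       uint32_t card_id;        /* 1002                               */
       const char *stack_name;  /* "MyStackName" (or a filename)      */
   } StringyLongId;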

This has the advantage that you use a great deal less memory for long 
ids of very frequently referenced objects (filenames / names can be 
quite long!). Such values can be resolved quickly using the stack-id 
cache (and a name / filename cache for top-level stacks, if we add one); 
compared more quickly; and still be rendered as a string when doing 
stringy things to them (it would also be possible to optimize the 'word' 
chunk for them).
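
For instance, equality of two such values becomes a handful of integer 
compares plus one string compare of the stack name, rather than a 
character-by-character comparison of two full long-id strings. A 
hypothetical sketch, continuing the struct above:

   #include <stdbool.h>
   #include <string.h>

   static bool stringy_long_id_equal(const StringyLongId *a,
                                     const StringyLongId *b)
   {
       if (a->chunk != b->chunk || a->chunk_id != b->chunk_id ||
           a->card_id != b->card_id || a->group_count != b->group_count)
           return false;
       for (uint32_t i = 0; i < a->group_count; i++)
           if (a->group_ids[i] != b->group_ids[i])
               return false;
       return strcmp(a->stack_name, b->stack_name) == 0;
   }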

This has the *distinct* advantage of not needing to be validated before 
use (long ids change as objects move around in the hierarchy, or as ids 
/ names change).

Of course, this requires a bit of machinery we don't actually have (but 
which could be used for other things) - the ability for a value 
(internally) to act like a string, but not actually be a string.
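
A sketch of what that machinery might look like, continuing the struct 
above (render_long_id here is hypothetical - it would rebuild the 
familiar 'button id ... of stack ...' text from the parts):

   /* Hypothetical renderer: builds the textual long id from the parts. */
   char *render_long_id(const StringyLongId *parts);

   typedef struct
   {
       StringyLongId parts;     /* the decomposed long id               */
       char *cached_string;     /* NULL until someone asks for the text */
   } StringyLongIdValue;

   /* Only pay for string rendering when something stringy is actually
      done to the value; resolution and comparison use 'parts' directly. */
   static const char *as_string(StringyLongIdValue *v)
   {
       if (v->cached_string == NULL)
           v->cached_string = render_long_id(&v->parts);
       return v->cached_string;
   }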

On balance, the optimized long id chunk parser is probably less work (it 
only needs a bit of code in *one* place - the chunk evaluator) and is 
probably less prone to causing regressions - especially if we only use 
the optimized code path in cases where we are sure there aren't any 
'difficult' edge cases to deal with.

I suppose I should now get back to what I was meant to be doing...

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



