Cubist's first bug report

Mark Waddingham mark at livecode.com
Mon Jun 6 13:06:19 EDT 2016


Hi Quentin,

I just thought I'd sum up the situation with this bug and provide a 
little more explanation.

On 2016-06-05 16:12, Quentin Long wrote:
> ...
> This handler *should* end up generating a 16-item string of integers
> which sum to exactly 100. What it *actually does* end up generating,
> is a 16-item string of integers whose sum may or may not fall
> somewhere within the range 80-120. Not sure what the hell is going on
> here, but I am not at all happy about it. Perhaps other people might
> like to try this code on their systems, and see if it works as
> intended for them?
> 
> http://quality.livecode.com/show_bug.cgi?id=17795

This bug has the same underlying cause as 
http://quality.livecode.com/show_bug.cgi?id=7919 - although that bug was 
originally just reported against array subscript chunks, rather than 
chunks in general.

This particular issue has always been present and is a side-effect of 
how the engine currently handles chunk expressions.

If you have a command such as:

    add 1 to item random() of tVar

Then the engine will do the equivalent of the following:

    get item random() of tVar
    add 1 to it
    put it into item random() of tVar

This means that a chunk expression which contains expressions which 
cause side-effects (or use functions which do not return the same result 
for identical inputs - such as 'random()' / 'any') will not necessarily 
work how you expect. This is because those side-effect causing 
expressions will get evaluated twice.
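
Until that changes, one way to sidestep the double evaluation is to 
evaluate the index once yourself. The following is only a sketch: the 
handler name bumpRandomItem, the parameter pList and the local tIndex 
are made up for illustration, and it assumes a comma-delimited list.

```livecode
-- Hypothetical helper: adds 1 to one randomly chosen item of pList.
function bumpRandomItem pList
   local tIndex
   -- Evaluate the non-repeatable expression exactly once.
   put random(the number of items in pList) into tIndex
   -- The chunk now only refers to a plain variable, so the engine's
   -- read-then-write sequence always targets the same item.
   add 1 to item tIndex of pList
   return pList
end bumpRandomItem
```

Because tIndex is just a variable, it does not matter how many times 
the engine evaluates the chunk expression internally.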

From LiveCode 7 onwards, the original bug report case was fixed. For 
commands such as:

    add 1 to tVar[random()]

The engine does something more like:

    put random() into tIndex1
    get tVar[tIndex1]
    add 1 to it
    put it into tVar[tIndex1]

This is because we changed the way that array subscripting operations 
work when they are used as a container (i.e. read/write) so that the 
'path' to the element is only evaluated once. A side-effect of this was 
that we were able to allow array elements to be passed by-ref (to @ 
parameters), which previously was not possible.
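
As a small illustration of passing an array element by reference, 
here is a sketch that should work on engine 7 or later. The handler 
name incrementValue and the variable tCounts are invented for the 
example and do not come from the bug report.

```livecode
-- Hypothetical handler: the @ marks pValue as passed by reference.
command incrementValue @pValue
   add 1 to pValue
end incrementValue

on mouseUp
   local tCounts
   put 5 into tCounts["apples"]
   -- The array element itself is handed to the @ parameter.
   incrementValue tCounts["apples"]
   put tCounts["apples"]  -- shows 6 in the message box
end mouseUp
```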

We still need to change the way more general chunk expressions work to 
do a similar thing. However, as it is quite a large piece of work to do 
(and the behavior has always been the current way) it hasn't yet floated 
to the top of the list. Once it is done though (as a bonus) it should be 
possible to make arbitrary chunk expressions be passed by-ref like array 
elements can now.

Beyond using 'any' in a container chunk expression (which should work 
appropriately as it is part of the chunk expression itself), caution 
should always be taken when composing commands where the sub-expressions 
have overlapping side-effects:

   local sIndex

   command whatShouldThisDo sInput
     add char sIndex of line addOneToIndex() of sInput to tSum
   end whatShouldThisDo

   function addOneToIndex
     add 1 to sIndex
     return sIndex
   end addOneToIndex

Here, what range is used to compute the substring depends *entirely* on 
the order of evaluation, which is not entirely obvious. One possible 
ordering is strict left to right:

    Eval(sIndex)
    Eval(addOneToIndex())
    Eval(sInput)
    Eval(tSum)

However, the more natural ordering from the point of view of the 
operations being performed is actually:

    Eval(sInput)
    Eval(addOneToIndex())
    Eval(sIndex)
    Eval(tSum)

This is because it follows the pattern of the underlying operations 
which are actually needed:

    1. Evaluate source container
    2. Evaluate line range
    3. Evaluate char of line
    4. Evaluate number to add to

This ordering will also generally end up producing more efficient code, 
predominantly because it ensures that the values which have been 
evaluated only need to live for the minimum amount of time.

(As a side note, the way to see why this is a more 'natural' ordering 
from the point of view of code execution is to rewrite the operation in 
question in procedural form:

     add(sInput.LineOf(addOneToIndex()).CharOf(sIndex), tSum -> tSum)

Here you can see that the second ordering *is* left to right, but only 
after transforming from chunk syntax to function syntax.)

In an environment where side-effects could be completely known then 
there wouldn't really be a problem here - you could choose any 
well-defined ordering and the compiler could rearrange evaluations in 
cases where there are no side-effect problems to ensure efficiency. 
(Also, where side-effects do make things less efficient, the compiler 
could warn you about this).

Unfortunately for us though, the dynamicity of the message path means 
that it is impossible for the compiler to check side-effects 
efficiently (and thus make the check worthwhile!) in cases where you 
have any non-private handler call as a sub-expression. Thus, the order 
of evaluation is probably best chosen to be the one which generally 
provides the best performance, and sub-expressions with side-effects 
are best avoided altogether (which you can always do by storing 
expressions with side-effects into temporary variables and *then* 
passing them to a command).
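
Applied to the earlier example, the temporary-variable approach might 
look like the sketch below. The handler name addCharToSum, the by-ref 
parameter xSum and the locals tLineIndex and tCharIndex are 
assumptions made for illustration; the point is only that the order 
is spelled out by the script rather than left to the engine.

```livecode
local sIndex

function addOneToIndex
   add 1 to sIndex
   return sIndex
end addOneToIndex

-- Hypothetical rewrite with the evaluation order made explicit.
command addCharToSum pInput, @xSum
   local tLineIndex, tCharIndex
   -- Run the side-effecting call first and keep its result.
   put addOneToIndex() into tLineIndex
   -- Read sIndex afterwards; swap these two lines to get the other
   -- ordering instead.
   put sIndex into tCharIndex
   add char tCharIndex of line tLineIndex of pInput to xSum
end addCharToSum
```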

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



