valueDiff for arrays?

Mark Waddingham mark at livecode.com
Mon Aug 6 16:25:58 EDT 2018


On 2018-08-06 22:04, Alex Tweedly via use-livecode wrote:
> On 06/08/2018 16:50, Mark Waddingham via use-livecode wrote:
> 
>> Alex Tweedly didn't talk nonsense... Byte x [to y] of z is (truly) 
>> constant time if z is strictly a binary string.
> That's right - the basic principle wasn't nonsense - but most
> everything else in my email was :-)

Well, I spoke a bit of nonsense in the above: only byte x of z is 
constant time (if z is strictly a binary string). Byte x to y of z is 
O(n), where n = y - x.
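
To see the difference on your own machine, a rough timing sketch like 
the following might do. The handler name timeByteRange and its 
parameters are made up for illustration and are not part of the engine; 
it just repeats a byte-range read and reports the elapsed milliseconds.

```livecode
-- Hypothetical helper (not an engine function): returns the number of
-- milliseconds taken to read byte pFrom to pTo of pData, pCount times.
function timeByteRange pData, pFrom, pTo, pCount
   local tStart, tChunk
   put the milliseconds into tStart
   repeat pCount times
      put byte pFrom to pTo of pData into tChunk
   end repeat
   return the milliseconds - tStart
end timeByteRange
```

Calling it with the same pData and pCount but a wider range should, if 
the above is right, show the time growing with the width of the range, 
provided pData really is a binary string.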

Of course, the reality is that it should be possible for it to be 
constant for both: chunk ranges of things should be able to be created 
in constant time, as they can reference the thing they are chunking. 
However, the reason we have yet to do that is that ownership of the 
values needs to be 'understood' (as in modelled in the engine at 
runtime) to ensure small chunks of much larger strings don't cause the 
much larger strings to stay hanging around when they are no longer 
referenced.

e.g. You have a 20MB text file which you chunk, extracting only 1KB 
worth of small strings from it. You don't want the 20MB remaining in 
memory when it's not needed anymore. A naive implementation of 
substrings would cause that to happen, and the heap would blow up in 
size with nothing you (as the programmer) could do about it.

> I said you only needed to change two lines - in fact you need to make
> changes in 4 places.
> I said the saving was ~ 40%, in fact it's ~75-80%

I don't think anyone is going to complain where 'nonsense' is actually a 
misstatement by a factor of 2 (in the better way!) over a performance 
improvement :)

> First wrong thing - you can't just casually write into "byte X of
> tmp". If X is greater than the current length of tmp, then this simply
> appends a single byte. So you need to pre-fill the variable that holds
> the bytes.

Yes - it's the same for all 'put into <chunk>' expressions I believe. I 
can't really say whether the alternative (i.e. padding) would be more 
'correct' or not (well, not without a good deal of pondering!) - 
certainly useful though, in this particular case.

Regardless of correctness - pre-filling the binary string has a memory 
allocation advantage as well. It can be quite expensive (O(n) in fact - 
where n is the existing length) to extend a string (binary or otherwise) 
beyond its current length. So if you pre-fill it to the length you 
want, no memory allocation has to happen while you use it.
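
As a minimal sketch of pre-filling: the handler name zeroBytes is 
hypothetical, and building the buffer one byte at a time is just the 
simplest way to write it, not necessarily the fastest.

```livecode
-- Hypothetical helper: returns a binary string of pLength zero bytes.
function zeroBytes pLength
   local tBuffer
   if pLength < 1 then return empty
   -- Start from a byte value so the buffer is intended to be binary
   -- from the outset, then extend it up to the requested length.
   put numToByte(0) into tBuffer
   repeat with i = 2 to pLength
      put numToByte(0) after tBuffer
   end repeat
   return tBuffer
end zeroBytes

on mouseUp
   local tFlags
   -- All the growing happens once, up front.
   put zeroBytes(1000) into tFlags
   -- Byte 500 already exists, so this replaces it in place.
   put numToByte(66) into byte 500 of tFlags
end mouseUp
```

After the second put, tFlags should still be 1000 bytes long, with byte 
500 replaced rather than a byte appended.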

> Second wrong thing was
>   put "a" into byte X of np
> that should be
>   put numtobyte(66) into byte X of np
> 
> I don't think this *should* make much difference - but in fact it makes
> a huge difference. Not sure why - I suspect a bug ... I'll look at it
> some more and hopefully submit a bug report later.

The issue there is that you are mixing text with binary. I think the 
engine assumes (currently) in that case that you mean the target to be 
text... Although I *think* we fixed that case recently in a patch to 
appear in either 9.0.1 or 9.1.0 (maybe Ali can comment on that). If you 
explicitly say 'byte' in the target then it should end up being binary 
that you get.
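
In the meantime, one way to keep text out of the picture is to route 
writes through a small helper that always converts a number with 
numToByte. The name setByteAt is hypothetical, just a sketch:

```livecode
-- Hypothetical helper: writes pValue (0 to 255) as a single byte at
-- position pIndex of the buffer, which is passed by reference.
command setByteAt @xBuffer, pIndex, pValue
   put numToByte(pValue) into byte pIndex of xBuffer
end setByteAt
```

e.g. setByteAt np, X, 66 does the same job as the corrected line quoted 
above, without a text literal ever being involved.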

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



