Accumulating text is *VERY* slow in LC9 on Windows
Ben Rubinstein
benr_mc at cogapp.com
Thu Aug 26 11:39:47 EDT 2021
Thank you Mark!
Using a buffer indeed essentially solves the problem (at least for this example!).
Because I may have to use this in a bunch of places, I made my life easier by
putting it in a command using variables passed by reference. Because I'm
almost invariably appending a line at a time, and usually have a row counter
going anyway, I'm deciding when to dump the buffer based on number of lines
rather than length of the buffer (which may occasionally mean the buffer has
to be resized up).
    command appendRow tRow, @iRowCounter, @tBuffer, @tOutput
       put tRow & return after tBuffer
       add 1 to iRowCounter
       if (iRowCounter mod siBuffSize) = 0 then
          put tBuffer after tOutput
          if sbDeleteBuffer then
             delete codeunit 1 to -1 of tBuffer
          else
             put empty into tBuffer
          end if
       end if
    end appendRow
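For completeness, here is a minimal sketch of how the command might be called. The handler name rebuildTable and the value 2500 are purely illustrative assumptions, and siBuffSize and sbDeleteBuffer are assumed to be script locals. The point it shows is the final flush after the loop, which is needed whenever the row count is not an exact multiple of the buffer size.

```livecode
-- Assumed script locals read by appendRow
local siBuffSize, sbDeleteBuffer

-- Hypothetical caller: rebuilds pWorkTable into rOutput one line at a time
command rebuildTable pWorkTable, @rOutput
   local tRow, tRowCounter, tBuffer
   put 2500 into siBuffSize        -- illustrative value; tune for your data
   put true into sbDeleteBuffer    -- use "delete codeunit" rather than "put empty"
   put 0 into tRowCounter
   put empty into rOutput
   repeat for each line tRow in pWorkTable
      appendRow tRow, tRowCounter, tBuffer, rOutput
   end repeat
   -- Flush any rows still in the buffer; harmless if the buffer is already empty
   put tBuffer after rOutput
end rebuildTable
```

If the last call to appendRow happened to dump the buffer, tBuffer is empty at this point and the final line does nothing.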
I initially wrote this emptying the buffer just by putting empty into it. Then
I borrowed your line of deleting the contents, which I presume makes it more
likely to be retained in the heap.
In general, inline is a bit faster than using the command, and deleting the
codeunits of the buffer is a bit faster than emptying it, but it's not
completely clear cut. (This is one set of runs, not an average of repeated ones.)
                 command   command   inline   inline
    buffer size  empty     delete    empty    delete
    500 lines    33        31        29       29
    1500 lines   16        13        7        14
    2500 lines   14        10        7        11
    4000 lines   18        11        8        9
    5000 lines   19        7         11       9
(All times in seconds on Windows, LC 9.6.3.)
And given that not using a buffer takes 589 seconds, none of the above
variations really matter!
So my remaining question for you Mark is this:
> P.S. This is an engine issue - we'll need to look into why there's such a
> difference with 6.7 - as, I'm pretty sure I kept the rules about extending
> buffers pretty much the same in string concatenation.
Do you have a sense of (a) how likely it is you'll have an "aha!" moment and
(b) when you might have a chance to look at this? I'm just asking
purely selfishly, because if it's soon I might get away without having to
address my 5,000 line code jungle nightmare...!
Many thanks for this tip-off,
Ben
On 26/08/2021 12:23, Mark Waddingham via use-livecode wrote:
> On 2021-08-25 18:15, Ben Rubinstein via use-livecode wrote:
>> I simplified it down to this (pointless) loop which just rebuilds a
>> table one line at a time:
>>
>>    local tNewTable
>>    repeat for each line tRow in tWorkTable
>>       put tRow & return after tNewTable
>>    end repeat
>>
>> with these results:
>>
>> 6.7.11 MacOS 8 seconds
>> 6.7.11 Win32 7 seconds
>> 9.6.3 MacOS 0 seconds
>> 9.6.3 Win32 591 seconds
>
> Using a buffer var should work around the performance issue (which is related
> to the Windows heap manager not being very good at continually re-extending a
> buffer):
>
> on mouseUp
>    local tLine
>    repeat 256 times
>       put "*" after tLine
>    end repeat
>
>    local tTime
>    put the millisecs into tTime
>
>    local tBuffer
>
>    local tText
>    repeat 257000 times
>       put tLine & return after tBuffer
>       if the number of codeunits in tBuffer > 500000 then
>          put tBuffer after tText
>          delete codeunit 1 to -1 of tBuffer
>       end if
>    end repeat
>    put tBuffer after tText
>    answer (the number of codeunits in tText) & return & (the millisecs - tTime)
> end mouseUp
>
> In the original loop, tNewTable is continually extended internally, something
> which appears to cause O(n^2) performance on Windows.
>
> In the revised loop, an intermediate buffer var is used which (after first
> time getting 'full') will have a backing store of 500k ish - meaning tText is
> extended much less often. (Playing with the value of 500000 up or down will
> affect the resulting speed - there will always be a sweet spot).
>
> On my Windows VM, the above loop (which generates about 68 MB of text or so)
> takes about 3s.
>
> Hope this helps!
>
> Mark.
>
> P.S. This is an engine issue - we'll need to look into why there's such a
> difference with 6.7 - as, I'm pretty sure I kept the rules about extending
> buffers pretty much the same in string concatenation.
>