Accumulating text is *VERY* slow in LC9 on Windows
Alex Tweedly
alex at tweedly.net
Wed Aug 25 14:48:48 EDT 2021
Crazy idea - totally untried .... (sorry, I don't have a Win machine)
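   -- collect each line in a numerically keyed array instead of appending to a string
   put 1 into tLineCount
   repeat for each line tRow in tWorkTable
      put tRow into tNewTable[tLineCount]
      add 1 to tLineCount
   end repeat
   -- join the array elements back into return-delimited text in one step
   combine tNewTable using CR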
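
To see whether the array approach makes any difference, both variants could be timed side by side on the same data. What follows is only a sketch and, like the idea itself, untested: the handler name compareAccumulation and its parameter pWorkTable are made up for illustration. It also assumes that combine returns the elements in numeric key order, which is worth verifying on the version in use before relying on the output.

```livecode
-- Hypothetical timing harness; compareAccumulation and pWorkTable are
-- illustrative names, not part of the original report.
command compareAccumulation pWorkTable
   local tRow, tNewTable, tArray, tLineCount
   local tStart, tAppendTime, tArrayTime

   -- Approach 1: append each line to a growing string (the slow case reported)
   put the milliseconds into tStart
   repeat for each line tRow in pWorkTable
      put tRow & return after tNewTable
   end repeat
   put the milliseconds - tStart into tAppendTime

   -- Approach 2: store each line in an array, then combine once at the end
   put 1 into tLineCount
   put the milliseconds into tStart
   repeat for each line tRow in pWorkTable
      put tRow into tArray[tLineCount]
      add 1 to tLineCount
   end repeat
   -- Assumption: elements come back in numeric key order; check this
   combine tArray using return
   put the milliseconds - tStart into tArrayTime

   -- Report both timings in the message box
   put "append:" && tAppendTime && "ms" & return & \
         "array + combine:" && tArrayTime && "ms"
end compareAccumulation
```

Calling it with the table already loaded into a variable (for example, compareAccumulation tWorkTable) on both LC6 and LC9 on Windows should show whether the slowdown is specific to "put ... after".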
Alex.
On 25/08/2021 18:15, Ben Rubinstein via use-livecode wrote:
>
> Some 20 months ago, I reported that I was in a situation where an app
> written in 6.7 needed to be updated to access 64-bit drivers, which
> meant updating to 9.5 - which showed a horrifying increase in
> processing time.
>
> In fact I was able to put off the evil day - but now it has returned,
> and can be put off no longer. A process that normally takes 2 hours is
> currently taking 9. The core processing stage has gone from around ten
> minutes to over six hours.
>
> After way too long, I've finally got down to at least one smoking gun;
> which is as simple as can be.
>
> Part of what took me so long is a confusion; in production the process
> runs on Windows, but I develop on Mac. Although on Mac the overall
> process does take about a third longer in LC9 than LC6, the simple
> tests I've finally isolated actually run much _quicker_ in LC9 than
> LC6. So switching between LC6 and LC9 on Mac as I tried to isolate the
> issue was giving confusing signals. But unmistakeably it's *much*
> slower on Windows.
>
> A simple routine which loops over a load of tab and return formatted
> data loaded from a TSV file, to truncate a particular field, had the
> following results processing a 70MB file of approximately 257,000 rows:
>
>     6.7.11   MacOS     9 seconds
>     6.7.11   Win32    10 seconds
>     9.6.3    MacOS     2 seconds
>     9.6.3    Win32   498 seconds
>
> I simplified it down to this (pointless) loop which just rebuilds a
> table one line at a time:
>
>     local tNewTable
>     repeat for each line tRow in tWorkTable
>        put tRow & return after tNewTable
>     end repeat
>
> with these results:
>
>     6.7.11   MacOS     8 seconds
>     6.7.11   Win32     7 seconds
>     9.6.3    MacOS     0 seconds
>     9.6.3    Win32   591 seconds
>
> (there's obviously a lot of variability in these - both were running
> in IDE, on a logged-in computer, so stuff was probably going on in the
> background; but I know the overall effect is similar when built as
> standalone and running by schedule on an unattended machine. But the
> key thing is: for this task, LC9 is dramatically slower on Windows!)
>
> Have others seen something like this?
>
> When I posted about this before (thread: "OMG text processing
> performance 6.7 - 9.5") Mark Waddingham suggested that it might be to
> do with a hidden cost of binary<->text transforms. That makes some
> sense; but given that the text already exists, I'm wondering whether
> taking a line out of text would cause it to be transformed, only to be
> transformed again when appending? And in particular, why this would
> affect Windows only.
>
> I have also added tests using "is strictly a binary string" in the
> code above, and this was true for neither input 'tWorkTable', nor the
> output 'tNewTable', nor any of the 257,000 extracted lines.
>
> However it is definitely the accumulating of text that is the issue -
> simply looping over the lines - even with testing each one to see if
> it is "strictly a binary string" - is a second or less on Windows in LC9.
>
> Has anyone had similar experiences? Suggestions for how this could be
> avoided?
>
> Many thanks in advance,
>
> Ben
>
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode