OMG text processing performance 6.7 - 9.5

Mark Waddingham mark at livecode.com
Thu Jan 30 09:04:24 EST 2020


On 2020-01-30 13:20, Ben Rubinstein via use-livecode wrote:
> The context is that I'm finally forced to replace an app that's been
> processing data for a client for well over a decade. To date the
> standalone has been built on LC 6.7.11; but now we need to put it on a
> new platform with 64-bit database drivers. The performance has gone
> through the floor, through the floors below, through the foundations,
> and is on its way to the centre of the earth.

What's the need for 64-bit database drivers? i.e. What are you currently
using to talk to the database and why can you not continue to use a
32-bit Windows standalone?

> The first stage of the app - which retrieves a load of data from
> various databases and online sources, does minimal processing on it,
> and dumps it to cache files - is approx 2x slower. The main core of
> the app, which loads this data in and does a vast amount of processing
> on it to generate various output data and reports, has gone from 12
> minutes to over *six hours*.

I suspect it is probably a couple of things which are being done
uniformly causing the problem rather than lots of things all over the
place...

Where exactly is the data coming from? At a high level, what sorts of
operations are being performed on it? What sort of I/O is being
performed?

The main one I can think of is implicit binary<->text conversions. In
6.7 and below, binary data and text were the same thing; in 7+ they are
distinct types which require a conversion operation. The functions which
were always really returning/taking binary data now actually do.

e.g. textEncode / textDecode, compress / decompress, binaryEncode /
binaryDecode, the byte chunk, repeat for each byte, numToByte
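
As a minimal sketch of what making the conversion explicit might look
like (the handler name, the pFilePath parameter and the loop body are
hypothetical, not taken from your app): read the file as binary, decode
it to native text once, and then do all the chunk processing on the
text value. Whether this helps depends on where the implicit
conversions are actually happening in your code.

```livecode
-- Hypothetical handler: names and file layout are assumptions
on processCacheFile pFilePath
   local tBinary, tText
   -- Reading through a binfile: URL yields binary data in 7+
   put url ("binfile:" & pFilePath) into tBinary
   -- Decode once, explicitly, instead of relying on implicit
   -- binary->text conversion each time the value is used as text
   put textDecode(tBinary, "native") into tText
   repeat for each line tLine in tText
      -- text processing on tLine goes here
   end repeat
end processCacheFile
```

The same idea applies in the other direction: build the output as text
and call textEncode once just before writing it out.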

Given the app is of 6.7 vintage, it is unlikely that any of the new
Unicode text codepaths would be hit (unless there's something odd going
on somewhere), as binary data converts to native-encoded text. The
exception would be if the data is getting into the app as Unicode
strings; without knowing the exact I/O going on I can't really see how
this could happen, but I can't rule it out.

In general, native text processing (item detection, comparison,
containment and such) is all as fast, if not faster, in the post-7
engines than in 6.7, as I spent quite a while specializing a lot of
lower-level routines to make sure it was.

I do know the word chunk has been somewhat adversely affected, however,
as that was never optimized in the same way.

> The coding is gnarly - the oldest parts are probably at least 15 years
> old - and I've no doubt it could be made more efficient; but we don't
> have time or budget to rewrite it all. So, are there known gotchas,
> functions which have taken a much greater hit than others, that I
> could concentrate on to get the most ROI in speeding this up?

Given that you don't have the time or budget to touch the code in any
depth, would it not be best to avoid touching it at all and keep it in
6.7.11? i.e. Do you really need to move to 9?

Could you split the app into the bit which does the database
communication and caching (assuming that *really* needs to be 64-bit)
and the bit which does the data processing (which could remain as 32-bit
in 6.7.11)?

Note I should say that the reason I ask the above is not because of a
lack of confidence in getting your code to run as fast as it did before
but because of pure business reasoning - why spend time and money on
something which isn't necessarily really needed?

There's a difference between needing to update user-facing apps and true
back-office server apps after all (banks and insurance companies still
have software written on and running on machines which are decades old
because they work and the cost of keeping them running is vastly less
than the cost to rewrite and replace!).

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



