having to help Rev (was: Re: Memory Leak on export png????)
Richard Gaskin
ambassador at fourthworld.com
Thu Mar 22 14:29:18 EDT 2007
Dave continued:
> I suppose my point is that in order to have a beta test, you need to
> have gone thru the steps to get there. It's no good producing
> mountains of code, doing little or no testing at the development
> phase and then throwing that out for beta.
With 20/20 hindsight it's easy to suggest that stress testing would have
found this leak, and indeed it might well have. But I don't think it
would be a cost-effective practice for RunRev to adopt across this product.
In the ten years I've been working with this engine this is the first
verified leak I've seen. Let's be generous and say that maybe one or
two others might have been discovered in that time. Even then, over a
decade that's really quite good -- and accomplished without automated
stress testing.
The export command appears to work well when run once or even a dozen
times. Unit testing should always be done, but in this case it would
pass with a clean result. Only a sustained test over a great many
iterations will expose this specific problem, and only in the Rev IDE.
The leak doesn't
exist in a standalone or in other IDEs, and since some issues may be
specific to standalones it would be necessary to run any soak tests in
at least those two environments.
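To make that distinction concrete, here is a minimal soak-test harness
sketched in Python (Rev itself would script this in Transcript; the
function name and the stand-in operation are my own invention -- the
stand-in merely allocates and discards a buffer rather than exporting a
PNG):

```python
import gc
import resource

def soak_test(operation, iterations=10_000, sample_every=1_000):
    """Run `operation` many times, sampling peak memory along the way.

    A peak that keeps climbing long after warm-up suggests a leak;
    a peak that levels off suggests the operation is clean.
    """
    samples = []
    for i in range(1, iterations + 1):
        operation()
        if i % sample_every == 0:
            gc.collect()  # discount garbage not yet collected
            samples.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return samples

# A leak-free stand-in for "export png": allocate and discard a buffer.
samples = soak_test(lambda: bytes(64 * 1024), iterations=5_000)
print(len(samples))  # 5 samples, one per 1,000 iterations
```

A unit test runs the operation once and checks its result; the harness
above only becomes informative after thousands of iterations, which is
exactly why it is so much more expensive to run.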
And because Rev supports multiple operating systems, each with differing
APIs and idiosyncrasies, each of those two tests would need to be run on
all supported OSes. Here we already have a combinatorial explosion of
test scenarios: 2 engines X 10 OSes (Win 98, Win2K, WinME, WinXP,
Vista, Mac OS 9, OS X 10.2, OS X 10.3, OS X 10.4, and at least one
Linux), and for anything that involves QuickTime multiply 9 of those
OSes by the number of supported QT versions. That's 20 tests without
QT, and just for one command.
Now let's look at what needs to be tested. With the export command
alone, its various options would call for maybe 8 test permutations.
8 permutations X 2 engines X 10 OSes = 160 tests.
With hindsight we can limit the tokens to be tested to just those which
have been worked on, but given the complexity of the engine and the
interaction between tokens we'd want to test as many tokens as possible,
ideally all of them. Looking at commands alone, not accounting for
functions and various options for each command, we have 134.
134 commands X 2 engines X 10 OSes = 2680 tests, give or take a few.
At a minimum we'd want to run each test for at least one hour....
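For anyone who wants to check that arithmetic mechanically, here's a
quick sketch in Python (the engine and OS lists are as given above; the
8 option permutations and 134 commands are the estimates from this
post):

```python
from itertools import product

engines = ["Rev IDE", "standalone"]
oses = ["Win 98", "Win2K", "WinME", "WinXP", "Vista",
        "Mac OS 9", "OS X 10.2", "OS X 10.3", "OS X 10.4", "Linux"]

one_command = len(list(product(engines, oses)))  # 2 x 10 = 20 runs
export_alone = 8 * one_command                   # 8 option sets -> 160 runs
all_commands = 134 * one_command                 # 134 commands -> 2680 runs

# At one hour per run, that's 2680 machine-hours, roughly 112
# machine-days, per release, before testing a single function.
print(one_command, export_alone, all_commands)   # 20 160 2680
```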
See where this is going?
Consider all the effort to design, build, and run these tests, release
after release, year after year, and after a decade of that we might turn
up a leak or two and maybe a relative handful of other errors of the
sort which can be exposed through that specific form of testing.
There are many methods by which a company can improve the quality of
its work, of which stress testing is only one. There's also code
review, unit testing, Beta testing, and more. Each has its strengths
and weaknesses, and each is better at exposing certain types of errors
than the others.
Choosing the right mix of QA options for a given product at a given
stage will take into account a great many factors, not the least of
which is ROI.
For a limited subset of features, stress testing can be very useful.
But it isn't practical to design, build, and run stress tests for every
feature, and I know of no vendor who does.
That leaves us with the fine art of deciding which features get
stress-tested and which ones don't. I know RunRev has stress-tested
some features, but I don't know how they decide which ones they do. I'm
pretty sure this one will be stress-tested now that it's been identified
as a leak.
Bringing this all back home, the leak you've discovered might well be
the only one of its kind in more than a decade, and within two days of
its first being reported an alternative solution was found that
achieves the same result without error. A bug report has been filed,
you're back to work, and all is well, in a fraction of the time one
might expect with similar issues in a great many products.
So looking at the minimal real-world impact of leak issues in Rev, I
wouldn't be able to arrive at a cost justification for the type of
stress testing which might have exposed this issue before Beta.
But I'll meet you halfway:
For software as complex as Rev, used in so many very different
circumstances, and appealing to a relatively slender audience (those
who enjoy programming are such a small slice of the gene pool), the
nature of the product requires a strong dependence on Beta testing.
Historically, RunRev has often had Beta test cycles shorter than
industry norms, and for a product with more complexity than average. So
while I can't justify stress testing across the product, I would suggest
that lengthening the Beta cycles to something closer to, and ideally
exceeding, industry norms would likely improve product quality well
above the cost of doing so.
> Another problem here is that people may have different ideas on what
> "Beta" means and I haven't seen it defined in terms of RunRev. One
> company I worked for defined it as meaning "Feature Complete, No
> Known Crashing Bugs".
That's the ideal, but I've never known a company whose every Beta
shipped in a state meeting that definition.
I've participated in tests for Adobe, Microsoft, Apple, Oracle, and
others who have shipped Betas while still developing new features.
That you're able to run stress tests on all features and meet 100% of
your requirements to complete satisfaction before your first Beta is quite
an accomplishment, and we look forward to URLs of the products you ship
so we can all learn how to improve the quality of our own work.
--
Richard Gaskin
Fourth World Media Corporation
___________________________________________________________
Ambassador at FourthWorld.com http://www.FourthWorld.com
More information about the use-livecode mailing list