having to help Rev (was: Re: Memory Leak on export png????)

Richard Gaskin ambassador at fourthworld.com
Thu Mar 22 14:29:18 EDT 2007


Dave continued:

> I suppose my point is that in order to have a beta test, you need to  
> have gone thru the steps to get there. It's no good producing  
> mountains of code, doing little or no testing at the development  
> phase and then throwing that out for beta.

With 20/20 hindsight it's easy to suggest that stress testing would have 
found this leak, and indeed it might well have.  But I don't think it 
would be a cost-effective practice for RunRev to adopt across this product.

In the ten years I've been working with this engine, this is the first
verified leak I've seen.  Let's be generous and say that maybe one or 
two others might have been discovered in that time.  Even then, over a 
decade that's really quite good -- and accomplished without automated 
stress testing.

The export command appears to work well when run once or even a dozen 
times.  Unit testing should always be done, and in this case would yield 
a good result.  Only a sustained test with a great many iterations will 
expose this specific problem, and only in the Rev IDE.  The leak doesn't 
exist in a standalone or in other IDEs, and since some issues may be 
specific to standalones it would be necessary to run any soak tests in 
at least those two environments.
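A soak test of this sort is conceptually simple: run the operation under test a great many times and watch whether process memory keeps climbing instead of leveling off. As a rough sketch (the `operation` and `sample_memory` callables here are hypothetical stand-ins for "run the export command" and "read process memory" — not Rev APIs):

```python
def memory_soak(operation, sample_memory, iterations=10_000, check_every=1_000):
    """Run `operation` repeatedly, sampling memory use periodically.

    Returns the list of samples; a steady upward trend across samples,
    rather than a flat plateau, suggests a leak.
    """
    samples = []
    for i in range(iterations):
        operation()
        if i % check_every == 0:
            samples.append(sample_memory())
    return samples


def looks_leaky(samples):
    """Crude heuristic: every sample is strictly higher than the last."""
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))
```

The point of the harness is exactly what's described above: a unit test running the command once or a dozen times passes cleanly, and only the long, boring tail of iterations separates a plateau from a leak.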

And because Rev supports multiple operating systems, each with differing 
APIs and idiosyncrasies, each of those two tests would need to be run on 
all supported OSes.  Here we already have a combinatorial explosion of 
test scenarios:  2 engines X Win 98, Win2K, WinME, WinXP, Vista, Mac OS 
9, OS X 10.2, OS X 10.3, OS X 10.4, at least one Linux, and for anything 
that involves QT multiply 9 of those by the number of supported QT 
versions.  That's 20 tests without QT, and just for one command.

Now let's look at what needs to be tested.  With the export command 
alone, its various options would introduce maybe 8 tests.
8 permutations X 2 engines X 10 OSes = 160 tests.

With hindsight we can limit the tokens to be tested to just those which 
have been worked on, but given the complexity of the engine and the 
interaction between tokens we'd want to test as many tokens as possible, 
ideally all of them.  Looking at commands alone, not accounting for 
functions and various options for each command, we have 134.

134 commands X 2 engines X 10 OSes = 2680 tests, give or take a few.

At a minimum we'd want to run each test for at least one hour....
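The arithmetic above is easy to check, and extending it to machine time makes the point even more starkly (assuming, as above, one hour per soak test):

```python
# Test-matrix sizes from the figures above.
engines = 2           # IDE and standalone
oses = 10             # Win 98/2K/ME/XP/Vista, OS 9, OS X 10.2-10.4, one Linux

export_permutations = 8
commands = 134

export_tests = export_permutations * engines * oses   # 160, for one command
command_tests = commands * engines * oses             # 2680, commands only

# At one hour per test, that's months of machine time per release.
soak_hours = command_tests * 1
soak_days = soak_hours / 24                           # roughly 112 days
```

And that still ignores functions, command options, and the QuickTime-version multiplier.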

See where this is going?

Consider all the effort to design, build, and run these tests, release 
after release, year after year, and after a decade of that we might turn 
up a leak or two and maybe a relative handful of other errors of the 
sort which can be exposed through that specific form of testing.


There are many methods by which a company can improve the quality of 
its work, of which stress testing is only one.  There's also code 
review, unit testing, Beta testing, and more.  Each has its strengths 
and weaknesses, and each is better at exposing certain types of errors 
than the others.

Choosing the right mix of QA options for a given product at a given 
stage means weighing a great many factors, not the least of which is ROI.

For a limited subset of features, stress testing can be very useful. 
But it isn't practical to design, build, and run stress tests for every 
feature, and I know of no vendor who does.

That leaves us with the fine art of deciding which features get 
stress-tested and which ones don't.  I know RunRev has stress-tested 
some features, but I don't know how they decide which ones they do.  I'm 
pretty sure this one will be stress-tested now that it's been identified 
as a leak.


Bringing this all back home, the leak you've discovered may well be 
the only one of its kind in more than a decade, and within two days of 
its first being reported an alternative solution was found that 
achieves the same result without error.  A bug report has been filed, 
you're back to work, and all is well, in a fraction of the time one 
might expect with similar issues in a great many products.

So looking at the minimal real-world impact of leak issues in Rev, I 
wouldn't be able to arrive at a cost justification for the type of 
stress testing which might have exposed this issue before Beta.

But I'll meet you halfway:

For software as complex as Rev, which is used in so many very 
different circumstances and appeals to a relatively slender audience 
(those who enjoy programming are such a small slice of the gene pool), 
the nature of the product will require a strong dependence on Beta 
testing.

Historically, RunRev has often had Beta test cycles shorter than 
industry norms, and for a product with more complexity than average.  So 
while I can't justify stress testing across the product, I would suggest 
that lengthening the Beta cycles to something closer to, and ideally 
exceeding, industry norms would likely improve product quality well 
above the cost of doing so.


> Another problem here is that people may have different ideas on what  
> "Beta" means and I haven't seen it defined in terms of RunRev. One  
> company I worked for defined it as meaning "Feature Complete, No  
> Known Crashing Bugs".

That's the ideal, but I've known no company that ships every Beta in a 
state that meets that definition.

I've participated in tests for Adobe, Microsoft, Apple, Oracle, and 
others who have shipped Betas while still developing new features.


That you're able to run stress tests on all features and identify 100% 
of requirements to complete satisfaction before your first Beta is quite 
an accomplishment, and we look forward to URLs of the products you ship 
so we can all learn how to improve the quality of our own work.

--
  Richard Gaskin
  Fourth World Media Corporation
  ___________________________________________________________
  Ambassador at FourthWorld.com       http://www.FourthWorld.com


