automated testing tools (slightly OT)
J. Landman Gay
jacque at hyperactivesw.com
Wed Oct 12 15:54:42 EDT 2005
Rob Cozens wrote:
> I'd be interested in what Jacque Gay might have to say on the subject.
> I recall a very interesting conversation we had at a Macworld regarding
> the work she did for Apple to automate HyperCard testing.
It was a long time ago so I've forgotten a lot of the details, but this
testing engine was the reason that HyperCard consistently shipped with
virtually no significant bugs. The HC team would start the test engine
running and leave it until it crashed, at which point they'd fix the
problem and start it up again. When the engine ran all the way to
completion, they figured HC was okay to ship. And it almost always was.
Any bugs that surfaced later on were always small things that were well
outside the range of normal operations.
There were two test engines. The main one automated the testing of every
language token, and ran unsupervised. The second one was a manual test
engine that required a human to execute specified tasks. The manual test
stack was much smaller and tested things like manually resizing
graphics, choosing tools from the palette, using keyboard shortcuts, and
other behaviors that couldn't be simulated via scripts.
I didn't write the original test engine, which someone on the HC team
wrote way back at version 1.0. As the HC language grew, more tests were
added for new syntax and behaviors. By the time I became involved with
the test engine it was massive and had been worked on over the years by
many people.
Exactly how it worked is both proprietary (and protected by NDA) and
vague in my memory now, but the basic idea was this. The stack script
contained a handler that used a "do" statement to execute the parameters
that were sent to it. Each card used this handler to execute various
tests. The handler then returned a result that the calling handler could
check and log to an external text file. Each card in the stack tested
one token (command, function, etc.) of the language. Each card had a
separate handler for each variation of the token. For example, a card to
test the "go" command would have handlers for going to a different stack
and back, going to a stack in a new window, going to a stack with screen
locked, and so forth.
Each card had an "opencard" handler with a repeat loop that ran through
all the permutations on that card. A main handler in the stack script
also had a repeat loop that went to each card in sequence (which
triggered each opencard handler). As each test was performed, "the
result" was checked and logged to the text file. At the completion of
the opencard handler, control was returned to the stack script which
then moved on to the next test card.
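A rough sketch of that structure, written in Python rather than HyperTalk since the original scripts are proprietary. Everything here is an assumption made for illustration: the names run_statement, open_card, run_all and CARDS are hypothetical, the test statements are invented, and eval on a string only loosely stands in for the "do" statement. It is not the actual engine.

```python
import io

def run_statement(statement, env):
    """Loose analogue of the stack-script handler that used "do":
    evaluate a statement supplied as text and hand back its result."""
    return eval(statement, {"__builtins__": {}}, env)

# Each "card" tests one token; each entry is one variation of that token.
# The cards and statements below are invented for illustration only.
CARDS = {
    "add": [("two integers", "1 + 2"), ("negative operand", "5 + -3")],
    "length": [("short string", "len('go')"), ("empty string", "len('')")],
}

def open_card(card_name, variations, log):
    """Analogue of a card's opencard handler: loop over every variation."""
    for test_name, statement in variations:
        result = run_statement(statement, {"len": len})
        # Check and log the result, one line per test.
        log.write(f"{card_name}\t{test_name}\t{result}\n")

def run_all(cards, log):
    """Analogue of the main stack-script loop: visit each card in sequence,
    which in turn runs that card's tests."""
    for card_name, variations in cards.items():
        open_card(card_name, variations, log)

log = io.StringIO()  # stands in for the external text file
run_all(CARDS, log)
print(log.getvalue(), end="")
```

The point of the layout is that adding coverage for a new token only means adding a new card entry; the central loop and the executing handler stay untouched.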
The scripts were complex and more elaborate than I've described. Each
card test had to follow a particular format and contain handlers with
predefined names. There were as many provisions as possible to bypass
errors without bringing the testing to a halt; errors were logged, the
specific handler was exited, and the next handler in the sequence began.
The logging routines listed the name of the test, the card it came from,
and the final result so that a human could read through it and check for
unanticipated results. If the engine crashed, the last entry in the log
showed what was executing at the time so it could be tracked and fixed.
The first card of the stack had a "Start" button that was clicked to
begin the test sequence. From there, it ran on its own. If not stopped,
it would begin over at the first card and retest everything all over
again, running until a certain keyboard sequence stopped it. I heard
they sometimes ran it for weeks at a time to test memory usage.
Funny this should come up, as I was recently talking to the person I
shared the work with, and we wondered whether Apple would object to us
converting the test engine to Revolution for the RR team. We don't know
the status of our Apple NDA at this point, since HC is long gone.
Conversion, however, would be a massive job; not only does Rev have three
or four times as many features as HC, but many of the existing handlers
would have to be rewritten to accommodate differences in how the engines
work. It would be a project that could easily take a couple of years to
finish. Just writing new tests for HC 2.3/2.4 took three of us some
months, and we didn't have to change any of the existing card tests; we
only had to add new ones. So, the project is probably outside the realm
of possibility at this point.
--
Jacqueline Landman Gay | jacque at hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com