automated testing tools (slightly OT)

J. Landman Gay jacque at hyperactivesw.com
Wed Oct 12 15:54:42 EDT 2005


Rob Cozens wrote:

> I'd be interested in what Jacque Gay might have to say on the subject.  
> I recall a very interesting conversation we had at a Macworld regarding 
> the work she did for Apple to automate HyperCard testing.

It was a long time ago so I've forgotten a lot of the details, but this 
testing engine was the reason that HyperCard consistently shipped with 
virtually no significant bugs. The HC team would start the test engine 
running and leave it until it crashed, at which point they'd fix the 
problem and start it up again. When the engine ran all the way to 
completion, they figured HC was okay to ship. And it almost always was. 
Any bugs that surfaced later on were always small things that were well 
outside the range of normal operations.

There were two test engines. The main one automated the testing of every 
language token, and ran unsupervised. The second one was a manual test 
engine that required a human to execute specified tasks. The manual test 
stack was much smaller and tested things like manually resizing 
graphics, choosing tools from the palette, using keyboard shortcuts, and 
other behaviors that couldn't be simulated via scripts.

I didn't write the original test engine, which someone on the HC team 
wrote way back at version 1.0. As the HC language grew, more tests were 
added for new syntax and behaviors. By the time I became involved with 
the test engine it was massive and had been worked on over the years by 
many people.

Exactly how it worked is both proprietary (and protected by NDA) and 
vague in my memory now, but the basic idea was this. The stack script 
contained a handler that used a "do" statement to execute the parameters 
that were sent to it. Each card used this handler to execute various 
tests. The handler then returned a result that the calling handler could 
check and log to an external text file. Each card in the stack tested 
one token (command, function, etc.) of the language. Each card had a 
separate handler for each variation of the token. For example, a card to 
test the "go" command would have handlers for going to a different stack 
and back, going to a stack in a new window, going to a stack with screen 
locked, and so forth.

Each card had an "opencard" handler with a repeat loop that ran through 
all the permutations on that card. A main handler in the stack script 
also had a repeat loop that went to each card in sequence (which 
triggered each opencard handler). As each test was performed, "the 
result" was checked and logged to the text file. At the completion of 
the opencard handler, control was returned to the stack script which 
then moved on to the next test card.
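
The control flow in the paragraph above might be sketched like this, 
again in Python rather than HyperTalk and purely as an assumption about 
the general shape: open_card plays the part of an "opencard" handler, 
run_stack plays the main loop in the stack script, and the tab-separated 
log layout is invented for the example.

```python
import os
import tempfile


def open_card(card_name, variations, log):
    """Run every variation on one card and log each result."""
    for name, test in variations:
        result = test()
        log.write(f"{card_name}\t{name}\t{result!r}\n")


def run_stack(cards, log_path):
    """Visit each card in sequence; visiting a card triggers its tests."""
    with open(log_path, "a", encoding="utf-8") as log:
        for card_name, variations in cards:
            open_card(card_name, variations, log)
            # Control returns here, and the loop moves to the next card.


# Hypothetical cards: (card name, [(variation name, test callable), ...])
cards = [
    ("upper", [("ascii", lambda: "abc".upper()),
               ("empty", lambda: "".upper())]),
    ("len", [("list", lambda: len([1, 2, 3]))]),
]

log_path = os.path.join(tempfile.mkdtemp(), "test_log.txt")
run_stack(cards, log_path)

with open(log_path, encoding="utf-8") as f:
    log_lines = f.read().splitlines()
```

Each line of the log names the card, the variation, and the result, so 
the file can be read top to bottom in the order the tests ran.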

The scripts were complex and more elaborate than I've described. Each 
card test had to follow a particular format and contain handlers with 
predefined names. There were as many provisions as possible to bypass 
errors without bringing the testing to a halt; errors were logged, the 
specific handler was exited, and the next handler in the sequence began. 
The logging routines listed the name of the test, the card it came from, 
and the final result so that a human could read through it and check for 
unanticipated results. If the engine crashed, the last entry in the log 
showed what was executing at the time so it could be tracked and fixed.
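
One way to get the behavior described above (log the error, abandon that 
handler, carry on with the next) is sketched below in Python. It is an 
illustration under stated assumptions, not the original design: the 
RUN/ERR/DONE prefixes are invented, and writing a RUN entry before each 
test is only a guess at how the last log entry could identify what was 
executing at the time of a crash.

```python
import io


def run_handlers(card_name, handlers, log):
    """Run handlers in order; a failure is logged and the next one runs."""
    failures = 0
    for handler in handlers:
        # Assumption: record the test name before running it, so that a
        # hard crash leaves the running test as the last log entry.
        log.write(f"RUN  {card_name} {handler.__name__}\n")
        log.flush()
        try:
            outcome = handler()
        except Exception as exc:
            failures += 1
            log.write(f"ERR  {card_name} {handler.__name__} {exc!r}\n")
            continue  # exit this handler only; the sequence goes on
        log.write(f"DONE {card_name} {handler.__name__} {outcome!r}\n")
    return failures


def divide_ok():
    return 6 / 3


def divide_by_zero():
    return 1 / 0  # raises ZeroDivisionError


def divide_negative():
    return -4 / 2


log = io.StringIO()  # stands in for the external text file
failure_count = run_handlers(
    "divide", [divide_ok, divide_by_zero, divide_negative], log
)
entries = log.getvalue().splitlines()
```

The failing handler produces an ERR entry, and divide_negative still 
runs afterwards, so one bad test does not halt the whole run.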

The first card of the stack had a "Start" button that was clicked to 
begin the test sequence. From there, it ran on its own. If not stopped, 
it would begin over at the first card and retest everything all over 
again, running until a certain keyboard sequence stopped it. I heard 
they sometimes ran it for weeks at a time to test memory usage.

Funny this should come up, as I was recently talking to the person I 
shared the work with, and we wondered whether Apple would object to us 
converting the test engine to Revolution for the RR team. We don't know 
the status of our Apple NDA at this point, since HC is long gone. 
Conversion, however, would be a massive job; not only does Rev have three 
or four times as many features as HC, but many of the existing handlers 
would have to be rewritten to accommodate differences in how the engines 
work. It would be a project that could easily take a couple of years to 
finish. Just writing new tests for HC 2.3/2.4 took three of us some 
months, and we didn't have to change any of the existing card tests; we 
only had to add new ones. So the project is probably outside the realm 
of possibility at this point.

-- 
Jacqueline Landman Gay         |     jacque at hyperactivesw.com
HyperActive Software           |     http://www.hyperactivesw.com


