At Random

If You Didn’t Test It, It Doesn’t Work

Bob Colwell

Computer, May 2002

Most things you design can’t be tested to saturation.

Science is supposed to learn more from its failures than its successes. Schools actively teach trial and error: If at first you don’t succeed, try, try again. On the other hand, we’re told to look before we leap but that he who hesitates is lost—so maybe we’re all doomed no matter what.

This notion that trying and failing is normal, and natural, can lead you astray in your professional life, however. In practice, it’s a lot harder to get a conference paper accepted that is of the form, “We wanted to help solve important problem X, so we tried Y, but instead of the hoped-for result Z, we got yucky result Q.” And when I say “a lot harder,” I mean “there’s no way.” This is true in science as well as engineering, and it’s responsible for the overall impression you get when reading conference proceedings: Everything other people try works perfectly. You’re the only one who goes down blind alleys and suffers repeated failures.

There’s a good reason why conferences and journals are biased this way. Sometimes there’s a very fine line between good ideas that didn’t work out and just plain stupid ideas. And since there are a thousand wrong ways to do something for every way that works, a predisposition for papers with positive results is a quick, generally effective filter that saves conference committees a lot of time. (They need to save that time so they can use it later, debating how to break it to a fellow conference committee member that they want to reject his paper. As Dave Barry would say, I’m not making this up.)

Of course, prospective authors know this, and they would be in danger of biasing their results toward the most positive light if it weren’t for their dedication to the scientific ideal. (Are you buying any of that? Me neither.)

Scientists get to run their experiments multiple times, and they can gradually reduce or eliminate sources of systematic errors that might otherwise bias their results. Engineers tend to get only one chance to do something: one bridge to build, one CPU to design, one software development project to crank out late, slow, and.… Whoops. I got carried away there.

ANTICIPATION

Because engineers generally can’t test their creations to the point of saturation, they must make do with a lot of substitutions: anticipation of all possible failure modes; a comprehensive set of requirements; dedicated validation and verification teams; designing with a built-in safety margin; formal verification where possible; and testing, testing, testing. If you didn’t test it, it doesn’t work.

In some cases, computers have become fast enough to permit testing every combination of bit patterns. If you’re designing a single-precision floating-point multiplier, you know the unit is correct because you tested every possible way it could be used. Now all you have to do is worry about resets, protocol errors into and out of the unit, electrical issues, noise, thermals, crosstalk, and why your boss didn’t smile at you when she said good morning yesterday.
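As a rough sketch of what that looks like in software, the harness below walks every 32-bit pattern of one operand. The fpu_mul_model() routine is hypothetical, a stand-in for a software model of the unit being designed; here it simply wraps the host multiply so the sketch runs, with the host FPU doubling as the reference.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical model of the multiplier under design; a real one
       would mirror the RTL.  Wrapping the host multiply keeps this
       sketch self-contained and runnable. */
    static float fpu_mul_model(float a, float b) { return a * b; }

    /* Sweep all 2^32 bit patterns of one operand against a fixed
       second operand.  The full cross-product of both operands is
       2^64 cases, which is why truly exhaustive testing of even this
       "small" unit is normally done at hardware speeds. */
    static int sweep_one_operand(float fixed)
    {
        for (uint64_t i = 0; i <= UINT32_MAX; i++) {  /* 64-bit index so the loop terminates */
            uint32_t bits = (uint32_t)i;
            float a;
            memcpy(&a, &bits, sizeof a);   /* reinterpret the pattern as a float */

            float got  = fpu_mul_model(a, fixed);
            float want = a * fixed;        /* reference result */

            /* Compare representations, not values: NaN != NaN and
               +0.0 == -0.0 would mask real differences.  (A production
               harness would also canonicalize NaN payloads.) */
            uint32_t gb, wb;
            memcpy(&gb, &got,  sizeof gb);
            memcpy(&wb, &want, sizeof wb);
            if (gb != wb) {
                printf("mismatch at operand 0x%08x\n", (unsigned)bits);
                return 0;
            }
        }
        return 1;
    }

    int main(void)
    {
        /* One sweep; exhaustive coverage repeats this for every
           possible value of 'fixed'. */
        return sweep_one_operand(3.0f) ? 0 : 1;
    }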

Testing to saturation

But many, perhaps most, things you design can’t be tested to saturation. Double-precision floating-point numbers have so many bit patterns that it’s hard to see when, if ever, such units could be tested to saturation. Two to the 64th power is an extremely large number (at a billion tests per second, sweeping even one operand’s patterns would take nearly 600 years), and there are usually two operands involved. And that’s for a functional unit with a fairly regular design and state sequence. Saturation testing of anything beyond trivial software is often even more out of the question.

So it behooves us to try to anticipate how our designs will be used, certainly under nominal conditions, but also under non-nominal conditions, which usually place the system under higher stress. Programmers have a range of techniques at their disposal, such as defensive programming (checking input values before blindly using them), testing assertions (automatically checking for conditions the programmer thought were impossible), and always checking function return values. Engineers design in fail-safe or fail-soft mechanisms because they know that the unexpected happens all the time. If you’ve ever driven a car that had the check-engine light illuminated, you were in one of those modes.
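Here is a minimal sketch of those three habits in one place. The parse_speed() routine and its context are invented for illustration:

    #include <assert.h>
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper: parse a nonnegative sensor reading from text. */
    static int parse_speed(const char *text, long *speed_out)
    {
        /* Defensive programming: never trust the inputs. */
        if (text == NULL || speed_out == NULL)
            return -1;

        char *end;
        errno = 0;
        long value = strtol(text, &end, 10);

        /* Always check return values: strtol reports overflow through
           errno and a malformed string through 'end'. */
        if (errno == ERANGE || end == text)
            return -1;

        /* Assertion: the "impossible" condition.  Upstream code is
           supposed to have rejected negative readings already, so if
           this fires, an assumption somewhere was wrong. */
        assert(value >= 0);

        *speed_out = value;
        return 0;
    }

    int main(void)
    {
        long speed;
        if (parse_speed("88", &speed) == 0)
            printf("speed = %ld\n", speed);
        return 0;
    }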

Real-world uses

One of the more difficult tasks in engineering is trying to imagine all of the conditions under which your creation will be used in the real world—or, if you’re an aerospace engineer, in the real universe. There are user interfaces to consider: Automobile manufacturers long ago resigned themselves to designing to the lowest common denominator of the human population in terms of intelligence applied to the operation of a motor vehicle. But idiot-proofing is difficult precisely because idiots are so clever.

There are also legal concerns—juries that award damages to people who think driving with a hot cup of coffee between their legs is a reasonable proposition may well fail to grasp the engineering tradeoffs inherent in any design. How do we protect idiots from themselves?

DESIGNING WITH MARGIN

As a conscientious designer, you try to anticipate all the ways your design may be used in the future. Since you do outstanding work, you expect that you or someone else will reuse your code or your silicon in the future. So you try to guess what that future designer will want and strive to provide it.

You’ve also been around this particular block a few times before, so you build in some flexibility because you know that project management will change its mind about certain major project goals as you go along. And since debugging was such a bear last time, you allow for lots of debug hooks and performance counters. If you keep traveling down this road, you’re going to make yourself crazy. Yes, more than you are now. You need to compromise.

The art of compromise

Engineering is the art of compromise: now versus later, concise versus generalized, schedule versus thoroughness. And all around you, the other designers are falling behind and asking for help—your help.

This is one area of engineering in which you can’t simply overwhelm the problem with force. You want an elegant, intelligent, balanced creation that meets all of the specs and also the intent of the product, stated or otherwise—not a sandbagged, bloated turkey of a design. Yes, it will take compromises, but the magic is in getting those compromises right. Engineers know elegant designs when they see them. Aspire to produce them.

Tragedy of the commons

Always keep in mind the “tragedy of the commons.” The idea is that there are shared resources, initially allocated in the hope of having enough for all, but overall management of those resources is essentially a distributed local function.

If you’re a silicon chip designer, your job is to implement your piece of the design in the area you were allocated, and to do so within the power, schedule, and timing constraints. You know that those constraints are somewhat fungible: If you had more schedule time, you could compact your real-estate footprint. With more thermal power headroom, you could meet your clocking constraint.

The tragedy of the commons occurs when, faced with the same problems you face, your fellow designers decide to borrow more of the chip area than they were allocated. It appears to each of them that they’re using only a tiny fraction of the unallocated chip space, while substantially boosting the die area available to their particular function. And so they are. But if everyone does that, the chip’s overall die size will exceed the target, with no margin left to do anything about it. Think locally and globally.

VALIDATION AND VERIFICATION

Did I mention this: If you didn’t test it, it doesn’t work. In DragonFly: NASA and the Crisis Aboard MIR (HarperCollins, New York, 1998), Bryan Burrough gives a riveting account of several near catastrophes on the Russian space station.

In one instance, an oxygen generator catches fire, and a blowtorch-like flame begins to bloom. Jerry Linenger, the sole American astronaut, grabs an oxygen mask, but it doesn’t work. He grabs another one that does work. The station commander leads Linenger to the station module where the fire extinguishers are kept. Linenger grabs one “but is startled to find it is secured to the wall.” Both men pull at it, but the wall wins. They try another extinguisher, with the same outcome.

Later analysis revealed that the transportation straps for the fire extinguishers were still installed—in the intervening 19 months of service, no one had thought to wield the wrench required to remove them. In perfect hindsight, this also suggests that whatever fire drills had been performed in those months weren’t realistic enough or the astronauts would have found the problem earlier.

An important distinction

NASA draws a useful distinction between validation and verification. Verification is the act of testing a design against its specifications. Validation tests the system against its operational goals.

An example of why the distinction is important comes from Endeavour’s maiden flight, when it tried to rendezvous with the Intelsat satellite. As it turned out, the “state-space” part of the shuttle’s programming had been done with double-precision floating-point arithmetic, and the “boundary checking” part was done with single-precision arithmetic. The shuttle’s attempted rendezvous didn’t converge to a solution due to this mismatch; only a live patch from the ground saved the mission.
If the specs called for double precision, verification of the state-space code would never find this potential problem: different specs for different code. Validation, on the other hand, could reasonably have been expected to detect this mismatch before the astronauts encountered it hundreds of miles above the earth.
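This is not the shuttle’s code, of course, but a toy relaxation loop is enough to reproduce the failure mode; everything in it is invented for illustration. The double-precision state settles about 1.5e-9 away from the single-precision rendition of the very same constant, so any tighter convergence test can never pass:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* "State-space" side: double precision. */
        double target = 0.1;
        double state  = 0.0;

        /* "Boundary-checking" side: the same constant, but held in
           single precision -- the mismatch the two specs permitted. */
        float boundary = 0.1f;

        /* Drive the state toward the target, testing convergence
           against the single-precision boundary with a tolerance only
           double precision can satisfy. */
        for (int step = 0; step < 200; step++) {
            state += (target - state) * 0.5;   /* simple relaxation */
            if (fabs(state - (double)boundary) < 1e-12) {
                printf("converged after %d steps\n", step + 1);
                return 0;
            }
        }

        /* The state settled at 'target' long ago, but that value sits
           roughly 1.5e-9 from the float rendition of 0.1, so the 1e-12
           test never passes. */
        printf("never converged: residual = %g\n",
               fabs(state - (double)boundary));
        return 1;
    }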

I’m very fond of validation. Verification is absolutely necessary, and it’s as important (and difficult) a task as design or architecture. In my experience, however, the job of making sure a design correctly implements its specs isn’t likely to be forgotten or overlooked. Verification may not always be done very well, but it probably won’t be skipped entirely. Validation, on the other hand, is often misunderstood, and management sometimes doesn’t remember why it’s so necessary.

Seeing the big picture

In my experience, after months or years of intensive design and development, the designers are tired, they’re being pressured to hit their production schedule, and they just want to finish the project. Some part of them really doesn’t want to hear that there might be anything wrong with their baby. The architects often have moved into the initial phase of another design, and they may not even be available, much less actively engaged in final testing. So the validation folks are the only ones left who can try to see the Big Picture.

Especially on long design projects, fundamental changes may have occurred in the overall product landscape that haven’t been fully incorporated into the specs: Competition has arisen, a lawsuit may have been resolved in an unfavorable direction, last-minute design changes were made that won’t have benefited from the years of testing the design otherwise enjoyed.

Here’s an example of what I mean. While testing the Madden NFL 99 PC game, a validator wondered what would happen if he tried to catch a field goal. For readers who don’t follow American football, a field goal is an attempt by the offensive team to kick the ball between the goal posts at the end of the field. During a field goal attempt, the offense just tries to keep the defense from blocking the kick. There would be no point in sending any offensive players downfield, so nobody ever does.

The validator was unswayed by the spec that there’s no point in trying to catch the ball after it has been kicked; in true validation hero fashion, he simply noticed that it didn’t seem to be impossible. After trying for a couple of hours, he succeeded in doing it. Since the game’s designers had never conceived of anyone trying to do such a goofy thing, the game’s specs didn’t include a requirement for how to handle it. As the validator suspected, the game didn’t handle the situation very well—it locked up.

Stepping outside the box

Having a validation mindset that allows an independent thinker to step outside a design project’s orthodoxy is absolutely vital. Such people often are the last line of defense between something that’s overlooked in design and a user who isn’t happy with the product—or worse.

Although it’s not a common experience in most people’s daily lives, you know what this mindset feels like. An example borrowed from Douglas Hofstadter’s Fluid Concepts and Creative Analogies (Basic Books, New York, 1995) simulates it. Hofstadter believes that analogies are among the highest levels of intelligent thought, and he uses numerical sequences as the clearest analogy-generators. You remember these—what comes next in this sequence: 5, 10, 15? Yep, 20 would be my bet too.

So what comes next in the following sequence: 0, 1, 2? It has to be 3, doesn’t it? But wait, why would I give you a trivial sequence and claim it’s going to stimulate your brain? What else could it be, if not 3?

Even when I give you the answer, you probably still won’t know where this is coming from. The answer is 0, 1, 2, 720!. That’s 720 factorial: 0, 1, 2, 720! = 0, 1, 2, 6!! = 0, 1!, 2!!, 3!!! (each additional “!” takes the factorial again).

The important thing about this example is not the actual math involved per se. It’s that when you first saw the sequence 0, 1, 2, your immediate instinct probably was to answer “3” and move on. But some other part of your brain, your Validation Reflex, became instantly suspicious and said, “Not so fast. Something’s not quite right here.”

That’s the voice you must learn to hear to do truly outstanding validation. That’s the instinct that allows you to take a step back from the implicit assumptions that others around you are making so you can go in a different direction where you may see something everyone else is missing.

Yeah, I know—I cheated a little with that math sequence. It doesn’t seem quite fair to throw in factorials when the original sequence didn’t have them. You could complain to Hofstadter. But you should read his book first.

In the meantime, what’s next in this sequence: If you didn’t test it, …?

Bob Colwell was Intel’s chief IA32 architect through the Pentium II, III, and 4 microprocessors. He is now an independent consultant. Contact him at [email protected].