The Past & Future of Software Testing: How Many Lightbulbs Does It Take To Change A Tester?

Cem Kaner, J.D., Ph.D.
[email protected]
www.kaner.com, www.testingeducation.org

at the Software Testing Day, Tampere University of Technology, May 4, 2004

How Many Lightbulbs to Change a Tester? Copyright © 2003 Cem Kaner

Research underlying these slides was partially supported by NSF Grant EIA-0113539 ITR/SY+PE: "Improving the Education of Software Testers." Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).



Old Truths

• Many years ago, the software development community formed a model for the software testing effort. As I interacted with it from 1980 onward, the model included several “best practices” and other shared beliefs about the nature of testing.
• Beginning in 1983, I wrote Testing Computer Software to foster rebellion against some of these. And to support many others.

The testing community has developed a culture around these shared beliefs.

Lightbulbs?

• Over the years, many
  – testing projects have failed
  – test managers have been fired
  – products have been successful
  despite corporate refusal to conform to testing “standards”
• And yet, most of the same old lore is still being repeated as the proper guide to testing culture and practice.

Don’t think the old stuff is still with us? Look at ISEB’s current syllabus for test practitioner certification: www1.bcs.org.uk/DocsRepository/00900/913/docs/practsyll.pdf

Or look at IEEE’s SWEBOK, www.swebok.org. There are many other examples. . . .

What should it take for us to learn from our experiences?

We Should Advocate for Waterfall or V models?

• In these develop-in-stages lifecycle models, we try to finish each stage / phase before moving on to the next.

[Figure omitted. Source: Robin Goldsmith, “The Forgotten Phase,” Software Development, July 2002, http://www.sdbestpractices.com/documents/s=8815/sdm0207e/0207e.htm]

We Should Advocate for Waterfall or V models?

• These models are often harshly criticized as being inherently high risk.
  – They force the project team to lock down details long before the implications, costs, complexity, and implementation difficulty of those details are known.
• The iterative approaches (spiral, RUP, evolutionary development, XP, etc.) are reactions to these risks.

Why would testers actively advocate for this type of lifecycle?

We Should Advocate for Waterfall or V models?

• Think of the tradeoffs:
  – Features
  – Reliability
  – Cost
  – Time to Market
• In the waterfall (and V), we lock down the feature set early, do the design, write the code (and so spend most of the money that we ever will spend on it).
• So, toward the end of the project, what variables are still free?

Reliability versus Time

Sound familiar? Why should we set ourselves up for this grief?

Iterative models are designed to change this tradeoff.

Testers Should Refuse to Test if There is No Specification?

• This was one of the assertions that motivated me to start writing Testing Computer Software in 1983. (I disagreed with it.)
• It’s still with us. I saw a leading consultant / author publicly give this advice to a pair of newly-promoted test managers a couple of years ago.
• Let’s face some realities:
  – Many development groups choose not to write detailed specifications. Is this always bad?
  – Many product changes are made without updating the specification, because the spec is not considered the final authority in that company. Is this always bad?
  – A tester who refuses to proceed until the engineering process is changed is hijacking the management of the project.

Jump in front of the train enough times and eventually you will get run over.

Drive Tests From the Specification?

• It’s easy to say that test cases should be based on documented characteristics of the program, for example on the requirements documents or the specifications.
• And there should be at least one thoroughly documented test for every requirement item or specification item.
  – Why only one test per item? Does one test each cover the spectrum of failure risks for these items? (No)
  – What about all the items in the program that can’t be traced back to the requirements docs or the design specification?

A specification is one source of fallible guidance. To the extent that the spec limits what you test, it is a source of risk.

Expected Results are Only Part of the Story

[Diagram omitted. Recoverable labels:]
• Inputs: Test Inputs, Precondition Data, Precondition Program State, Environmental Inputs
• Test Oracle, producing the expected: Test Results, Postcondition Data, Postcondition Program State, Environmental Results
• System Under Test, producing the obtained: Test Results, Postcondition Data, Postcondition Program State, Environmental Results

Reprinted with permission of Doug Hoffman
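
One way to read the diagram, as a minimal sketch and not Hoffman's own formulation: a comparison that checks only the visible test result can pass while the run still differs from the oracle in another dimension. The `Outcome` type, the `compare_outcomes` helper and the sample values are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Outcome:
    """Hypothetical record of everything one test run can produce.

    The four fields mirror the four result labels in the diagram.
    """
    test_results: str
    postcondition_data: str
    postcondition_program_state: str
    environmental_results: str


def compare_outcomes(expected: Outcome, obtained: Outcome) -> List[str]:
    """Return the name of every dimension where oracle and run disagree."""
    return [
        f.name
        for f in fields(Outcome)
        if getattr(expected, f.name) != getattr(obtained, f.name)
    ]


# Illustrative values only: the visible result matches the oracle,
# but the environment was left in a different condition.
expected = Outcome("total=42", "1 row written", "idle", "no temp files")
obtained = Outcome("total=42", "1 row written", "idle", "temp file left behind")

mismatches = compare_outcomes(expected, obtained)
print(mismatches)
```

A test that compared only `test_results` would report a pass here; comparing all four dimensions reports the environmental difference.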

A Test Without an Expected Result is Not a Test?

• A test that is defined in terms of one expected result is undefined against the other types of results available from that test.
• High-volume automated tests may run until crash. The only “expected result” might be non-crash or conformance to other simple predictions. These techniques expose interesting problems that we do not know how to test for otherwise. The expected results have no relationship to the actual risks we are attempting to mitigate.
  – Are these improper tests?
  – Narrow visions of test design certainly steer us away from them.
  – But they expose critical bugs.
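
A high-volume test of this kind might look like the sketch below. It assumes a stand-in `system_under_test` (here just a sort, purely hypothetical) and checks only "did not crash" plus two simple predictions, with no precomputed expected result for any individual input. The function names, iteration count and input ranges are illustrative assumptions.

```python
import random


def system_under_test(values):
    """Hypothetical stand-in for the program being tested: sorts a list."""
    return sorted(values)


def run_high_volume(iterations, seed=0):
    """Feed many random inputs to the program and check only simple predictions.

    Returns the number of iterations completed. Any exception raised by the
    program, or any failed prediction, stops the run at that iteration.
    """
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    for i in range(iterations):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        result = system_under_test(data)  # an exception here is the "crash"
        # Simple predictions, not a full expected result:
        assert len(result) == len(data), f"length changed at iteration {i}"
        assert all(a <= b for a, b in zip(result, result[1:])), (
            f"output out of order at iteration {i}"
        )
    return iterations


completed = run_high_volume(10_000)
print(f"completed {completed} iterations without a crash")
```

The seed is the design choice worth noting: without it, a crash found on iteration 7,312 could not be reproduced for investigation.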

Design Most Tests Early in Development?

• Why would anyone want to spend most of their test design money early in development?
  – The earlier in the project, the less we know about how it can fail, and so the less accurately we can prioritize.

One of the core problems of testing is the infinity of possible tests.

Good test design involves selection of a tiny subset of these tests.

The better we understand the product and its risks, the more wisely we can pick those few tests.

Design Most Tests Early in Development?

• “Test then code” is fundamentally different from test-first programming

| Test-first development | Test then code |
|---|---|
| The programmer creates 1 test, writes code, gets the code working, refactors, moves to next test | The tester creates many tests and then the programmer codes |
| Primarily unit tests and low-level integration | Primarily acceptance, or system-level tests |
| Near-zero delay, communication cost | Usual process inefficiencies and delays (code, then deliver build, then wait for test results, slow, costly feedback) |
| Supports exploratory development of architecture & design | Supports understanding of requirements |
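
The test-first loop (one test, just enough code to pass it, refactor, next test) can be sketched with Python's standard `unittest` module. The `leap_year` function and its three tests are hypothetical examples chosen for brevity; the comments mark the order in which a programmer working this way might have added each test.

```python
import unittest


def leap_year(year):
    """Hypothetical function under development.

    Each clause was added only after the corresponding test below existed.
    """
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # Step 1: the first test; the simplest passing code was `year % 4 == 0`.
    def test_divisible_by_four_is_leap(self):
        self.assertTrue(leap_year(2004))

    # Step 2: this test failed until the century rule was added.
    def test_century_is_not_leap(self):
        self.assertFalse(leap_year(1900))

    # Step 3: this test failed until the 400-year exception was added.
    def test_every_fourth_century_is_leap(self):
        self.assertTrue(leap_year(2000))


# Run the suite in-process; the programmer sees the result immediately.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once written, the same three tests stay in the code base and rerun in a fraction of a second after every change.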

Good Engineering Requires Early Lockdown of the User Interface?

• How many times do we hear this nonsense from GUI regression test tool vendors and their shills?
• The user interface is there for communication with the user.
• Usability engineering is highly iterative.

When we do beta testing and discover, as we always discover, that the product confuses / annoys the user, do we really want to push a process that will discourage developers from making the improvements we call for in our test results?

It’s Their Job to Keep the User Interface Stable, Not Our Job to Make Our Tests Maintainable

• WHY do some testers think programmers will restructure their development process to make things convenient for testers?
• WHY would anyone expect programmers to show test code (and the tester) any respect if the code can’t cope with simple changes?
• Change is inevitable. Deal with it.

Mommy, mommy, those big, bad, nasty programmers won’t let me do my job!

It’s THEIR fault I can’t get anything done, Mommy!

Good Tests Should be Reused as Regression Tests?

• Let’s distinguish between the change-detectors at the code level and UI / System level regression tests
• Change detectors
  – writing these helped the TDD programmer think through the design & implementation
  – near-zero feedback delay and near-zero communication cost make these tests a strong support for refactoring
• System-level regression
  – provide no support for implementation / design
  – are run well after the code is put into a build that is released to testing (long feedback delay)
  – run by someone other than the programmer (feedback cost)

Good Tests Should be Reused as Regression Tests?

• Maintenance of UI / system-level tests is not free
  – change the design → discover the inconsistency → discover the problem is obsolescence of the test → change the test
• So we have a cost/benefit analysis to consider carefully:
  – What information will we obtain from re-use of this test?
  – What is the value of that information?
  – How much does it cost to automate the test the first time?
  – How much maintenance cost for the test over a period of time?
  – How much inertia does the maintenance create for the project?
  – How much support for rapid feedback does the test suite provide for the project?

The Concept of Inertia

• INERTIA: The resistance to change that we build into a project.
  – Intentional inertia:
    • Change control boards
    • User interface freezes
  – Process-induced inertia:
    • Costs of change that are imposed by the development process
      – rewrite the specification
      – rewrite the tests
      – re-run all the tests
• Testers advocate for late changes (aka bug fixes)
• What inertia do our processes induce in the project, and to what extent does our inertia block needed improvements?

Testers are The Advocates of Quality?

How often have you heard statements like this?

• “Somebody has to look out for the user.”
• “The project needs one group that really cares about quality.”
• “Project managers can’t afford to care about quality.”

When you identify yourself as The Advocate For Quality, you’re saying that “We care about Quality AND THEY DON’T.”

That prophecy gets self-fulfilling after a while.

Testers Should be Able to Veto a Product Release?

• The power to veto shipment is often described as the thing that distinguishes a “testing” group from a “quality assurance” group.
• There are serious problems with this “power”:
  – It takes accountability away from the project manager who is making the development tradeoffs.
  – It lets project managers (and others) play “release chicken”, setting the test group up to be the bad guys for holding the release (and blowing the schedule).
  – It sets testers up to be “the enemy” without any long-term quality benefit, and it makes testers the ones to blame if the product fails in the field.
  – The test group may well lack the knowledge to make the right decision.

Testers Should be Able to Veto a Product Release?

• To achieve better quality, build a better product.
  – (Better code, better user documentation, better training, better support.)
• To assure better quality, take over management of the groups that can make the product better.

Complaining about bad quality after the fact, even beating people with a stick after the fact, will not assure better quality.

So What Do We Do?

• If we're not the quality assurance group, what are we?
  – Service providers.
  – We provide testing services.

Testing is a technical investigation of a product, an empirical search for quality-related information of value to a project's stakeholders.

Testers Should Work Without Knowledge of the Underlying Code?

• Why?
  – Protects testers from bias? Huh?
  – Gives us an excuse for hiring people who can’t code?
• Maybe a better reason is discipline:
  – Rather than learning about the product from the code, the black box tester has to gather other classes of information that the programmer probably didn’t consider.

Are we really advocating ignorance-based testing?

So, Testers SHOULD Work Without Knowledge of the Code?

• This is a practical question, not a question of principle:
  – What does the project gain from a tester who focuses on customer benefits, configuration / compatibility, and coordination with other products?
  contrasted with
  – What does the project gain from a tester who can review code to determine plausibility of some tests, and can implement code-aware test tools / techniques (e.g. FIT, simulators using probes, event logs, etc.)?
  – What does the project gain from a tester who actively collaborates with the programmers in the analysis and debugging of the code?

Programmers Can’t Catch Their Own Bugs?

• A programmer’s public bug rate includes all bugs left in the code when she gives it to someone else (such as a tester). Rates of one to three bugs per hundred statements are not unusual.
• A programmer’s private bug rate includes all bugs she makes, including any she fixes before passing the program to testing.
• Estimates of private bug rates: 15-150 bugs per 100 statements (e.g. Beizer).

Therefore, programmers must be finding and fixing between 80% and 99.3% of their own bugs before their code goes into test.

• What does this tell us about our task?

We find bugs by looking into the programmer’s (and her tools’) blind spots.

• Merely repeating the types of tests that the programmers did won’t yield more bugs.
• That’s one of the reasons that an alternative approach is so valuable.
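
The 80% and 99.3% figures follow from pairing the extremes of the two ranges quoted above. The short sketch below reproduces that arithmetic; the function name `fraction_self_fixed` is a hypothetical label, and the only inputs are the rates already stated on the slide.

```python
def fraction_self_fixed(private_rate, public_rate):
    """Share of a programmer's own bugs fixed before handing the code over.

    Both rates are in bugs per 100 statements.
    """
    return 1 - public_rate / private_rate


# Lower bound: lowest private rate (15) paired with highest public rate (3).
low = fraction_self_fixed(private_rate=15, public_rate=3)

# Upper bound: highest private rate (150) paired with lowest public rate (1).
high = fraction_self_fixed(private_rate=150, public_rate=1)

print(f"{low:.1%} to {high:.1%}")  # prints "80.0% to 99.3%"
```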

Testers Should Work Independently From Programmers?

Benefits of collaboration?

• Functional testing
  Emphasis is on capability / benefits for the user.
  The skilled functional tester often gains a deep knowledge of the needs of customers and customer-supporting stakeholders.
• Para-functional testing
  Security, usability, accessibility, supportability, localizability, interoperability, installability, performance, scalability.
  The customer / user is not an expert in these attributes but has great need of them.
  Effective testing will often require collaboration and mutual coaching between programmers and testers.
• Preventative testing (programmer support)

Testers Should Work Independently From Programmers?

• Preventative testing (programmer support)
  Test-driven development (TDD) can benefit from support from testers (pairing with programmers).

Benefits of TDD to the project?
  – Provides a structure for working from examples, rather than from an abstraction. (Supports a common learning / thinking style.)
  – Provides concrete communication with future maintainers.
  – Provides a unit-level regression-test suite (change detectors)
    • support for refactoring
    • support for maintenance
  – Makes bug finding / fixing more efficient
    • No roundtrip cost, compared to GUI automation and bug reporting.
    • No (or brief) delay in feedback loop compared to external tester loop
  – Provides support for experimenting with the language

Our Job is to Find and Report Coding Errors, Not Design Errors?

This is an old debate.

| Programmer-centric vision of the role of testing | Broad benefit-to-the-business vision of the role of testing |
|---|---|
| Testers are specialists, focusing on coding mistakes and playing a junior support role to programmers | Testers are generalists, learning about all dimensions of quality problems |
| “Not my job” theory of testing – we should recruit and train for a narrow set of responsibilities | We should build diverse test groups with knowledge in many relevant domains, including quality attributes |
| Testers don’t have skills to assess quality-characteristic problems (usability, security, accessibility, etc.). | We can assess for many types of issue; few companies have specialists (e.g. UI expert), so we carry the problem. |

A key risk is the long-term de-skilling of the test group.

The IEEE Software Engineering Standards are a Basis for Sound Testing Practice?

• We see this assertion regularly, but
  – How many testers have ever read these standards? Have you? (At PNSQC, only half were familiar with the IEEE software process standards and less than 1% advocated their use in their projects. How can this be an "industry standard"?)
  – I think they have a strong bias toward heavyweight processes (massive paperwork, not much engineering).
  – I think they have a strong bias toward waterfall, which I see as high-risk engineering.
  – I think they have a strong bias toward a One True Way that doesn’t vary with the project context.
  – On balance, I think Standard 829 has done more harm than good, and the other process standards have been largely irrelevant.

Document Manual Tests in Full Procedural Detail?

• The claim is that manual tests should be documented in great procedural detail so that they can be handed to less experienced or less skilled testers, who will
  (a) repeat the tests consistently, in the way they were intended,
  (b) learn about test design from executing these tests, and
  (c) learn the program from testing it, using these tests.

I don’t see any reason to believe that we will achieve any of these benefits. I think this is as close as we come to an industry worst practice.

Most Testers are Stupid or Non-Technical?

• Under this theory,
  – We attempt to reduce testing to a set of routine steps
    • document intensely so that one smart person can transmit The Great Thought to a bunch of lesser testers
    • automate intensely so that one smart person and a machine can replace those lesser testers
  – It makes sense to replace testers with test-first programmers
  – Test automation languages should be ultra-simple
  – Testers’ time/work should be closely supervised with little room for independent judgment

This vision cannot scale to current product / development complexity.

Ultimately We Face a Conflict of Visions

| Active Learner | QA / QC |
|---|---|
| We are constantly trying to learn new things about the product (and how it can fail) | We are trying to develop a simple, easily taught, easily supervised, control process. |
| Breadth-first, then learning-guided depth, until we run out of ideas or time. We rely on cognitive triggers and peers to keep the flow of ideas steady (until we run out) | We want our infinite set of tests to feel finite. We use progress measures (e.g. “coverage”) to gauge minimally-sufficient testing. |
| We support and influence a broad group of stakeholders; communication skills are vital. | We support a process model or a narrow constituency (programmers or in-house end users or ...) |
| Tools help us learn new things | Tools help us replace manual labor with automated labor |

We Should Not Do Exploratory Testing?

• Exploratory testing involves simultaneously
  – executing tests
  – learning about the program, its market, its risks
  – designing new tests based on what we have just learned
• These tests can be automated or manual, whatever will teach us more about what we’re investigating in the moment.

Exploratory testing is the tool of the active learner, the technical investigator, rather than the QC automaton.

We Should Not Do Exploratory Testing?

• Everyone explores to some degree
  – follow-up testing of a just-found bug (trying to replicate or extend it)
  – regression testing an allegedly-fixed bug
• Some tests provide all their value the first time you use them
  – Many scenario tests inform the tester of the design and provide little new information (compared to new scenarios) on reuse
• Straw man objections
  – Exploratory testing should only be done by experts (SWEBOK)
  – All testing should be automated, and if impossible to automate, should be made into a precise routine as if it were automated (Crispin)
  – All tests should be captured and turned into regression tests

We Should Not Do Exploratory Testing?

• Another way to avoid exploratory testing (and the vision of a skilled technical investigator) is to find a way to recharacterize exploration as something easily turned into a routine:
  – Exploration is “really” based on memorization of past failures: we could get the same benefit from a fault catalog, a failure catalog, or application of PSP.
  – Exploration is “really” based on a stereotyped set of attacks: maybe we can build a program that will generate these attacks.

Reference materials and improved tools certainly help the explorer explore, but ultimately, exploratory testing is about discovery of things we don’t know about the program and don’t yet have a routine, cost-effective system for exposing (that can work in the present project’s context).

What’s With All This Outsourcing?

• Peter Drucker, Managing in the Next Society, stresses that we should manufacture remotely but provide services locally. The local service provider is:
  – more readily available, more responsive, and more able to understand what is needed
• So why are companies so willing to obtain their software development services from halfway around the world?

If we adopt development processes that (at great cost) push all communication to paper, demand early decisions and make late changes difficult and expensive, what benefit is left to the local service provider?

If we spend the money to create the formalistic infrastructure that we would need for outsourcing, we may as well do the work where it is cheaper, because we have squandered our local advantages.