
DAIMI (c) Henrik Bærbak Christensen 1

Test Planning

DAIMI (c) Henrik Bærbak Christensen 2

Definition

  Plan: Document that provides a framework or approach for achieving a set of goals.

  Corollary: You have to define the goals in advance.

  Burnstein provides a template for a company testing policy that states the overall goals.

DAIMI (c) Henrik Bærbak Christensen 3

Plan Contents

  A testing plan must address issues like:
– Overall testing objectives: why are we testing, risks, etc.
– Which pieces will be tested?
– Who performs the testing?
– How will testing be performed?
– When will testing be performed?
– How much testing is adequate?

  These dimensions are orthogonal (independent). A decision must be made about where to place your project along each of them.

  Each dimension is a continuum.

DAIMI (c) Henrik Bærbak Christensen 4

Which pieces will be tested?

  Continuum extremes:
– every unit is tested
– no testing at all (i.e. the users do it)

  and variations:
– a systematic approach for choosing what to test…
– ROI (return on investment) is important

• where does one spent test hour find the most defects?
• or the ‘most annoying’ defects?

  Strategies
– “Defect Hunting”, “Allocate by Profile”

DAIMI (c) Henrik Bærbak Christensen 5

Who performs testing?

  Project roles:
– Developer: constructs products
– Tester: detects failures in products

  Remember: roles, not persons.
  Continuum extremes
– the same persons do everything
– roles always split between different persons

  and all kinds of variations in between:
– unit level: often the same person has both roles
• XP pair programming often splits the roles within the pair

– system level: often separate teams

  Testing psychology: do not test your own code…

DAIMI (c) Henrik Bærbak Christensen 6

How will testing be performed?

  Continuum extremes
– specification only (“what”): black-box (see the sketch below)
– implementation also (“how”): white-box

  Levels
– unit, integration, system

  Documentation
– the XP way: move forward as fast as possible
– the CMM way: make as much paper as possible
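To make the black-box end of the continuum concrete, here is a minimal sketch of a specification-derived unit test in Java/JUnit 4. The Converter class, its rate, and the rounding rule are invented for the example; the point is only that the expected value comes from the specification (“what”), not from reading the implementation (“how”).

  import org.junit.Test;
  import static org.junit.Assert.assertEquals;

  // Black-box: the test is derived from the specification only, e.g.
  // "toEuro(dkk) returns the amount in EUR, rounded to two decimals".
  // A white-box test would additionally be designed from the implementation,
  // e.g. to force a particular rounding branch.
  public class ConverterTest {

    // Minimal implementation included only to make the sketch self-contained.
    static class Converter {
      private final double rate; // DKK per EUR
      Converter(double rate) { this.rate = rate; }
      double toEuro(double dkk) { return Math.round(dkk / rate * 100.0) / 100.0; }
    }

    @Test
    public void convertsAndRoundsToTwoDecimals() {
      Converter converter = new Converter(7.45);
      // expected value computed from the specification, not read off the code
      assertEquals(13.42, converter.toEuro(100.00), 0.005);
    }
  }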

DAIMI (c) Henrik Bærbak Christensen 7

When will testing be performed?

  Continuum extremes
– test every unit as it becomes available
• “high-frequency integration”, test-driven development
– delay until all units are available
• “big-bang integration”

  and variations
– defects found early are usually cheaper to fix !!!
• why ???
• Kent Beck says that this is not true !!!

– testing at the end of each increment / milestone

DAIMI (c) Henrik Bærbak Christensen 8

How much testing is adequate?

  Continuum from none to very thorough…
  but when is enough enough ???
– life-critical software; asset transaction handling
– once-used converter; research demo

  Adequacy
– defect detection cost versus the increase in quality
– standards: drug manufacturing vs. furniture manufacturing

  Coverage
– code coverage
– requirement coverage (use cases covered)

DAIMI (c) Henrik Bærbak Christensen 9

Test Plan Format

DAIMI (c) Henrik Bærbak Christensen 10

IEEE Test plan

  IEEE standard for test plans

  The template is independent of the particular testing level
– system, integration, unit

  If followed rigorously at every level, the cost may be very high...

DAIMI (c) Henrik Bærbak Christensen 11

Features to be tested

  Items to be tested
– “Module view”: the actual units to be put under test.

  Features to be tested / not to be tested
– “Use case view”:
• from the users’ perspective
• use cases

DAIMI (c) Henrik Bærbak Christensen 12

Points to note

  Features not tested
– incremental development means the large code base is relatively stable...
– additions + changes to the base

  But – what do we retest?
– everything?
– just the added + changed code?

  Exercise:
– Any ideas?
– What influences our views?

[Figure: two ‘blobs’ labelled “Increment n” and “Inc n+1”; blob size measures code size, position indicates whether the code consists of additions or changes to the base.]

DAIMI (c) Henrik Bærbak Christensen 13

Regression testing

  The simple answer is: test everything all the time
  which is what XP says at the unit level.

  However,
– some tests run slowly (stress, deployment testing)
– or are expensive to make (manual, hardware requirements)

  The question is then
– which test cases exercise the code that has changed???

  Any views?

DAIMI (c) Henrik Bærbak Christensen 14

Test case traceability

  It actually points towards a very important problem, namely traceability between tests, specification, and code units.

  Simple model (ontology)
– the problem is the multiplicity !
– tracing the dependencies !

[Class diagram: test case, code unit, and use case related by derived-from, tested-by, exercise, and implement; every relation end has multiplicity *.]
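A minimal sketch of how such a traceability model could be represented, with the many-to-many relations as plain sets; once the links are recorded, the regression question from the previous slide (“which test cases exercise changed code?”) becomes a simple lookup. All class and method names are illustrative, not from the slides.

  import java.util.HashSet;
  import java.util.Set;

  // Illustrative traceability model: test cases, code units, and use cases
  // connected by many-to-many relations (the '*' multiplicities in the diagram).
  // The implement relation between code unit and use case is omitted for brevity.
  public class Traceability {

    static class UseCase  { final String id; UseCase(String id)  { this.id = id; } }
    static class CodeUnit { final String id; CodeUnit(String id) { this.id = id; } }

    static class TestCase {
      final String id;
      final Set<UseCase>  derivedFrom = new HashSet<>(); // derived-from use case(s)
      final Set<CodeUnit> exercises   = new HashSet<>(); // exercises code unit(s)
      TestCase(String id) { this.id = id; }
    }

    // Regression test selection: re-run every test case that exercises
    // at least one changed code unit.
    static Set<TestCase> selectRegressionTests(Set<TestCase> allTests, Set<CodeUnit> changed) {
      Set<TestCase> selected = new HashSet<>();
      for (TestCase tc : allTests) {
        for (CodeUnit unit : tc.exercises) {
          if (changed.contains(unit)) { selected.add(tc); break; }
        }
      }
      return selected;
    }
  }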

DAIMI (c) Henrik Bærbak Christensen 15

Side bar

  At the CSMR 2004 conference an interesting problem was stated:
– stock trading application
– 80,000 test cases over 7.5 million lines of C++ code
– no traceability between specification, units, and tests

  So – what to do?
– Dynamic analysis
• record the time when each test case runs
• record the time when each method is run (requires instrumentation)
• compare the time stamps! (see the sketch below)
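A rough sketch of the timestamp-comparison idea: each test run records its start/end time, instrumented methods log when they execute, and a method invocation is attributed to the test whose time window contains it. The instrumentation mechanism itself and all names are assumptions; the sketch also assumes tests run one at a time.

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  // Recover test-to-method traceability by comparing recorded timestamps.
  public class DynamicTraceMatcher {

    // One record per test-case execution: its [start, end] time window.
    record TestRun(String testId, long startMillis, long endMillis) {}

    // One record per instrumented method invocation.
    record MethodRun(String methodId, long timeMillis) {}

    // A method invocation is attributed to the test whose window contains it
    // (assumes sequential test execution, i.e. non-overlapping windows).
    static Map<String, List<String>> methodsPerTest(List<TestRun> tests, List<MethodRun> methods) {
      Map<String, List<String>> result = new HashMap<>();
      for (TestRun t : tests) {
        List<String> hit = new ArrayList<>();
        for (MethodRun m : methods) {
          if (m.timeMillis() >= t.startMillis() && m.timeMillis() <= t.endMillis()) {
            hit.add(m.methodId());
          }
        }
        result.put(t.testId(), hit);
      }
      return result;
    }
  }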

DAIMI (c) Henrik Bærbak Christensen 16

Approach

  Section 5
– managerial information that defines the testing process
• degree of coverage, time and budget limitations, stop-test criteria
– and the actual test cases!

  It is a bit odd to have both the framework for the testing and the tests themselves in the same document. Usually the actual test cases are kept in a separate document.

DAIMI (c) Henrik Bærbak Christensen 17

Pass/Fail Criteria

  Pass/Fail criteria
– at unit level this is often a binary decision
• either it passes (computed = expected)
• or it fails
– higher levels require severity levels
• “save” operation versus “reconfigure button panel”
• allows conditionally passing the test (see the sketch below)
– compare with review terminology
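As a sketch of how severity levels can support a conditional pass at higher test levels: fail outright on severe incidents, pass conditionally on a few minor ones. The severity scale (0 = most severe, as on the next slide) and the thresholds are illustrative assumptions only.

  import java.util.List;

  // Illustrative system-level pass/fail decision based on open incident severities.
  public class PassFailCriteria {

    enum Verdict { PASS, CONDITIONAL_PASS, FAIL }

    // severities: one entry per incident found by the test run (0 = most severe)
    static Verdict judge(List<Integer> severities) {
      long severe = severities.stream().filter(s -> s <= 1).count(); // e.g. broken "save"
      long minor  = severities.stream().filter(s -> s >= 2).count(); // e.g. button panel layout
      if (severe > 0) return Verdict.FAIL;
      if (minor > 3)  return Verdict.FAIL;              // too many minor incidents (assumed limit)
      if (minor > 0)  return Verdict.CONDITIONAL_PASS;  // passes with known minor incidents
      return Verdict.PASS;
    }
  }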

DAIMI (c) Henrik Bærbak Christensen 18

Suspension/Resumption Criteria

  When to suspend testing
– for instance if a severity level 0 defect is encountered
• “back to the developers, no point in wasting more time”

  When to resume:
– redo all tests after a suspension? Or only those not yet run?

DAIMI (c) Henrik Bærbak Christensen 19

Contents

  Deliverables
– what is the output
• test design specifications, test procedures, test cases
• test incident reports, logs, ...

  Tasks
– the work-breakdown structure

  Environment
– software/hardware/tools

  Responsibilities
– roles

DAIMI (c) Henrik Bærbak Christensen 20

Contents

  Staff / Training Needs

  Scheduling
– PERT and Gantt charts

  Risks

DAIMI (c) Henrik Bærbak Christensen 21

Testing Costs

  Estimation, in the form of “staff hours”, is known to be a hard problem.
– historical project data is important
– still, underestimation is more the rule than the exception

  Suggestion (worked example below)
– ‘prototype’ the testing of ‘typical’ use cases/classes and measure the effort (staff hours)
– count/estimate the total number of use cases and classes

  Burnstein
– look at project + organization characteristics, use models, gain experience
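A small worked example of the ‘prototype and extrapolate’ suggestion, with invented numbers: if prototyping the tests for 3 typical use cases takes 18 staff hours, and the system is estimated to have 40 use cases, then

  effort per use case ≈ 18 h / 3 = 6 h
  total testing effort ≈ 6 h × 40 use cases = 240 staff hours

plus a margin, since underestimation is the rule rather than the exception.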

DAIMI (c) Henrik Bærbak Christensen 22

Section 5

  Section 5 contains the actual tests– The design of the tests, IDs– Test cases

• input, expected output, environment

– Procedure• how testing must be done

– especially important for manual

  Test result reports– Test log: “laboratory diary”– Incident report: Report defects

• alternatively in defect tracking tool like bugzilla

  Summary– summary and approval

DAIMI (c) Henrik Bærbak Christensen 23

Monitoring the Testing Process

DAIMI (c) Henrik Bærbak Christensen 24

Motivation

  Testing is a managed process.
– clear goals and planned increments/milestones to achieve them
– progress must be monitored to ensure the plan is kept

DAIMI (c) Henrik Bærbak Christensen 25

Terms

  Project monitoring: activities and tasks defined to periodically check project status.

  Project controlling: developing and applying corrective actions to get the project back on track.

  Usually we just use the term project management to cover both processes.

DAIMI (c) Henrik Bærbak Christensen 26

Measurements

  Measuring should of course be done for a purpose. Thus there are several issues to consider:
– Which measures to collect?
– For what purpose?
– Who will collect them?
– Which tools/forms will be used to collect data?
– Who will analyze data?
– Who will have access to reports?

DAIMI (c) Henrik Bærbak Christensen 27

Purpose

  Why collect data?
  Data is important for monitoring:
– testing status
• indirectly: quality assessment of the product
– tester productivity
– testing costs
– failures

• so we can remove defects

DAIMI (c) Henrik Bærbak Christensen 28

Metrics

  Burnstein’s suggested metrics
– Coverage
– Test case development
– Test execution
– Test harness development

– Tester productivity

– Test cost

– Failure tracking

DAIMI (c) Henrik Bærbak Christensen 29

Coverage

  White-box metrics
– statement (block), branch, flow, path, ...
– ratio
• actual coverage / planned coverage

  Black-box metrics
– # of requirements to be tested
– # of requirements covered
– ECs (equivalence classes) identified
– ECs covered
– ... and their ratios (worked example below)
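A worked example of the ratios, with invented numbers:

  white-box ratio = actual branch coverage / planned branch coverage = 0.76 / 0.85 ≈ 0.89
  black-box ratio = # requirements covered / # requirements to be tested = 34 / 40 = 0.85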

DAIMI (c) Henrik Bærbak Christensen 30

Test Case Development

  Data to collect:
– # of planned test cases
• based upon (time allocated / mean time to complete one test)? (worked example below)

– # of available test cases

– # of unplanned test cases

  So – what does the last measure mean?
– a heavy “waterfall model” smell here?
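A worked example of the planning estimate, with invented numbers: with 120 staff hours allocated to test case development and a mean of 1.5 hours to complete one test case,

  # of planned test cases ≈ 120 h / 1.5 h per test case = 80 test cases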

DAIMI (c) Henrik Bærbak Christensen 31

Test Execution

  Data collected:
– # test cases executed
– ... and passed
– # unplanned test cases executed
– ... and passed

– # of regression tests executed– ... and passed

– and their ratios

DAIMI (c) Henrik Bærbak Christensen 32

XP example

  [From Jeffries’ paper]
– Functional tests ≠ unit tests
• customer owned
• feature oriented
• not running at 100%

– Status
• not developed
• developed and

– pass

– fail

– expected output not validated

DAIMI (c) Henrik Bærbak Christensen 33

Test Harness Development

  Data collected
– LOC of harness (planned, available)

  Comments?
– Who commissions and develops the harness code?

DAIMI (c) Henrik Bærbak Christensen 34

Tester Productivity & Cost

  !

DAIMI (c) Henrik Bærbak Christensen 35

Defects

  Data is collected on detected defects in order to
– evaluate product quality
– evaluate testing effectiveness
– support the stop-test decision
– enable cause analysis
– drive process improvement

  Metrics (worked example below)
– # of incident reports; solved/unsolved; severity levels; defects/KLOC; # of failures; # of defects repaired
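For example (invented numbers): 120 defects found in a 45 KLOC code base gives

  defect density = 120 defects / 45 KLOC ≈ 2.7 defects/KLOC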

DAIMI (c) Henrik Bærbak Christensen 36

Test Completion

  At some point, testing must stop...
  The question is: when?

  Criteria
– planned tests pass
• what about the unplanned ones?
– coverage goals are met
• branch coverage per unit; use case coverage for the system
– a specific number of defects has been found
• estimates from historical data
– the defect detection rate falls below a given level
• e.g. “fewer than 5 defects of severity level > 3 per week”

DAIMI (c) Henrik Bærbak Christensen 37

Test Completion Criteria

  Criteria
– fault seeding ratios are favorable
• seed with “representative defects”
• how many does testing find?
– postulate (worked example below):
• found seeded defects / total seeded defects = found actual defects / total actual defects
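A worked example of the fault seeding estimate (invented numbers): seed 100 representative defects; if testing finds 40 of them and also finds 60 actual defects, then

  found seeded / total seeded = found actual / total actual
  40 / 100 = 60 / total actual  ⇒  total actual ≈ 60 × 100 / 40 = 150

so roughly 150 − 60 = 90 actual defects are estimated to remain.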

DAIMI (c) Henrik Bærbak Christensen 38

Summary

  Plan testing
– what, who, when, how, how much

• all are a continuum where decisions must be made

  Document testing
– IEEE outlines a document template that is probably no worse than many others...

  Monitor testing
– collect data to make sound judgements about
• progress
• stop-testing criteria

  Record incidents – defects found/repaired