A CONTROL INSTRUMENTS COMPANY

The Effectiveness of T-way Test Data Generation

or Data Driven Testing

Michael Ellims

Overview

The problem with testing

Experimental designs

Adequacy of tests

Experiments in effectiveness

Optimisation

The problems with testing

Expensive
– estimated at 50% of project cost

Hard
– no good theory on designing tests

Solution: automate the testing process?

Automated testing

Generation is half the problem
– we can generate data but...

We still need an “oracle”
– tests have to pass or fail

Options
– embedded assertions
– formal models
– usually: some human

Automated testing

We want a “simple” method
– easy to understand
– easy to use
– inputs from development, e.g. data dictionary
– data driven testing

Solution: design of experiments techniques?

Design of experiments

Full factorial experiments
– “in which every setting of every factor appears with every setting of every other factor”

Factor == variable

Setting == level == value
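
As a minimal sketch of a full factorial design: the factor names and levels below are hypothetical and chosen only for illustration. Every setting of every factor is combined with every setting of every other factor, so the number of test vectors is the product of the level counts.

```python
from itertools import product

# Hypothetical factors (variables) and their settings (levels/values).
factors = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2"],
}


def full_factorial(factors):
    """Return every combination of one setting per factor."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


design = full_factorial(factors)
print(len(design))  # 3 * 2 * 2 = 12 test vectors
```

The size grows multiplicatively with each added factor, which is the motivation for the smaller designs that follow.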

Design of experiments – Latin Square

                variable 2
                1   2   3   4
variable 1  1   A   B   C   D
            2   B   C   D   A
            3   C   D   A   B
            4   D   A   B   C

Design of experiments

We have a set of sixteen test vectors
• v1 .. v16
• read from the matrix as follows:

v1 = {1, 1, A}
v2 = {1, 2, B}
v3 = {1, 3, C}
…
v16 = {4, 4, C}
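
The reading of the matrix can be sketched as follows. This assumes a cyclic 4 x 4 Latin square, which is consistent with the vectors v1 to v3 and v16 listed above. The function names are hypothetical.

```python
LETTERS = "ABCD"


def cyclic_latin_square(n):
    """Build an n x n cyclic Latin square: cell (r, c) holds symbol (r + c) mod n."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def vectors_from_square(square, symbols):
    """Read the square row by row as (variable 1, variable 2, symbol) triples."""
    return [
        (r + 1, c + 1, symbols[cell])
        for r, row in enumerate(square)
        for c, cell in enumerate(row)
    ]


vectors = vectors_from_square(cyclic_latin_square(4), LETTERS)
print(vectors[0], vectors[1], vectors[2], vectors[15])
```

Three variables with four levels each would need 4 * 4 * 4 = 64 vectors in a full factorial design. The square gives sixteen vectors in which each level of either variable still meets each letter exactly once.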

t-way testing : example

Three variables a, b, c
• a has three “valid” values, b has two, c has two
• pairwise or 2-way adequate test set (one test vector per column):

a1 a2 a3 a2 a1 a3 a1
b2 b1 b1 b2 b2 b2 b1
c1 c2 c1 c1 c2 c2 c1
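
Whether a test set is t-way adequate can be checked mechanically. The sketch below reads each column of the table above as one test vector and lists any t-way value combination that no vector contains. The function name `uncovered` is hypothetical.

```python
from itertools import combinations, product

# Domains from the slide: a has three values, b has two, c has two.
domains = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1", "c2"]}

# The seven test vectors from the slide, one (a, b, c) tuple per column.
tests = [
    ("a1", "b2", "c1"), ("a2", "b1", "c2"), ("a3", "b1", "c1"),
    ("a2", "b2", "c1"), ("a1", "b2", "c2"), ("a3", "b2", "c2"),
    ("a1", "b1", "c1"),
]


def uncovered(tests, domains, t):
    """Return the t-way value combinations that no test vector contains."""
    names = list(domains)
    missing = []
    for idx in combinations(range(len(names)), t):
        seen = {tuple(test[i] for i in idx) for test in tests}
        for combo in product(*(domains[names[i]] for i in idx)):
            if combo not in seen:
                missing.append(combo)
    return missing


print(uncovered(tests, domains, 2))       # an empty list means 2-way adequate
print(len(uncovered(tests, domains, 3)))  # combinations still missing for t = 3
```

The seven vectors cover every pair of values, while a 3-way adequate set for these three variables would need all twelve combinations.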

Evidence

Many papers on 2-way adequate tests
– mostly vs. coverage criteria (good, not great)
– issues with coverage
• some work supports, some doesn’t

Kuhn et al. (series of papers)
– implied higher factors than 2-way needed
– t = 5 or 6

Schroeder et al.

Research Questions

• How good are t-way adequate test sets?

t = 2 to t = 5

• Can we address the oracle problem?

2283 vectors – can’t be reviewed by hand!

Problem...

• Compare against what?
– coverage: too weak
• Statement coverage
• Branch coverage
• MC/DC coverage

Adequacy - code mutation

• Error based testing
– for a limited set of errors
– conceptually simple: coding errors

• Direct measure of test set “goodness”
– can test N find error X?
– higher fidelity than statement coverage

• F1: 12 lines but 81 code mutants
• F2: 33 lines but 669 code mutants
• F3: 51 lines but 1297 code mutants

What are code mutants?

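if ((a < b) && ((x + y) > q)) ff = jj + 34;    /* original code */

if ((a > b) && ((x + y) > q)) ff = jj + 34;    /* relational operator: < replaced by > */

if ((a < b) || ((x + y) > q)) ff = jj + 34;    /* logical operator: && replaced by || */

if ((a < b) && ((x - y) > q)) ff = jj + 34;    /* arithmetic operator: + replaced by - */

if ((a < b) && ((x * y) > q)) ff = jj + 34;    /* arithmetic operator: + replaced by * */

if ((a < b) && ((x + y) > q)) ff += jj + 34;   /* assignment operator: = replaced by += */

if ((a < b) && ((x + y) > q)) ff = jj + 35;    /* constant: 34 replaced by 35 */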
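
To illustrate how such mutants can be produced mechanically, here is a minimal sketch that replaces one operator at a time in a line of source text (written with balanced parentheses). The replacement table is a small hypothetical subset. It is not the operator set used in the experiments, and real mutation tools work on parsed code rather than raw text.

```python
import re

# Hypothetical operator-replacement table (illustrative subset only).
REPLACEMENTS = {
    "<": [">"],
    "&&": ["||"],
    "+": ["-", "*"],
}


def operator_mutants(source):
    """Return mutants that each differ from source by one replaced operator."""
    # Try longer operators first so "&&" is matched as a single token.
    ops = sorted(REPLACEMENTS, key=len, reverse=True)
    pattern = "|".join(re.escape(op) for op in ops)
    mutants = []
    for match in re.finditer(pattern, source):
        for new_op in REPLACEMENTS[match.group()]:
            mutants.append(source[:match.start()] + new_op + source[match.end():])
    return mutants


original = "if ((a < b) && ((x + y) > q)) ff = jj + 34;"
for mutant in operator_mutants(original):
    print(mutant)
```

Even this tiny table yields six mutants from one line, which suggests why a 12-line function can give rise to 81 code mutants.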

Procedure

FOR each vector
    FOR each mutant
        run vector on un-mutated code    // oracle!
        run vector on mutant
        compare results
    ENDFOR
ENDFOR
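
The procedure above can be sketched in Python, with the un-mutated code acting as the oracle. The functions below are hypothetical Python stand-ins for the C fragment shown earlier, and the initial value `ff = 0` is an assumption. The score is computed as the percentage of mutants killed, a common definition that ignores equivalent mutants.

```python
def original(a, b, x, y, q, jj):
    """Un-mutated code under test (Python stand-in for the C fragment)."""
    ff = 0  # assumed initial value
    if a < b and (x + y) > q:
        ff = jj + 34
    return ff


def mutant_relational(a, b, x, y, q, jj):
    ff = 0
    if a > b and (x + y) > q:  # '<' replaced by '>'
        ff = jj + 34
    return ff


def mutant_constant(a, b, x, y, q, jj):
    ff = 0
    if a < b and (x + y) > q:
        ff = jj + 35  # 34 replaced by 35
    return ff


def kill_matrix(original, mutants, vectors):
    """Map each vector to the set of mutants whose result differs from the original."""
    kills = {}
    for vector in vectors:
        expected = original(*vector)  # un-mutated code is the oracle
        kills[vector] = {
            name for name, mutant in mutants.items() if mutant(*vector) != expected
        }
    return kills


def mutation_score(kills, mutants):
    """Percentage of mutants killed by at least one vector."""
    killed = set().union(*kills.values()) if kills else set()
    return 100.0 * len(killed) / len(mutants)


mutants = {"relational": mutant_relational, "constant": mutant_constant}
vectors = [(1, 2, 3, 4, 5, 0), (2, 1, 3, 4, 5, 0), (1, 2, 0, 0, 5, 0)]
kills = kill_matrix(original, mutants, vectors)
print(mutation_score(kills, mutants))
```

Running the original once per vector, rather than once per vector and mutant, gives the same comparison with fewer executions.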

Experiment 1 - effectiveness

How good is automated testing?
– t-way versus hand generated tests
– t-way versus random tests
– t-way versus random designs

All methods – mutation score

[Chart: mutation score (axis 0 to 120) for 2-way, 3-way, 4-way, 5-way, Rdesign, Random, Base, Hand]

Selected methods – mutation score

[Chart: mutation score (axis 0 to 120) for 5-way, Rdesign, Random, Hand]

Selected methods - raw data

[Chart: raw data (axis 0 to 800) for 5-way, Rdesign, Random, Hand]

Experiment 2 - minimisation

Can we reduce the test set to a manageable size?
– oracle problem: the oracle is a person!
– you can’t examine 1000s of test vectors

Can we get it to run faster?
– 2000 vectors over 2000 mutants
– at two seconds per test...
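
The arithmetic behind the run-time concern can be worked through as below. It assumes every vector is run once against every mutant, sequentially, and it ignores the oracle runs on the un-mutated code.

```python
vectors = 2000
mutants = 2000
seconds_per_test = 2

total_seconds = vectors * mutants * seconds_per_test
total_days = total_seconds / (60 * 60 * 24)
print(total_seconds, round(total_days, 1))  # 8000000 92.6
```

Under these assumptions a single full run takes about three months, so reducing both the vectors and the mutants still in play matters.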

Optimization

FOR next t-way adequate test set    // t = 2 .. 5
    run remaining mutants vs. all remaining vectors
    WHILE a test that kills > 1 mutant remains
        select test that kills most mutants
        mark mutants as dead
    ENDWHILE
ENDFOR

select vectors that kill remaining mutants
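
The greedy selection at the core of this procedure can be sketched as follows. This is a simplification, not the procedure as run in the experiments: it treats all vectors as one pool rather than stepping through the t = 2 .. 5 sets in turn, and it takes a precomputed kill map as input. The names `greedy_minimise` and `kills` are hypothetical.

```python
def greedy_minimise(kills):
    """Select a small set of vectors that kills every killable mutant.

    kills maps a vector id to the set of mutants that vector kills.
    """
    alive = set().union(*kills.values()) if kills else set()
    selected = []

    # Repeatedly take the vector that kills the most live mutants,
    # for as long as some vector kills more than one of them.
    while True:
        best = max(kills, key=lambda v: len(kills[v] & alive), default=None)
        if best is None or len(kills[best] & alive) <= 1:
            break
        selected.append(best)
        alive -= kills[best]  # mark mutants as dead

    # Each remaining mutant now needs its own vector.
    for mutant in sorted(alive):
        selected.append(next(v for v in kills if mutant in kills[v]))
    return selected


kills = {
    "v1": {"m1", "m2", "m3"},
    "v2": {"m2", "m3"},
    "v3": {"m4"},
    "v4": {"m1"},
}
print(greedy_minimise(kills))  # ['v1', 'v3']
```

Greedy selection of this kind does not guarantee the smallest possible set, but it is simple and shrinks the number of vectors a person has to review.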

Time Improvement

[Chart: time (axis 0 to 50000), series: max, min]

Size Improvement (x5)

[Chart: test set size (axis 0 to 3500) per function, series: hand, max, min. Functions: _dip_debounce, _aip_median_filter, _sdc_fuel_control, aip_spike_filter, _thc_decide_state, _thc_autocal, _aip_apply_filters, _gov_rpm_err, _sdc_pre_start, _gov_gen_ffd_rpm]

Conclusions

t-way adequate test sets are competitive with hand generated tests.
– 2-way adequate test sets are not
– t >= 3; t = 5 or t = 6 is best

Random testing...
– good but...
– NOT reliable
– serious implications for testing research

Issues

Is mutation adequate?
– equivalent mutants

Too few functions

Simplistic data models

Structures
– N dimensional arrays
– structures with structure
– sparse structures

Random Ideas

Mutations as a measure of complexity?
– complexity of code is hard to measure
– possibly too one-dimensional

Mutations as a measure of robustness?
– is code that has easily killed mutants “better”?