A CONTROL INSTRUMENTS COMPANY
The Effectiveness of T-way Test Data Generation
or Data Driven Testing
Michael Ellims
Overview
The problem with testing
Experimental designs
Adequacy of tests
Experiments in effectiveness
Optimisation
The problems with testing
Expensive
– Estimated at 50% of project cost
Hard
– No good theory on designing tests
Solution: automate the testing process?
Automated testing
Generation is half the problem
– we can generate data but...
We still need an “oracle”
– tests have to pass or fail
Options
– embedded assertions
– formal models
– usually, some human
Automated testing
We want a “simple” method
– easy to understand
– easy to use
– inputs from development e.g. data dictionary
– data driven testing
Solution: design of experiments techniques?
Design of experiments
Full factorial experiments
– “in which every setting of every factor appears with every setting of every other factor”
Factor == variable
Setting == level == value
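
A full factorial design can be enumerated directly as a Cartesian product. The sketch below is illustrative only: the factor names and their settings are hypothetical, not taken from the study.

```python
from itertools import product

# Hypothetical factors (variables) and their settings (levels/values).
factors = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2"],
}


def full_factorial(factors):
    """Return every combination of one setting per factor."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


vectors = full_factorial(factors)
print(len(vectors))  # 3 * 2 * 2 = 12 test vectors
```

The number of vectors is the product of the number of settings per factor, which is why full factorial designs grow quickly as factors are added.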
Design of experiments – Latin Square
                 variable 2
                 1  2  3  4
variable 1   1   A  B  C  D
             2   B  C  D  A
             3   C  D  A  B
             4   D  A  B  C
Design of experiments
We have a set of sixteen test vectors
• v1 .. v16
• read from the matrix as follows:
v1 = {1, 1, A}
v2 = {1, 2, B}
v3 = {1, 3, C} …
v16 = {4, 4, C}
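
Reading the vectors off the square can be sketched as follows. This assumes the square is the cyclic one (each row shifted left by one position), which agrees with the four vectors listed above; the function names are illustrative.

```python
def cyclic_latin_square(symbols):
    """Build an n x n Latin square in which row r is the symbols rotated by r."""
    n = len(symbols)
    return [[symbols[(r + c) % n] for c in range(n)] for r in range(n)]


def vectors_from_square(square):
    """Read one test vector (variable 1, variable 2, symbol) per cell, row by row."""
    return [
        (r + 1, c + 1, square[r][c])
        for r in range(len(square))
        for c in range(len(square[r]))
    ]


square = cyclic_latin_square("ABCD")
vectors = vectors_from_square(square)
print(len(vectors))              # 16
print(vectors[0], vectors[15])   # (1, 1, 'A') (4, 4, 'C')
```

Sixteen vectors cover every pairing of the three variables, compared with 4 * 4 * 4 = 64 for the full factorial design.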
t-way testing : example
Three variables a, b, c
• a has three “valid” values, b has two, c has two
• pairwise or 2-way adequate test set...
a1 a2 a3 a2 a1 a3 a1
b2 b1 b1 b2 b2 b2 b1
c1 c2 c1 c1 c2 c2 c1
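
Whether a test set is t-way adequate can be checked mechanically. The sketch below assumes each column of the table above is one test vector (a, b, c); the helper name `uncovered_tuples` is illustrative.

```python
from itertools import combinations, product

domains = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1", "c2"]}

# The seven vectors, read column by column from the table.
tests = [
    ("a1", "b2", "c1"), ("a2", "b1", "c2"), ("a3", "b1", "c1"),
    ("a2", "b2", "c1"), ("a1", "b2", "c2"), ("a3", "b2", "c2"),
    ("a1", "b1", "c1"),
]


def uncovered_tuples(domains, tests, t=2):
    """List every t-way value combination that no test vector covers."""
    names = list(domains)
    missing = []
    for idxs in combinations(range(len(names)), t):
        required = set(product(*(domains[names[i]] for i in idxs)))
        seen = {tuple(test[i] for i in idxs) for test in tests}
        missing.extend((idxs, combo) for combo in sorted(required - seen))
    return missing


print(uncovered_tuples(domains, tests, t=2))       # [] : pairwise adequate
print(len(uncovered_tuples(domains, tests, t=3)))  # 5 of 12 triples missing
```

The set is 2-way adequate with seven vectors, but it is not 3-way adequate, which for three variables would require all twelve combinations.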
Evidence
Many papers on 2-way adequate test sets
– mostly vs. coverage criteria (good, not great)
– issues with coverage
  • some work supports, some doesn’t
Kuhn et al. (series of papers)
– implied higher factors than 2-way needed
– t = 5 or 6
Schroeder et al.
Research Questions
• How good are t-way adequate test sets?
  t = 2 to t = 5
• Can we address the oracle problem?
  2283 vectors – can’t be reviewed by hand!
Problem...
• Compare against what?
– coverage: too weak
  • Statement coverage
  • Branch coverage
  • MC/DC coverage
Adequacy - code mutation
• Error based testing
– for a limited set of errors
– conceptually simple - coding errors
• Direct measure of test set “goodness”
– Can test N find error X?
– Higher fidelity than statement coverage
• F1 : 12 lines but 81 code mutants
• F2 : 33 lines but 669 code mutants
• F3 : 51 lines but 1297 code mutants
What are code mutants?
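if ((a < b) && ((x + y) > q)) ff = jj + 34;    // original
if ((a > b) && ((x + y) > q)) ff = jj + 34;    // mutant: < replaced by >
if ((a < b) || ((x + y) > q)) ff = jj + 34;    // mutant: && replaced by ||
if ((a < b) && ((x - y) > q)) ff = jj + 34;    // mutant: + replaced by -
if ((a < b) && ((x * y) > q)) ff = jj + 34;    // mutant: + replaced by *
if ((a < b) && ((x + y) > q)) ff += jj + 34;   // mutant: = replaced by +=
if ((a < b) && ((x + y) > q)) ff = jj + 35;    // mutant: constant 34 replaced by 35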
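
A mutant is "killed" when some test vector makes it behave differently from the original. The sketch below ports the original statement and two of the mutants above to Python; the function names, the default `ff=0`, and the two input vectors are assumptions made for illustration.

```python
def original(a, b, x, y, q, jj, ff=0):
    if (a < b) and ((x + y) > q):
        ff = jj + 34
    return ff


def mutant_relational(a, b, x, y, q, jj, ff=0):
    # Mutation: "<" replaced by ">".
    if (a > b) and ((x + y) > q):
        ff = jj + 34
    return ff


def mutant_constant(a, b, x, y, q, jj, ff=0):
    # Mutation: constant 34 replaced by 35.
    if (a < b) and ((x + y) > q):
        ff = jj + 35
    return ff


def kills(mutant, vector):
    """A vector kills a mutant if the outputs differ from the original's."""
    return original(**vector) != mutant(**vector)


branch_taken = dict(a=1, b=2, x=3, y=4, q=5, jj=10)
branch_not_taken = dict(a=2, b=2, x=0, y=0, q=5, jj=10)

print(kills(mutant_relational, branch_taken))      # True
print(kills(mutant_constant, branch_taken))        # True
print(kills(mutant_relational, branch_not_taken))  # False
print(kills(mutant_constant, branch_not_taken))    # False
```

The second vector executes the statement but kills neither mutant, which illustrates why the mutant count gives a finer measure than the line count alone.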
Procedure
FOR each vector
    FOR each mutant
        run vector on un-mutated code // oracle!
        run vector on mutant
        compare results
    ENDFOR
ENDFOR
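
The procedure above can be sketched in Python as below. It assumes the mutation score is the percentage of all mutants killed by at least one vector, with no correction for equivalent mutants; the tiny `minimum` function and its three mutants are hypothetical stand-ins for the real code under test.

```python
def mutation_score(original, mutants, vectors):
    """Percentage of mutants killed by at least one vector."""
    killed = set()
    for vector in vectors:
        expected = original(*vector)  # un-mutated code acts as the oracle
        for index, mutant in enumerate(mutants):
            if mutant(*vector) != expected:
                killed.add(index)
    return 100.0 * len(killed) / len(mutants)


def minimum(a, b):
    return a if a < b else b


mutants = [
    lambda a, b: a if a > b else b,   # "<" replaced by ">"
    lambda a, b: a if a <= b else b,  # "<" replaced by "<=" (equivalent mutant)
    lambda a, b: a if a < b else a,   # "b" replaced by "a"
]

weak_vectors = [(1, 2), (2, 2)]
better_vectors = weak_vectors + [(3, 1)]

print(round(mutation_score(minimum, mutants, weak_vectors), 1))    # 33.3
print(round(mutation_score(minimum, mutants, better_vectors), 1))  # 66.7
```

The second mutant can never be killed because it computes the same result for every input, so this score cannot reach 100 for this example.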
Experiment 1 - effectiveness
How good is automated testing?
– t-way versus hand generated tests
– t-way versus random tests
– t-way versus random designs
All methods – mutation score
[Chart: mutation score (axis 0 to 120) for 2-way, 3-way, 4-way, 5-way, Rdesign, Random, Base, Hand]
Selected methods – mutation score
[Chart: mutation score (axis 0 to 120) for 5-way, Rdesign, Random, Hand]
Selected methods - raw data
[Chart: raw data (axis 0 to 800) for 5-way, Rdesign, Random, Hand]
Experiment 2 - minimisation
Can we reduce the test set to a manageable size?
– oracle problem – the oracle is a person!
– You can’t examine 1000's of test vectors
Can we get it to run faster?
– 2000 vectors over 2000 mutants
– at two seconds per test... (2000 × 2000 × 2 s = 8,000,000 s, about 93 days)
Optimisation
FOR next t-way adequate test set // t = 2 .. 5
    run remaining mutants vs. all remaining vectors
    WHILE a test that kills > 1 mutant remains
        select test that kills most mutants
        mark mutants as dead
    ENDWHILE
ENDFOR
select vectors that kill remaining mutants
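
A minimal Python sketch of this greedy selection follows. It assumes the kill data has already been collected into a dictionary mapping each vector to the set of mutants it kills, rather than re-running the mutants on every pass; the vector and mutant identifiers are hypothetical.

```python
def minimise(test_sets, kills, mutants):
    """Greedy selection of vectors; returns (selected vectors, surviving mutants).

    test_sets: lists of vector ids, ordered t = 2 .. 5
    kills:     dict mapping vector id -> set of mutant ids it kills
    """
    alive = set(mutants)
    selected = []
    for vectors in test_sets:
        remaining = [v for v in vectors if v not in selected]
        while True:
            best = max(remaining, key=lambda v: len(kills[v] & alive), default=None)
            if best is None or len(kills[best] & alive) <= 1:
                break  # no remaining test kills more than one live mutant
            selected.append(best)
            alive -= kills[best]
            remaining.remove(best)
    # Final pass: pick vectors that kill the mutants still alive.
    for vectors in test_sets:
        for v in vectors:
            if v not in selected and kills[v] & alive:
                selected.append(v)
                alive -= kills[v]
    return selected, alive


kills = {"v1": {1, 2, 3}, "v2": {3, 4}, "v3": {4}, "v4": {5}, "v5": {1, 2}}
test_sets = [["v1", "v2", "v3"], ["v4", "v5"]]

selected, survivors = minimise(test_sets, kills, mutants={1, 2, 3, 4, 5, 6})
print(selected)   # ['v1', 'v2', 'v4']
print(survivors)  # {6}
```

Mutants left in `survivors` are killed by no vector at all; they are either equivalent mutants or a sign that the test sets are missing a case.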
Time Improvement
[Chart: time improvement (axis 0 to 50000); series: max, min]
Size Improvement (x5)
[Chart: test set size (axis 0 to 3500) per function: _dip_debounce, _aip_median_filter, _sdc_fuel_control, aip_spike_filter, _thc_decide_state, _thc_autocal, _aip_apply_filters, _gov_rpm_err, _sdc_pre_start, _gov_gen_ffd_rpm; series: hand, max, min]
Conclusions
t-way adequate test sets are competitive with hand generated tests.
– 2-way adequate test sets are not
– t >= 3; t = 5 or t = 6 is best
Random Testing...
– Good but...
– NOT reliable
– serious implications for testing research
Issues
Is mutation adequate?
– Equivalent mutants
Too few functions
Simplistic data models
Structures
– N dimensional arrays
– Structures with structure
– Sparse structures