A CONTROL INSTRUMENTS COMPANY

The Effectiveness of T-way Test Data Generation

or Data Driven Testing

Michael Ellims

Overview

The problem with testing

Experimental designs

Adequacy of tests

Experiments in effectiveness

Optimisation

The problems with testing

Expensive
– Estimated at 50% of project cost

Hard
– No good theory on designing tests

Solution: automate the testing process?

Automated testing

Generation is half the problem
– we can generate data but...

We still need an “oracle”
– tests have to pass or fail

Options
– embedded assertions
– formal models
– usually: some human

Automated testing

We want a “simple” method
– easy to understand
– easy to use
– inputs from development, e.g. data dictionary
– data driven testing

Solution: design of experiments techniques?

Design of experiments

Full factorial experiments
– “in which every setting of every factor appears with every setting of every other factor”

Factor == variable

Setting == level == value
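A full factorial design as defined above can be sketched in a few lines of Python. This is only an illustration: the factor names and settings are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical factors (variables) and their settings (values).
factors = {
    "speed": [0, 50, 100],
    "mode": ["off", "on"],
    "gear": [1, 2],
}

def full_factorial(factors):
    """Return every combination of settings, one dict per test vector."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

design = full_factorial(factors)
print(len(design))  # 3 * 2 * 2 = 12 vectors
```

The size of the design is the product of the number of settings per factor, which is why full factorial designs quickly become impractical as factors are added.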

Design of experiments – Latin Square

[Figure: Latin square, with axes labelled variable 1 and variable 2]

Design of experiments

We have a set of sixteen test vectors

• v1 .. v16

• read from the matrix as follows:

v1 = {1, 1, A}

v2 = {1, 2, B}

v3 = {1, 3, C} …

v16 = {4, 4, C}
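Reading vectors off a Latin square might look like the sketch below. The square on the slide did not survive extraction, so a cyclic 4 x 4 square is assumed here; it agrees with the vectors that are listed, including the last one, {4, 4, C}. The function name is illustrative.

```python
def latin_square_vectors(symbols):
    """Read (row, column, symbol) test vectors from a cyclic Latin square.

    Assumption: cell (row, col) holds symbols[(row + col) % n], so each
    symbol appears exactly once in every row and every column.
    """
    n = len(symbols)
    vectors = []
    for row in range(n):
        for col in range(n):
            vectors.append((row + 1, col + 1, symbols[(row + col) % n]))
    return vectors

vectors = latin_square_vectors("ABCD")
print(len(vectors))  # 16
print(vectors[:3])   # [(1, 1, 'A'), (1, 2, 'B'), (1, 3, 'C')]
print(vectors[-1])   # (4, 4, 'C')
```

Sixteen vectors cover every pairing of the three variables, where a full factorial design over three four-valued variables would need 64.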

t-way testing : example

Three variables a, b, c
• a has three “valid” values, b has two, c has two
• pairwise or 2-way adequate test set...

a1 a2 a3 a2 a1 a3 a1

b2 b1 b1 b2 b2 b2 b1

c1 c2 c1 c1 c2 c2 c1
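Whether a test set is t-way adequate can be checked mechanically. The sketch below assumes each column of the table above is one test (seven tests in all); the function name and data layout are illustrative.

```python
from itertools import combinations, product

def is_t_way_adequate(tests, domains, t):
    """True if, for every choice of t variables, every combination of
    their values appears in at least one test."""
    for chosen in combinations(list(domains), t):
        needed = set(product(*(domains[name] for name in chosen)))
        seen = {tuple(test[name] for name in chosen) for test in tests}
        if not needed <= seen:
            return False
    return True

domains = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1", "c2"]}

# The three rows of the table, one per variable.
rows = {
    "a": "a1 a2 a3 a2 a1 a3 a1".split(),
    "b": "b2 b1 b1 b2 b2 b2 b1".split(),
    "c": "c1 c2 c1 c1 c2 c2 c1".split(),
}
# Each column of the table becomes one test.
tests = [dict(zip(rows, column)) for column in zip(*rows.values())]

print(is_t_way_adequate(tests, domains, 2))  # True
print(is_t_way_adequate(tests, domains, 3))  # False
```

The seven tests cover all pairs, but 3-way adequacy for these variables would need all 3 x 2 x 2 = 12 combinations.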

Evidence

Many papers on 2-way adequate tests
– mostly vs. coverage criteria (good, not great)
– issues with coverage
• some work supports, some doesn’t

Kuhn et al. (series of papers)
– implied higher factors than 2-way needed
– t = 5 or 6

Schroeder et al.

Research Questions

• How good are t-way adequate test sets?

t = 2 to t = 5

• Can we address oracle problem?

2283 vectors – can’t be reviewed by hand!

Problem...

• Compare against what?
– coverage: too weak
• Statement coverage
• Branch coverage
• MC/DC coverage

Adequacy - code mutation

• Error based testing
– for a limited set of errors
– conceptually simple: coding errors

• Direct measure of test set “goodness”
– Can test N find error X?
– Higher fidelity than statement coverage

• F1: 12 lines but 81 code mutants
• F2: 33 lines but 669 code mutants
• F3: 51 lines but 1297 code mutants

What are code mutants?

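if ((a < b) && ((x + y) > q)) ff = jj + 34;    // original

if ((a > b) && ((x + y) > q)) ff = jj + 34;    // mutant: < replaced by >

if ((a < b) || ((x + y) > q)) ff = jj + 34;    // mutant: && replaced by ||

if ((a < b) && ((x - y) > q)) ff = jj + 34;    // mutant: + replaced by -

if ((a < b) && ((x * y) > q)) ff = jj + 34;    // mutant: + replaced by *

if ((a < b) && ((x + y) > q)) ff += jj + 34;   // mutant: = replaced by +=

if ((a < b) && ((x + y) > q)) ff = jj + 35;    // mutant: constant 34 replaced by 35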
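Single-point mutants like these can be generated mechanically. The sketch below works on the source text (with its parentheses balanced) and uses a small, assumed operator replacement table. Real mutation tools typically work on a parse tree and use larger operator sets; the constant and assignment mutants are not covered here.

```python
import re

ORIGINAL = "if ((a < b) && ((x + y) > q)) ff = jj + 34;"

# Assumed replacement table, chosen to mirror the mutants on the slide.
REPLACEMENTS = {"<": [">"], ">": ["<"], "&&": ["||"], "+": ["-", "*"]}

# Matches logical, relational and arithmetic operators (not '=').
OPERATOR = re.compile(r"&&|\|\||[<>+*-]")

def single_point_mutants(source):
    """Return one mutant per (operator occurrence, replacement) pair."""
    mutants = []
    for match in OPERATOR.finditer(source):
        for alternative in REPLACEMENTS.get(match.group(), []):
            mutants.append(
                source[:match.start()] + alternative + source[match.end():]
            )
    return mutants

mutants = single_point_mutants(ORIGINAL)
for mutant in mutants:
    print(mutant)
```

Each mutant differs from the original in exactly one operator, which is why even short functions produce many mutants.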

Procedure

FOR each vector
    FOR each mutant
        run vector on un-mutated code // oracle!
        run vector on mutant
        compare results // a difference kills the mutant
    ENDFOR
ENDFOR
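The procedure can be sketched in Python as follows. The unit under test is a Python rendering of the if-statement from the mutant examples, and the mutants, vectors and helper names are all illustrative assumptions. The un-mutated code serves as the oracle, so no expected values are written by hand.

```python
import operator

def make_unit(rel=operator.lt, both=True, arith=operator.add, const=34):
    """Build the example if-statement, or a mutant of it, as a function."""
    def unit(a, b, x, y, q, jj, ff):
        left, right = rel(a, b), arith(x, y) > q
        if (left and right) if both else (left or right):
            ff = jj + const
        return ff
    return unit

original = make_unit()
mutants = {
    "a > b": make_unit(rel=operator.gt),
    "||": make_unit(both=False),
    "x - y": make_unit(arith=operator.sub),
    "x * y": make_unit(arith=operator.mul),
    "+ 35": make_unit(const=35),
}

def killed_mutants(vectors, original, mutants):
    """Return the names of mutants whose output differs from the original."""
    killed = set()
    for vector in vectors:
        expected = original(*vector)  # un-mutated code acts as the oracle
        for name, mutant in mutants.items():
            if mutant(*vector) != expected:
                killed.add(name)
    return killed

# Hypothetical test vectors: (a, b, x, y, q, jj, ff)
vectors = [
    (1, 2, 3, 4, 5, 10, 0),
    (2, 1, 3, 4, 5, 10, 0),
    (1, 2, 3, 4, 7, 10, 0),
]
killed = killed_mutants(vectors, original, mutants)
score = len(killed) / len(mutants)
print(sorted(killed), score)
```

The mutation score is the fraction of mutants killed. Here the first vector alone kills three of the five mutants, and the other two vectors kill the rest.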

Experiment 1 - effectiveness

How good is automated testing?
– t-way versus hand generated tests
– t-way versus random tests
– t-way versus random designs

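All methods – mutation score

[Figure: mutation score by method; series: 2-way, 3-way, 4-way, 5-way, Rdesign, Random, Base, Hand]

Selected methods – mutation score

[Figure: mutation score; series: Rdesign, Random]

Selected methods – raw data

[Figure: raw data; series: Rdesign, Random]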

Experiment 2 - minimisation

Can we reduce the test set to a manageable size?
– oracle problem: the oracle is a person!
– You can’t examine 1000s of test vectors

Can we get it to run faster?
– 2000 vectors over 2000 mutants
– at two seconds per test...
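The run-time concern can be made concrete with a quick calculation. It assumes that every vector is run against every mutant, that runs are sequential, and that "two seconds per test" applies to each vector-mutant run.

```python
vectors = 2000
mutants = 2000
seconds_per_run = 2  # assumed cost of running one vector on one mutant

total_runs = vectors * mutants
total_seconds = total_runs * seconds_per_run
total_days = total_seconds / (60 * 60 * 24)

print(total_runs)            # 4000000
print(total_seconds)         # 8000000
print(round(total_days, 1))  # 92.6
```

Under these assumptions a single exhaustive pass takes roughly three months, which motivates reducing both the vectors and the mutants that still need to be run.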

Optimisation

FOR next t-way adequate test set // t = 2 .. 5
    run remaining mutants vs. all remaining vectors
    WHILE a test that kills > 1 mutant remains
        select test that kills most mutants
        mark mutants as dead
    ENDWHILE
ENDFOR
select vectors that kill remaining mutants
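A minimal sketch of this greedy selection follows. It assumes the kill information has already been collected into a dictionary mapping each vector to the set of mutants it kills, instead of re-running mutants inside the loop. The names and the example data are hypothetical.

```python
def minimise(test_sets, kills):
    """Greedily pick a small set of vectors that kills every killable mutant.

    test_sets: lists of vector ids, ordered from t = 2 upwards.
    kills: dict mapping vector id -> set of mutant ids it kills.
    """
    remaining = set().union(*kills.values())  # mutants still alive
    selected = []
    for vectors in test_sets:
        candidates = [v for v in vectors if v not in selected]
        while candidates:
            best = max(candidates, key=lambda v: len(kills[v] & remaining))
            if len(kills[best] & remaining) <= 1:
                break  # no test killing more than one mutant remains
            selected.append(best)
            remaining -= kills[best]  # mark mutants as dead
            candidates.remove(best)
    # Final pass: pick vectors that kill the mutants still alive.
    for vectors in test_sets:
        for v in vectors:
            if v not in selected and kills[v] & remaining:
                selected.append(v)
                remaining -= kills[v]
    return selected, remaining

kills = {"v1": {1, 2, 3}, "v2": {2, 3}, "v3": {4}, "v4": {5, 6}, "v5": {1}}
test_sets = [["v1", "v2", "v3"], ["v4", "v5"]]
selected, remaining = minimise(test_sets, kills)
print(selected, remaining)  # ['v1', 'v4', 'v3'] set()
```

Greedy selection does not guarantee the smallest possible set, but it is simple and keeps the number of vectors a person has to review small.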

Time Improvement

[Figure: time improvement chart; axis ticks from 10000 to 50000 in steps of 5000]

Size Improvement (x5)

[Figure: size improvement by function; functions shown: _dip_debounce, _aip_median_filter, _sdc_fuel_control, aip_spike_filter, _thc_decide_state, _thc_autocal, _aip_apply_filters, _gov_rpm_err, _sdc_pre_start, _gov_gen_ffd_rpm]

Conclusions

t-way adequate test sets are competitive with hand generated tests.

– 2-way adequate test sets are not
– use t >= 3; t = 5 or t = 6 is best

Random Testing...
– Good but...
– NOT reliable
– serious implications for testing research

Issues

Is mutation adequate?
– Equivalent mutants

Too few functions

Simplistic data models

Structures
– N dimensional arrays
– Structures with structure
– Sparse structures

Random Ideas

Mutations as a measure of complexity?
– complexity of code is hard to measure
– possibly too one-dimensional

Mutations as a measure of robustness
– is code that has easily killed mutants “better”?
