Building Trust into OO Components using a Genetic Analogy

Yves Le Traon and Benoit Baudry [email protected]

Yves Le Traon 2002

Summary

• The problem: trustable components
• Component qualification with mutation analysis
• Genetic algorithms for test enhancement
• A new model: bacteriological algorithms for test enhancement
• A mixed approach and tuning of the parameters
• Conclusion

Testing for trust

Safely

Object-oriented paradigm

Reuse

Trust: the better you test, the more trustable your component is

Specification

Implementation

V & V: checking Implementation against Specification

Trust based on Consistency

Derived as executable contracts

Testing for trust

Trusting Components?

Building Unit Tests

Specifying the Component

Design-by-contract

What is the accepted domain of use?

What guarantees the component execution?

Executable Contracts

The Design of a Trustable Component

Implementing it...

Implementation OK

Test Quality Estimate

debug

Embedding Test Suite: Self-Tests

Trust

Validating it...

No Trust

How Can You Trust a Component?

Specification

Implementation

V & V: checking Implementation against Specification (oracle)

(e.g., embedded tests)

Measure of Trust based on Consistency

Contract between the client and the component


The point of view of the user...

Components “off-the-shelf”

?



The point of view of the user...

Components “off-the-shelf”

85%

“replay” selftests

100%

55%

100%


Trusting Components? Plug-in the component in the system

[Diagram: a component (Specification, Test, Impl., selftest) plugged into a System whose other components carry their own tests and selftests]

Continuity Strategy

Test dependencies between a component and its environment

Principle & Applications

Embed the test suite inside the component
• Implements a SELF_TESTABLE interface
• Component unit test suite =
  - test data + activator
  - oracle (mostly executable assertions from the component specification)

Useful in conjunction with
• Estimating the quality of the component
• Integration testing
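The self-testable principle above can be sketched in a few lines. This is a minimal illustration, assuming Python rather than the Eiffel used in the slides; the names `SelfTestable`, `SetOfIntegers` and the `capacity` parameter are hypothetical, and contracts are approximated with plain `assert` statements.

```python
class SelfTestable:
    """Hypothetical base class: a component that embeds its own unit tests."""

    def self_test(self):
        """Template method: run every tst_* method and report the outcome."""
        results = {}
        for name in sorted(dir(self)):
            if name.startswith("tst_"):
                try:
                    getattr(self, name)()
                    results[name] = True
                except AssertionError:
                    results[name] = False
        return results


class SetOfIntegers(SelfTestable):
    """Bounded set of integers; contracts are written as assertions."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.items = []

    def empty(self):
        return len(self.items) == 0

    def full(self):
        return len(self.items) >= self.capacity

    def has(self, x):
        return x in self.items

    def put(self, x):
        assert not self.full(), "pre: not full"
        if not self.has(x):
            self.items.append(x)
        assert self.has(x) and not self.empty(), "post: has(x); not empty"

    def prune(self, x):
        assert self.has(x) and not self.empty(), "pre: has(x); not empty"
        self.items.remove(x)
        assert not self.has(x) and not self.full(), "post: not has(x); not full"

    # Embedded tests: test data + activator; the assertions act as the oracle.
    def tst_add(self):
        s = SetOfIntegers()
        assert s.empty()
        s.put(1)
        assert s.has(1) and not s.empty()

    def tst_prune(self):
        s = SetOfIntegers()
        s.put(1)
        s.prune(1)
        assert not s.has(1) and s.empty()


report = SetOfIntegers().self_test()
print(report)
```

A client that receives the component can call `self_test()` again in its own environment, which is the "replay" idea of the earlier slides.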

Example with UML

SET_OF_INTEGER

empty : Boolean
full : Boolean
has (x : Integer) : Boolean
put (x : Integer) {pre: not full} {post: has (x); not empty}
prune (x : Integer) {pre: has (x); not empty} {post: not has (x); not full}

Example with Eiffel

class SET_OF_INTEGERS -- implements a simplified set of integers.
inherit
    SELF_TESTABLE -- self_test
creation
    make
feature -- creation
    make -- allocate a set of integers
        ensure
            empty: empty;
feature {ANY} -- status report
    empty: BOOLEAN -- is the set empty?
    full: BOOLEAN -- is the set full?
feature {ANY} -- access
    has (x: INTEGER): BOOLEAN -- is x in the set?
feature {ANY} -- element changes
    put (x: INTEGER) -- put x into the set
        require
            not_full: not full;
        ensure
            has: has (x);
            not_empty: not empty;
    prune (x: INTEGER) -- remove x from the set
        require
            has: has (x);
            not_empty: not empty;
        ensure
            not_has: not has (x);
            not_full: not full;
feature {TEST_DRIVER} -- for testing purpose only
    tst_add -- test 'add'
    tst_prune -- test 'prune' and 'add'
    tst_suite -- called by template method 'self_test' to execute all tst_* methods
end -- class SET_OF_INTEGERS

Execution of class SET_OF_INTEGERS Self Test

Test sequence nb. 1
is add usable? put - empty - full
--------------
- empty at create ... Ok
- not empty after put ... Ok
.....
>>>> DIAGNOSIS on class SET_OF_INTEGERS <<<<
test(s) are OK.
Method call statistics
    has : 37
    full : 33
    prune : 3
    empty : 47
    index_of : 40
    put : 14
....

Component qualification with mutation analysis

• Mutation operators
• Case study
• The problem of automating the test enhancement process

The Triangle View of a component

Specification

Implementation

V & V: checking Implementation against Specification (oracle)

(e.g., embedded tests)

Measure of Trust based on Consistency

Contract between the client and the component

Assessing Consistency

• Assume the component passes all its tests according to its specification
• Component's implementation quality is linked to its test & specification quality
• How do we estimate test & spec consistency?
  - Introduce faults into the implementation: mutant implementations
  - Check whether the tests catch the faults: the tests kill the mutants

Limited Mutation Analysis

put (x : INTEGER) is -- put x in the set
    require
        not_full: not full
    do
1       if not has (x) then
2           count := count + 1
3           structure.put (x, count)
        end -- if
    ensure
        has: has (x)
        not_empty: not empty
    end -- put

Mutations shown on the slide: "- 1" and "Remove-inst"

Overall Process

[Flowchart: Class A and Selftest A feed the generation of mutants (mutantA1 to mutantA6), followed by test execution.
• mutantAj killed (error detected): SelfTest OK!
• mutantAj alive (error not detected): diagnosis, with three numbered outcomes:
  - equivalent mutant
  - incomplete specification: add contracts to the specification
  - "Consider Aj as": enhance Selftest
Legend: automated process / non automated process]

About Living Mutants

What if a mutant is not killed?

• Tests inadequate => add more tests
• Specification incomplete => add precision
• Equivalent mutant => remove mutant (or original!)
  e.g., x<y ? x:y <=> x<=y ? x:y
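The equivalent-mutant example above (a minimum computed with `<` or with `<=`) can be checked on a small input domain. The sketch below is in Python, with hypothetical function names; an exhaustive check on a finite range is evidence rather than a proof, although here the two versions only differ when x equals y, where both return the same value.

```python
from itertools import product


def original(x, y):
    return x if x < y else y


def mutant(x, y):
    # Relational operator replacement: '<' becomes '<='.
    return x if x <= y else y


values = range(-3, 4)
differences = [(x, y) for x, y in product(values, repeat=2)
               if original(x, y) != mutant(x, y)]
print(differences)
```

No test case can kill such a mutant, so it has to be excluded before the mutation score is computed.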

Trust estimate: score of mutation analysis

Component qualification

• d: number of killed mutants
• m: number of generated mutants which are not equivalent

$MS = \frac{d}{m}$

Quality Estimate = Mutation Score
• $Q(C_i)$ = mutation score for $C_i$ = $d_i / m_i$
  - $d_i$ = number of mutants killed
  - $m_i$ = number of mutants generated for $C_i$
• WARNING: $Q(C_i) = 100\%$ does not imply bug free
  - Depends on mutation operators (see next slide)
• Quality of a system made of components: $Q(S) = \sum_i d_i / \sum_i m_i$

[Figure: the system and its components (specification, tests, implementation, selftest), annotated with killed/generated mutant counts per component: 120/184, 65/100, 378/378, 234/245]

System test quality:

$Q(S) = \frac{120 + 65 + 378 + 234}{184 + 100 + 378 + 245} = 87.8\%$
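The system-level figure can be recomputed from the per-component counts. The short Python sketch below uses hypothetical function names and takes the four (killed, generated) pairs from the slide.

```python
def component_quality(killed, generated):
    """Q(C_i) = d_i / m_i."""
    return killed / generated


def system_quality(components):
    """Q(S) = sum of d_i divided by sum of m_i, over (d_i, m_i) pairs."""
    total_killed = sum(d for d, _ in components)
    total_mutants = sum(m for _, m in components)
    return total_killed / total_mutants


components = [(120, 184), (65, 100), (378, 378), (234, 245)]
per_component = [component_quality(d, m) for d, m in components]
q_s = system_quality(components)
print(per_component, q_s)
```

Note that Q(S) weights each component by its number of mutants, so it is not the plain average of the per-component scores.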

Component qualification: mutation operators

Type | Description
EHF | Exception Handling Fault
AOR | Arithmetic Operator Replacement
LOR | Logical Operator Replacement
ROR | Relational Operator Replacement
NOR | No Operation Replacement
VCP | Variable and Constant Perturbation
MCR | Methods Call Replacement
RFI | Referencing Fault Insertion

Mutation operators (1)

• Exception Handling Fault: causes an exception
• Arithmetic Operator Replacement: replaces e.g. '+' by '-' and vice-versa
• Logical Operator Replacement: logical operators (and, or, nand, nor, xor) are replaced by each of the other operators; the expression is replaced by TRUE and FALSE


• Relational Operator Replacement: relational operators (<, >, <=, >=, =, /=) are replaced by each one of the other operators
• No Operation Replacement: replaces each statement by the null statement
• Variable and Constant Perturbation: each arithmetic constant/variable: ++ / --; each boolean is replaced by its complement


• Referencing Fault Insertion (Alias/Copy):
  - nullify an object reference after its creation
  - suppress a clone or copy instruction
  - insert a clone instruction for each reference assignment
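As an illustration of how one operator from the list can be automated, the sketch below applies Arithmetic Operator Replacement to Python source using the standard `ast` module (Python 3.9+ for `ast.unparse`). It is not the authors' tool, which targets Eiffel; the class `SwapAddSub` and the function `aor_mutants` are hypothetical names.

```python
import ast


class SwapAddSub(ast.NodeTransformer):
    """Replace the n-th '+' or '-' binary operator by the other one (AOR)."""

    def __init__(self, target):
        self.target = target
        self.seen = 0

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, (ast.Add, ast.Sub)):
            if self.seen == self.target:
                node.op = ast.Sub() if isinstance(node.op, ast.Add) else ast.Add()
            self.seen += 1
        return node


def aor_mutants(source):
    """Return one mutated source text per '+' or '-' occurrence."""
    tree = ast.parse(source)
    sites = sum(isinstance(n, ast.BinOp) and isinstance(n.op, (ast.Add, ast.Sub))
                for n in ast.walk(tree))
    mutants = []
    for i in range(sites):
        mutated = SwapAddSub(i).visit(ast.parse(source))
        mutants.append(ast.unparse(ast.fix_missing_locations(mutated)))
    return mutants


source = "def next_count(count):\n    return count + 1\n"
mutants = aor_mutants(source)
print(mutants[0])
```

Each mutant contains exactly one change, which matches the single-fault mutants used throughout the slides.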

Outline of a Testing Process

Select either:
• Quality driven: select the wanted quality level for Q(Ci)
• Effort driven: maximum # test cases = MaxTC

Mutation analysis and test cases enhancement:
    while Q(Ci) < wanted quality level and nTC <= MaxTC
        enhance the test cases (nTC++)
        apply test cases to each mutant
        eliminate equivalent mutants
        compute new Q(Ci)
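The loop above can be sketched as follows in Python. All names (`enhance_until`, `new_test`, `kills`) are hypothetical, equivalent mutants are assumed to have been removed from `mutants` beforehand, and the bound is written `len(tests) < max_tc` so that the test set never exceeds the effort limit.

```python
def enhance_until(target_quality, max_tc, tests, mutants, new_test, kills):
    """Add test cases until the mutation score reaches target_quality
    or the number of test cases reaches max_tc."""

    def score():
        dead = sum(any(kills(t, m) for t in tests) for m in mutants)
        return dead / len(mutants)

    q = score()
    while q < target_quality and len(tests) < max_tc:
        tests.append(new_test(len(tests)))  # enhance the test cases (nTC++)
        q = score()                         # apply tests to each mutant
    return q, tests


# Toy setting: mutants are the integers 0..9, test t kills mutant m when m % 5 == t.
mutants = list(range(10))
kills = lambda t, m: m % 5 == t
new_test = lambda n: n

quality_driven = enhance_until(0.8, 10, [0], mutants, new_test, kills)
effort_driven = enhance_until(1.0, 2, [0], mutants, new_test, kills)
print(quality_driven, effort_driven)
```

The first call stops because the wanted quality is reached, the second because the effort budget is exhausted.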

A test report

119 mutants, 99 dead, 15 equivalent: MS = 99/104 = 95%

id_mut | EQ | Method | Source | Mutant | Comment
2 | 1 | empty | count = lower_bound - 1 | count <= lower_bound - 1 | never <
6 | 2 | full | count = upper_bound | count >= upper_bound | never >
16 | 3 | index_of, loop variant | count + 2 | count * 2 | same
24 | 4 | index_of, loop until | count or else structure | count or structure | short test
30 | 5 | make | count := 0 | (null) | default value
45 | 6 | make | lower_bound, upper_bound | (lower_bound - 1), upper_bound | redundancy
46 | 7 | make | lower_bound, upper_bound | lower_bound, (upper_bound + 1) | redundancy
60 | I | full | count = | count + 1 = | insufficient test
63 | II | full | upper_bound; | (upper_bound - 1); | insufficient test
72 | III | put | if not has (x) then | if true then | incomplete spec.
75 | IV | put | if not has (x) then | if not false then | incomplete spec.
98 | 8 | index_of, loop variant | - Result) | - Result + 1) | same
99 | 9 | index_of, loop variant | - Result) | - Result - 1) | same
100 | 10 | index_of, loop variant | count + 2 - | (count + 2 + 1) - | same
101 | 11 | index_of, loop variant | count + 2 - | (count + 2 - 1) - | same
102 | 12 | index_of, loop variant | count + 2 - | (count + 1) + 2 - | same
103 | 13 | index_of, loop variant | count + 2 - | (count - 1) + 2 - | same
104 | 14 | index_of, loop variant | count + 2 - | (count + 3 - | same
105 | 15 | index_of, loop variant | count + 2 - | (count + 1 - | same
110 | V | index_of, loop until | > count or | > (count + 1) or | insufficient test

EQ column: Arabic numerals (1 to 15) mark the EQUIVALENT mutants; Roman numerals (I to V) mark the NON EQUIVALENT mutants still alive.

Global Process

[Figure: global process in steps numbered 1 to 6, the automated part being marked as "Automated process". At each step the triangle contracts / test / impl. is shown, with MS = trust. Step labels:]
• initial tests generation and bugs correction (tester's work)
• automatic optimization of the initial tests set
• measure contracts efficiency
• improve contracts
• oracle function reconstruction
• equivalent mutants suppression
• remaining bugs correction

Oracle

• Oracle by difference of behaviours
  - traces
  - "program" objects
• Embedded oracles (contracts, assertions)
  - internal to the component
• Oracle associated with the test data
  - external to the component
  - oracle specific to the test datum

A short case study: building a self-testable library (date-time)

[Class diagram; features are prefixed + or - as in the diagram]

p_date_time.e: +make, +is_equal(like current), +hash_code, +set_date(x,y,z : integer), +set_time(x,y,z : integer), +set(a,b,c,x,y,z : integer), +is_valid, +is_local, +is_utc, +timezone_bias, +set_timezone_bias(integer), -local_timebias, -internal_tz, +to_iso, out, +to_iso_long, +to_rfc, -add_iso_timezone(string, boolean)

p_date.e: +is_equal(like current), +hash_code, +set_year(integer), +set_month(integer), +set_day(integer), +set(x,y,z : integer), +is_valid_date(x,y,z : integer), +is_valid, +is_leap_year, -max_day(x : integer), +to_iso, out, +to_iso_long, +to_rfc

p_time.e: +is_equal(like current), +hash_code, +set_hour(integer), +set_minute(integer), +set_second(integer), +set(x,y,z : integer), +is_am, +is_pm, +to_iso, out, +to_rfc

p_format.e: +int_to-str_min(x,y : integer)

p_date_const.e: -Rfc_january, -Rfc_february, -Rfc_march, -Rfc_april, -Rfc_may, -Rfc_june, -Rfc_july, -Rfc_august, -Rfc_september, -Rfc_october, -Rfc_november, -Rfc_december

Other classes: comparable.e, hashable.e, p_text_object.e

Legend: client relation, inheritance relation

A short case study: results

 | p_date.e | p_time.e | p_date_time.e
Total number of mutants | 673 | 275 | 199
Number of equivalent mutants | 49 | 18 | 15
Mutation score | 100% | 100% | 100%
Initial contracts efficiency | 10.35% | 17.90% | 8.7%
Improved contracts efficiency | 69.42% | 91.43% | 70.10%
First version test size | 106 | 93 | 78
Reduced tests size | 72 | 33 | 44

A short case study: robustness of the selftest against an infected environment

p_date_time selftest robustness

Infected component | p_date.e | p_time.e
Number of mutants | 350 | 161
Number of equivalent mutants | 33 | 8
Number of dead mutants | 195 | 114
Robustness score (% of mutants killed by client's tests) | 61.51% | 74.50%

Partial conclusion
• An effective method to build (some level of) 'trust':
  - estimate the quality of a component based on the consistency of its 3 aspects: specification (contract), tests, implementation
  - be a good basis for integration and non-regression testing
• A tool is currently being implemented for languages supporting DbC: Eiffel, Java (with iContract), UML (with OCL+MSC) ...

Mutation for Unit and System testing

• System testing:
  - combinatorial explosion of the # mutants
  - determination of equivalent mutants unrealistic
• A subset of operators must be chosen to deal with these constraints
• A strategy for injecting faults? Specific operators?


[UML class diagram, package Mutation: Mutator (specialised into UnitMutator and SystemMutator) is associated with Mutant through generatedMutants (1 to *); MutationOperator is refined by the interfaces UnitOperator and SystemOperator; concrete operators: LOR, NOR, ROR, AOR, EHF, VCP, MCR, RFI]


[UML class diagram: package Mutation (Mutator, Mutant, generatedMutants 1 to *) and package TestOptimizer (TestOptimizer, TestRunner, TestCase specialised into SystemTestCase and UnitTestCase, ComponentUnderTest specialised into Class and System)]

Genetic algorithms for test enhancement


Test enhancement

• Easy to reach a mutation score of 60%
• Costly to improve this score
• Test enhancement: genetic algorithms

GAs for test optimization

• The issue: improving test cases automatically
• Non-linear problem
• GAs may be adapted

Case study

[Flowchart: Class A and Autotest A feed mutant generation (mutantA1 to mutantA6), followed by test execution and diagnosis. From the 1st step onwards, step i applies the genetic algorithm to the i-th version of Autotest A (automatic improvement of selftest) and loops while New MS > Old MS (yes). When the answer is no, the non-automatic part follows:
  1. look for equivalent mutants
  2. deterministic improvement of selftest
  3. incomplete specification: add contracts to the specification
Legend: automatic process / non automatic process]

Test enhancement

[Figure: Class A with its mutants (mutantA1 to mutantA10), forming the prey population, and the test cases Test1 to Test6]

Test enhancement

The analogy
• Test cases = predators = individuals
• Mutants = population of prey

Test enhancement: Genetic algorithm

• Genes
• Individual
• Operators: reproduction, crossover, mutation
• Fitness function

Applied to test enhancement:
• Fitness function: mutation score
• Individual: test cases
• Gene (for unit class testing): {Initialization, Function calls}

Operators

Crossover:

$ind_1 = \{G^1_1, \dots, G^1_i, G^1_{i+1}, \dots, G^1_m\}$, $ind_2 = \{G^2_1, \dots, G^2_i, G^2_{i+1}, \dots, G^2_m\}$

$ind_3 = \{G^1_1, \dots, G^1_i, G^2_{i+1}, \dots, G^2_m\}$, $ind_4 = \{G^2_1, \dots, G^2_i, G^1_{i+1}, \dots, G^1_m\}$

and, at gene level, $G_1 = [I_1, S_1]$ and $G_2 = [I_2, S_2]$ give

$G_3 = [I_2, S_1]$, $G_4 = [I_1, S_2]$, $G_5 = [I_1, S_1 S_2]$, $G_6 = [I_2, S_2 S_1]$

Mutation for unit testing:

$G = [I, S]$ with $S = (m_1(p_1), \dots, m_i(p_i), \dots, m_n(p_n))$

$G_{mut} = [I, S_{mut}]$ with $S_{mut} = (m_1(p_1), \dots, m_i(p_i^{mut}), \dots, m_n(p_n))$
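The two operators can be written down directly. The Python sketch below assumes an individual is a list of genes and a gene is a pair (initialization, list of (method, parameter) calls); these representations and the function names are hypothetical choices made for the illustration.

```python
import random


def crossover(ind1, ind2, i):
    """One-point crossover: exchange the gene tails after position i."""
    return ind1[:i] + ind2[i:], ind2[:i] + ind1[i:]


def mutate_gene(gene, rng, new_param):
    """Gene = (I, S): replace the parameter of one randomly chosen call in S."""
    init, calls = gene
    k = rng.randrange(len(calls))
    method, _ = calls[k]
    mutated_calls = calls[:k] + [(method, new_param)] + calls[k + 1:]
    return (init, mutated_calls)


ind1 = ["G1_1", "G1_2", "G1_3", "G1_4"]
ind2 = ["G2_1", "G2_2", "G2_3", "G2_4"]
ind3, ind4 = crossover(ind1, ind2, 2)

gene = ("make", [("put", 1), ("has", 1), ("prune", 1)])
gene_mut = mutate_gene(gene, random.Random(0), 9)
print(ind3, ind4, gene_mut)
```

Both functions return new objects and leave their arguments unchanged, which keeps the parent individuals available for further reproduction.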


Gene modeling for system testing
• Must be adapted to the particular system under test
• In the case of the studied system (a C# parser): if there are x nodes N in the file, a gene can be represented as $G = [N_1, \dots, N_x]$
• Mutation operator for system testing: the mutation operator chooses a gene at random in an individual and replaces a node in that gene by another one:
  $G = [N_1, \dots, N_i, \dots, N_x]$, $G_{mut} = [N_1, \dots, N_i^{mut}, \dots, N_x]$


The global process of a genetic algorithm

• choose an initial population
• calculate the fitness value for each individual
• genetic loop:
  - reproduction
  - crossover
  - mutation on one or several individuals
• several stopping criteria: x number of generations, a given fitness value reached, ...
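A compact sketch of this loop is given below in Python. It is a generic genetic algorithm on a toy problem, not the authors' implementation: the fitness-proportional selection, the fixed number of generations as the only stopping criterion, and every name in the block are assumptions made for the illustration.

```python
import random


def genetic_loop(population, fitness, crossover, mutate, generations,
                 mutation_rate, rng):
    """Repeat reproduction, crossover and mutation for a fixed number of generations."""
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        # Reproduction: fitness-proportional selection (offset avoids all-zero weights).
        weights = [s + 1e-9 for s in scores]
        parents = rng.choices(population, weights=weights, k=len(population))
        # Crossover on consecutive pairs of parents.
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            children.extend(crossover(a, b, rng))
        if len(parents) % 2:
            children.append(parents[-1])
        # Mutation on some individuals.
        population = [mutate(c, rng) if rng.random() < mutation_rate else c
                      for c in children]
    return max(population, key=fitness)


# Toy problem: a test case t kills mutant number t % N_MUTANTS.
N_MUTANTS = 10


def fitness(individual):
    """Mutation score of an individual (a list of test cases)."""
    return len({t % N_MUTANTS for t in individual}) / N_MUTANTS


def crossover(a, b, rng):
    i = rng.randrange(1, len(a))
    return [a[:i] + b[i:], b[:i] + a[i:]]


def mutate(individual, rng):
    k = rng.randrange(len(individual))
    return individual[:k] + [rng.randrange(N_MUTANTS)] + individual[k + 1:]


rng = random.Random(0)
population = [[rng.randrange(N_MUTANTS) for _ in range(4)] for _ in range(12)]
best = genetic_loop(population, fitness, crossover, mutate, 20, 0.1, rng)
print(best, fitness(best))
```

Nothing in this loop remembers a good individual from one generation to the next, so the best score may go down as well as up between generations.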

Case study: unit testing

[Class diagram of the date-time library, as in the earlier case study: p_date_time.e, p_date.e, p_time.e, p_format.e, p_date_const.e, comparable.e, hashable.e, p_text_object.e; legend: client relation, inheritance relation]


 | p_date | p_date_time | p_time
# of generated mutants | 673 | 199 | 275
Mutation score (%) | 53 | 58 | 58


Results for unit testing (mutation rate: 10%)

[Plot: mutation score (%), 50 to 70, against # generations, 1 to 11, for p_date, p_time and p_date_time]

Case study: system testing

A .NET component: C# parser (32 classes)

[UML class diagram. Classes shown: Tokenizer, CSNodeBuilder, NodeVisitor (interface), ParcoursVisitor, StatsVisitor, CSPrettyPrinter, TextCSPrettyPrinter, HTMLPrettyPrinter, RTFCSPrettyPrinter, CSNode, CompileUnit, Namespace, Type, Class, Interface, Member, Field, Property, Method, Constructor, Destructor, Attribute, Attributes, Comment, Statement, Code, Bloc, If, Switch, For, ForEach, While, Do]

Mutation rate: 2%

Initial population of 12 individuals of size 4 (mutation score = 55%).

[Plot: score, 0 to 90; x axis from 0 to 200]


Mutation rate: 10%

[Plot: mutation score (%), 50 to 90, against # generations, 0 to 200]


Mutation rate: 20% !!!

[Plot: score, 0 to 100; x axis from 0 to 200]


• GAs have many parameters: difficult tuning
• Results are not stable
• The mutation plays a crucial role
  - not a "classical" GA
• Some technical reasons for these disappointing results
  - no memorization to guarantee a growth of the MS

A new model: bacteriological algorithms for test enhancement

The bacteriological approach

• An adaptive approach: test cases have to be adapted to a given "environment"
• No crossover
• A new model taken from a biological analogy: the bacteriological approach


• choose an initial set of bacteria
• bacteriological loop:
  - compute the fitness value for each bacterium
  - memorization of the best bacterium
  - reproduction
  - mutation on one or several bacteria
• several stopping criteria: x number of generations, a given fitness value reached, ...

• The number of bacteria is constant (except the memorized ones)
• Reproduction randomly chooses bacteria, each with a probability proportional to its contribution to the new mutation score
• The best bacterium is memorized (constrained by a given threshold)
• Parameters of a bacteriological algorithm:
  - max size of a bacterium
  - size of the population
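The bacteriological loop can be sketched along the same lines. The Python block below is an interpretation under stated assumptions, not the authors' code: fitness is taken as the number of still-alive mutants a bacterium kills, the threshold is expressed in that unit, exactly one bacterium is mutated per generation, and all names are hypothetical.

```python
import random


def bacteriological_loop(bacteria, kills, mutants, mutate, generations,
                         threshold, rng):
    """Return the memorized bacteria (the final test set) and the mutation score."""
    memory = []           # memorized bacteria
    alive = set(mutants)  # mutants not yet killed by the memory
    for _ in range(generations):
        if not alive:
            break
        # Fitness: contribution of each bacterium to the new mutation score.
        gains = [sum(1 for m in alive if kills(b, m)) for b in bacteria]
        best = max(range(len(bacteria)), key=lambda k: gains[k])
        # Memorization of the best bacterium, constrained by the threshold.
        if gains[best] > threshold:
            memory.append(bacteria[best])
            alive = {m for m in alive if not kills(bacteria[best], m)}
        # Reproduction proportional to contribution; population size is constant.
        weights = [g + 1e-9 for g in gains]
        bacteria = rng.choices(bacteria, weights=weights, k=len(bacteria))
        # Mutation on one bacterium; there is no crossover.
        k = rng.randrange(len(bacteria))
        bacteria[k] = mutate(bacteria[k], rng)
    return memory, 1 - len(alive) / len(mutants)


# Toy problem: a bacterium is a single test value t that kills mutant m when t == m.
mutants = list(range(10))
kills = lambda t, m: t == m
mutate = lambda b, rng: rng.randrange(10)

memory, score = bacteriological_loop([0, 0, 0, 0], kills, mutants, mutate,
                                     200, 0, random.Random(1))
print(memory, score)
```

Because memorized bacteria are never discarded, the mutation score of the memory can only grow from one generation to the next, which is the property the genetic loop lacks.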


Generic UML model

[UML class diagram. Package TestOptimizer: Bacteriologic, Memory, Bacterium, MutationStrategy (interface) implemented by UnitMutation and SystemMutation (mutationOperator), TestCase specialised into SystemTestCase and UnitTestCase. Package Mutation: Mutant, linked to Bacteriologic through AliveMutants and KilledMutants]


Results for unit testing

[Plot: mutation score, 50 to 100, against number of tests, 0 to 200, for p_time, p_date_time and p_date]


Initial mutation score = 55%.

[Plot: mutation score (%), 50 to 100, against # generations, 0 to 30]

Comparison

Performances

Algorithm | # generations | Mutation score (%) | # mutants executed
Genetic | 200 | 85 | 480000
Bacteriologic | 30 | 96 | 46375

Reduced tuning effort: fewer parameters (size of an individual, selection of individuals for reproduction)

A mixed-approach: tuning of the parameters

Tuning of models and clues for a mixed-approach

• Fix the "size" of a bacterium
• Study of the C# parser
• Size of a bacterium = # syntactic nodes


Top mutation score trend curve


Mixed-Approach
• Looking for an intermediate solution between 'pure' GAs and bacteriological algorithms
• From no memorization to a systematic memorization
• Trade-off between:
  - number of test cases = number of memorized bacteria
  - convergence speed = number of generations

Clues for a mixed-approach

Let B be a bacterium and threshold_value be the memorization threshold:
    if fitness_value(B) > threshold_value then memorize B

The category of the algorithm depends on the threshold value:
    if threshold_value = 100 then "pure" genetic
    if threshold_value = 0 then "pure" bacteriological
    if 0 < threshold_value < 100 then mixed-approach
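The threshold rule translates directly into code. The Python sketch below uses hypothetical function names and assumes, as on the slide, that fitness values and thresholds are percentages between 0 and 100.

```python
def should_memorize(fitness_value, threshold_value):
    """Memorize a bacterium only when its fitness exceeds the threshold."""
    return fitness_value > threshold_value


def category(threshold_value):
    """Name the algorithm family selected by a given memorization threshold."""
    if threshold_value == 100:
        return "pure genetic"          # nothing can exceed 100: no memorization
    if threshold_value == 0:
        return "pure bacteriological"  # any positive fitness is memorized
    if 0 < threshold_value < 100:
        return "mixed-approach"
    raise ValueError("threshold_value must lie in [0, 100]")


print([category(t) for t in (0, 40, 100)])
```

With a threshold of 100 no fitness value can pass the test, so nothing is ever memorized, which is why that setting behaves like the genetic algorithm.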

Clues for a mixed-approach: convergence speed

# of test cases in the set = # memorized bacteria


A slice for the mixed-approach

[Two plots: mutation score (%), 50 to 100, against # generations, 0 to 30; one for the mixed-approach slice, one for the bacteriological approach]

• Lowest convergence speed for the mixed-approach
• Test case set reduced from 10 to 7
• But many parameters must be tuned => bacteriological approach still better

Conclusion

• Method to build trustable components
• Automated test improvement
  - GAs not adapted
  - BAs better and easier to calibrate
  - Mixed-approach benefit not obvious