Slide 1: Ensuring the Dependability of Software Systems
CUSEC 2004
Dr. Lionel Briand, P. Eng., Canada Research Chair (Tier I)
Software Quality Engineering Lab., Carleton University, Ottawa
Slide 2: Carleton SE Programs
• Accredited B. Eng. in Software Engineering
  – Full course on verification and validation
  – Full course on software quality management
• Graduate studies
  – SQUALL lab: http://www.sce.carleton.ca/Squall/
  – Supported by a CRC chair
Slide 3: Objectives
• Overview
• Main practical issues
• Focus on testing
• Current solutions and research
• Future research agenda
Slide 4: Outline
• Background
• Issues
• Test strategies
• Testability
• Test Automation
• Conclusions and future work
Slide 6: Dependability
• Dependability: correctness, reliability, safety, robustness
• Correct but not safe or robust: the specification is inadequate
• Reliable but not correct: failures happen rarely
• Safe but not correct: annoying failures may happen
• Robust but not safe: catastrophic failures are possible
Slide 7: Improving Dependability
(Diagram: a taxonomy of fault-handling techniques)
• Fault Handling
  – Fault Avoidance: design methodology, configuration management, verification
  – Fault Detection: inspections; testing (component, integration, and system testing); debugging (correctness and performance debugging)
  – Fault Tolerance: atomic transactions, modular redundancy
Slide 8: Testing Process Overview
(Diagram: tests are derived both from a software representation and from the software code; an oracle compares the expected results, derived from the representation, with the actual results of test execution.)
Slide 9: Many Causes of Failures
• The specification may be wrong or have a missing requirement
• The specification may contain a requirement that is impossible to implement given the prescribed software and hardware
• The system design may contain a fault
• The program code may be wrong
Slide 10: Testing Phases (Pfleeger, 1998)
(Diagram: component code undergoes unit test, guided by design descriptions, yielding tested components; integration test yields integrated modules; function test, against the system functional specifications, yields a functioning system; performance test, against other software specifications, yields verified, validated software; acceptance test, against customer requirements, yields the accepted system; installation test, in the user environment, puts the system in use.)
Slide 11: Practice
• No systematic test strategies
• Very basic tools (e.g., capture and replay of test executions)
• No clear test processes with explicit objectives
• Poor testability
• But a substantial part of the development effort (between 30% and 50%) is spent on testing
• SE must become an engineering practice
Slide 12: Ariane 5 – ESA Launcher
(photo)
Slide 13: Ariane 5 – Root Cause
• Source: ARIANE 5 Flight 501 Failure, Report by the Inquiry Board
A program segment for converting a floating point number to a signed 16 bit integer was executed with an input data value outside the range representable by a signed 16 bit integer. This run time error (out of range, overflow), which arose in both the active and the backup computers at about the same time, was detected and both computers shut themselves down. This resulted in the total loss of attitude control. The Ariane 5 turned uncontrollably and aerodynamic forces broke the vehicle apart. This breakup was detected by an on-board monitor which ignited the explosive charges to destroy the vehicle in the air. Ironically, the result of this format conversion was no longer needed after lift off.
Slide 14: Ariane 5 – Lessons Learned
• Rigorous reuse procedures, including usage-based testing (based on operational profiles)
• Adequate exception handling strategies (backup, degraded procedures?)
• Clear, complete, documented specifications (e.g., preconditions, post-conditions)
• Note this was not a complex computing problem, but a deficiency of the software engineering practices in place …
Slide 15: Outline
• Background
• Issues
• Test strategies
• Testability
• Test Automation
• Conclusions and future work
Slide 16: Software Characteristics
• No matter how rigorous we are, software is going to be faulty
• No exhaustive testing possible: based on incomplete testing, we must gain confidence that the system has the desired behavior
• Small differences in operating conditions may result in dramatically different behavior: software has no continuity property.
• Dependability needs vary
Slide 17: Testing Requirements
• Effective at uncovering faults
• Help locate faults for debugging
• Repeatable, so that a precise understanding of the fault can be gained and corrections can be checked
• Automated, so as to lower the cost and timescale
• Systematic, so as to be predictable
Slide 18: Our Focus
• Test strategies: How to systematically test software?
• Testability: What can be done to ease testing?
• Test Automation: What makes test automation possible?
Slide 19: Outline
• Background
• Issues
• Test strategies & Their Empirical Assessment
• Testability
• Test Automation
• Conclusions and future research
Slide 20: Test Coverage
• A software representation (model) and its associated criteria yield test data: test cases must cover all the … in the model
• A representation of the specification: black-box testing
• A representation of the implementation: white-box testing
Slide 21: Empirical Testing Principle
• Impossible to determine consistent and complete test criteria from theory
• Exhaustive testing cannot be performed in practice
• Therefore we need test strategies that have been empirically investigated
• A significant test case is a test case with high error-detection potential: it increases our confidence in the program's correctness
• The goal is to run a sufficient number of significant test cases; that number should be as small as possible
Slide 22: Empirical Methods
• Controlled experiments (e.g., in university settings): + high control over the application of techniques; – small systems and tasks
• Case studies (e.g., on industrial projects): + realism; – practical issues, little control
• Simulations: + large numbers of test sets can be generated; + more refined analysis (statistical variation); – difficult to automate; validity?
Slide 23: Test Evaluation Based on Mutant Programs
• Take a program and test data generated for that program
• Create a number of similar programs (mutants), each differing from the original in one small way, i.e., each possessing a fault (e.g., replace an addition operator by a multiplication operator)
• The test data are then run through the mutants
• If the test data detect differences in a mutant's behavior, the mutant is said to be dead; otherwise it is live
• A mutant remains live either because it is equivalent to the original program (functionally identical though syntactically different: an equivalent mutant) or because the test set is inadequate to kill it
• Evaluation is in terms of the mutation score
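The mutation analysis loop above can be sketched in a few lines of Python. This is an illustrative toy, not a real mutation tool: the function under test, the hand-made mutants, and the test data are all invented for the example.

```python
# Toy mutation analysis: run test data against hand-made mutants of a
# function and compute the mutation score (killed mutants / all mutants).

def original(a, b):
    return a + b

# Each mutant differs from the original in one small way (a seeded fault).
mutants = [
    lambda a, b: a * b,   # addition replaced by multiplication
    lambda a, b: a - b,   # addition replaced by subtraction
    lambda a, b: b + a,   # equivalent mutant: behaviorally identical
]

test_data = [(0, 0), (1, 1), (2, 3)]

def mutation_score(mutants, tests):
    killed = 0
    for mutant in mutants:
        # A mutant is dead if some test input exposes a behavioral difference.
        if any(mutant(a, b) != original(a, b) for a, b in tests):
            killed += 1
    return killed / len(mutants)

print(mutation_score(mutants, test_data))  # → 0.666…
```

The input (1, 1) kills both faulty mutants, but the equivalent mutant can never be killed, which is why in practice the score is normally computed over non-equivalent mutants only.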
Slide 24: Simulation Process 1
(figure)
Slide 25: Simulation Process 2
(figure)
Slide 26: Cruise Control System
(Statechart, reconstructed from the diagram; the exact source/target pairing of some transitions is inferred.)
States: Inactive/Idle, Active/Running, Standby/Running, Cruising/Running
Transitions (event / actions):
• Inactive → Active: engineOn / carSimulator.engineOn(), speedControl.clearSpeed()
• Active → Inactive: engineOff / carSimulator.engineOff(), speedControl.disableControl()
• Active → Cruising: on / speedControl.recordSpeed(), speedControl.enableControl()
• Cruising → Standby: off / speedControl.disableControl()
• Cruising → Standby: accelerator / carSimulator.accelerate(), speedControl.disableControl()
• Cruising → Standby: brake / carSimulator.brake(), speedControl.disable()
• Cruising → Inactive: engineOff / carSimulator.engineOff(), speedControl.disableControl()
• Standby → Cruising: resume / speedControl.enableControl()
• Standby → Cruising: on / speedControl.recordSpeed(), speedControl.enableControl()
• Standby → Inactive: engineOff / carSimulator.engineOff(), speedControl.disableControl()
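The cruise control statechart can be encoded as a plain transition table, which is the form automated test generation typically consumes. A minimal sketch (actions omitted; the state/event pairing follows the reconstruction above and is partly inferred):

```python
# Cruise control statechart as a transition table:
# (state, event) -> next state.  Actions are omitted for brevity.
TRANSITIONS = {
    ("Inactive", "engineOn"): "Active",
    ("Active", "engineOff"): "Inactive",
    ("Active", "on"): "Cruising",
    ("Cruising", "off"): "Standby",
    ("Cruising", "accelerator"): "Standby",
    ("Cruising", "brake"): "Standby",
    ("Cruising", "engineOff"): "Inactive",
    ("Standby", "resume"): "Cruising",
    ("Standby", "on"): "Cruising",
    ("Standby", "engineOff"): "Inactive",
}

def run(events, state="Inactive"):
    """Drive the machine through a sequence of events; a (state, event)
    pair with no enabled transition leaves the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state

print(run(["engineOn", "on", "brake", "resume"]))  # → Cruising
```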
Slide 27: Transition Tree: Cover All Round-Trip Paths
(Tree diagram: paths start at Idle and follow transitions such as engineOn, on, off, and accelerator through the Running, Cruising, and Standby states; each branch stops as soon as it revisits a state already on that path.)
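The round-trip strategy itself is mechanical: grow a tree from the initial state and terminate a branch as soon as it reaches a state already on it. A minimal sketch over a transition-table encoding of a hypothetical fragment of the cruise control statechart:

```python
# Build a transition tree for round-trip path coverage: depth-first
# expansion from the initial state; a branch ends either at a state with
# no outgoing transitions or at the first revisit of a state on the path.

TRANSITIONS = {  # hypothetical fragment of the statechart
    ("Inactive", "engineOn"): "Active",
    ("Active", "engineOff"): "Inactive",
    ("Active", "on"): "Cruising",
    ("Cruising", "off"): "Standby",
    ("Standby", "resume"): "Cruising",
}

def round_trip_paths(transitions, initial):
    paths = []

    def expand(state, seen, events):
        extended = False
        for (src, event), dst in sorted(transitions.items()):
            if src != state:
                continue
            extended = True
            if dst in seen:
                paths.append(events + [event])  # round trip: stop here
            else:
                expand(dst, seen | {dst}, events + [event])
        if not extended and events:
            paths.append(events)  # dead-end state: the path is complete

    expand(initial, {initial}, [])
    return paths

for path in round_trip_paths(TRANSITIONS, "Inactive"):
    print(path)
```

Each printed event sequence is one test case of the transition-tree criterion; their cumulative length is the x-axis quantity in the simulation results that follow.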
Slide 28: Transition Tree: Simulation Results
(Two plots of mutation score (0.0 to 1.0) against cumulative test suite length (0 to 40): one for the transition tree criterion, one for the null criterion.)
Slide 29: Comparing Criteria
(Three plots comparing the criteria AT, ATP, TT, and FP: coverage ratio vs. cumulative length; mutation score vs. coverage ratio; mutation score vs. cumulative length.)
Slide 30: Outline
• Background
• Issues
• Test strategies
• Testability
• Test Automation
• Conclusions and future work
Slide 31: Testability
• Controllability: the ability to put an object in a chosen state (e.g., by a test driver) and to exercise its operations with input data
• Observability: the ability to observe the outputs produced in response to a supplied test input sequence (where outputs may denote not only the output values returned by one operation, but also any other effect on the object's environment: calls to distant features, commands sent to actuators, deadlocks …)
• These dimensions determine the cost, error-proneness, and effectiveness of testing
Slide 32: Basic Techniques
• Get/set methods in class interfaces
• Assertions checked at run time:
  – State/class invariants
  – Pre-conditions
  – Post-conditions
• Equality methods: provide the ability to report whether two objects are equal (not as simple as it seems …)
• Message sequence checking methods: detect run-time violations of the class's state specifications
• Testability depends in part on:
  – Coding standards
  – Design practice
  – Availability of code instrumentation and analysis tools
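The run-time assertion techniques above (invariants, pre- and post-conditions) can be sketched with a small illustrative class; the account example and its particular conditions are invented here, not taken from the slides.

```python
# Run-time contract checks: a precondition, a postcondition, and a class
# invariant, each raising AssertionError as soon as it is violated, which
# narrows the gap between fault and failure.

class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # Class invariant: the balance is never negative.
        assert self.balance >= 0, "invariant violated: negative balance"

    def withdraw(self, amount):
        # Precondition: the amount must be positive and covered.
        assert 0 < amount <= self.balance, "precondition violated"
        old = self.balance
        self.balance -= amount
        # Postcondition: the balance decreased by exactly `amount`.
        assert self.balance == old - amount, "postcondition violated"
        self._check_invariant()
        return self.balance

a = Account(100)
print(a.withdraw(30))  # → 70
```

A call such as `a.withdraw(1000)` fails immediately at the precondition, at the caller's fault, instead of corrupting the object's state.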
Slide 33: Early Fault Detection and Diagnosis (Baudry et al., 2001)
(Diagram: an infection point makes the global program state faulty partway through concurrent execution threads. In classical software, the failure is only produced at the output, so the diagnosis scope spans the whole execution from the infection point onward. In design-by-contract software, a contract close to the infection point raises an exception that is treated there (diagnosis and default mode), so the diagnosis scope is much narrower.)
Slide 34: Ocl2j*: An AOP-based Approach
• Stage 1, contract code generation: the ocl2j tool derives an Ocl2j aspect from the UML model
• Stage 2, program instrumentation: the AspectJ compiler weaves the aspect into the program bytecode, producing instrumented bytecode
* Developed at Carleton University, SQUALL
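The two-stage idea, generating contract-checking code and then weaving it in without touching the program source, can be mimicked in miniature with a Python function standing in for the AspectJ weaver. This is an analogy only, not how the ocl2j tool works internally, and every name here is invented:

```python
# Miniature "weaving": wrap an existing method with a generated contract
# check, without editing the method body itself (the aspect idea).

def weave_precondition(cls, method_name, check, message):
    original = getattr(cls, method_name)

    def woven(self, *args, **kwargs):
        # Advice generated from the contract, executed before the method.
        assert check(self, *args, **kwargs), message
        return original(self, *args, **kwargs)

    setattr(cls, method_name, woven)

class Stack:
    def __init__(self):
        self.items = []

    def push(self, x):
        self.items.append(x)

    def pop(self):
        return self.items.pop()

# "Stage 1": a contract such as OCL's `pre: not self.isEmpty()` becomes a check.
# "Stage 2": the check is woven into Stack.pop after the fact.
weave_precondition(Stack, "pop",
                   lambda self: len(self.items) > 0,
                   "precondition violated: pop on empty stack")

s = Stack()
s.push(42)
print(s.pop())  # → 42
```

Popping an empty stack now fails with a contract violation at the call site rather than a bare IndexError deep inside the implementation.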
Slide 35: Contract Assertions and Debugging
(Plot: diagnosability (0 to 35) for each killed mutant (0 to 80), comparing an oracle-based check with contract assertions.)
Slide 36: Outline
• Background
• Issues
• Test strategies
• Testability
• Test Automation
• Conclusions and future work
Slide 37: Objectives
• Test plans should be derived from specification and design documents
• This helps avoid errors in the test planning process and helps uncover problems in the specification and design
• With additional code analysis and suitable coding standards, test drivers can eventually be derived automatically
• There is a direct link between the quality of specifications and design and the testability of the system
• Test automation may be an additional motivation for model-driven development (e.g., UML-based)
Slide 38: Performance Stress Testing
Performance stress testing: automating, based on the system task architecture, the derivation of test cases that maximize the chances of critical deadline misses within real-time systems.
(Diagram: the system's aperiodic and periodic tasks, combined with a genetic algorithm, yield a test case: a timed sequence of events, e.g., Event 1, Event 1, Event 2.)
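The genetic algorithm needs a fitness function: given candidate arrival times for the aperiodic events, how close does any task come to missing its deadline? A drastically simplified sketch, single processor, run-to-completion scheduling, and invented task parameters:

```python
# Fitness for stress testing: schedule jobs (arrival, execution time,
# relative deadline) first-come-first-served on one processor and return
# the worst lateness (completion minus absolute deadline).  A GA would
# mutate the aperiodic arrival times to maximize this value; a positive
# result means a deadline miss has been found.

def worst_lateness(jobs):
    time = 0
    worst = float("-inf")
    for arrival, exec_time, deadline in sorted(jobs):
        time = max(time, arrival) + exec_time            # run to completion
        worst = max(worst, time - (arrival + deadline))  # lateness
    return worst

periodic = [(t, 2, 5) for t in range(0, 20, 5)]  # period 5, exec 2, deadline 5

# Two candidate test cases: aperiodic events spread out vs. bunched up.
spread  = periodic + [(1, 1, 4), (11, 1, 4)]
bunched = periodic + [(10, 1, 4), (10, 1, 4)]

print(worst_lateness(spread), worst_lateness(bunched))  # → -2 -1
```

Bunching the aperiodic events scores higher (closer to a deadline miss), which is exactly the pressure the genetic search exploits.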
Slide 39: Optimal Integration Orders
• Briand and Labiche use genetic algorithms to identify optimal integration orders (minimizing stubbing effort) in OO systems
• Most classes in OO systems have dependency cycles, sometimes many of them
• The integration order has a huge impact on integration cost: the cost of stubbing classes
• How to decide on an optimal integration order? This is a combinatorial optimization (under constraints) problem
• Solutions for the TSP cannot be reused verbatim
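The objective function is simple to state: every dependency from a class to a class integrated after it requires a stub. A sketch with an invented four-class dependency graph, using plain random search over permutations as a stand-in for the genetic algorithm:

```python
import random

# Stubbing cost of an integration order: each dependency pointing to a
# class integrated *later* requires a stub.  A GA (here approximated by
# random search over permutations) looks for orders minimizing that cost.

DEPENDENCIES = {  # hypothetical class dependency graph with a cycle A <-> B
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"D"},
    "D": set(),
}

def stub_cost(order):
    position = {cls: i for i, cls in enumerate(order)}
    return sum(1
               for cls, deps in DEPENDENCIES.items()
               for dep in deps
               if position[dep] > position[cls])

def random_search(classes, iterations=2000, seed=0):
    rng = random.Random(seed)
    best = list(classes)
    for _ in range(iterations):
        candidate = rng.sample(classes, len(classes))
        if stub_cost(candidate) < stub_cost(best):
            best = candidate
    return best

best = random_search(sorted(DEPENDENCIES))
print(best, stub_cost(best))
```

Because of the A/B cycle no order can reach cost 0; the search settles on an order needing a single stub, while the naive alphabetical order needs three.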
Slide 40: Example: Jakarta ANT
(figure: class dependency graph)
Slide 41: Results
• We obtain, most of the time, (near-)optimal orders, i.e., orders that minimize stubbing effort
• The GA can handle, with reasonable results, the most complex cases we have been able to find (e.g., 45 classes, 294 dependencies, > 400,000 dependency cycles)
• The GA approach is flexible in the sense that it is easy to tailor the objective/fitness function, add new constraints on the order, etc.
Slide 42: Further Automation
• Meta-heuristic algorithms: genetic algorithms, simulated annealing
• Generate test data based on constraints:
  – Structural testing
  – Fault-based testing
  – Testing exception conditions
• Analyze specifications (e.g., contracts):
  – Specification flaws (satisfy the precondition and violate the postcondition)
Slide 43: Conclusions
• There are many opportunities to apply optimization and search techniques to help test automation
• Devising cost-effective testing techniques requires experimental research
• Achieving high testability requires:
  – Good analysis and instrumentation tools
  – Good specification and design practices
Slide 44: Thank you
Questions? (en français or in English)