Software Testing
Fernando Brito e Abreu
DCTI / ISCTE-IUL QUASAR Research Group
Software Engineering / Fernando Brito e Abreu 2 27-Sep-11
SWEBOK: the 10 Knowledge Areas
Software Requirements
Software Design
Software Construction
Software Testing
Software Maintenance
Software Configuration Management
Software Engineering Management
Software Engineering Process
Software Engineering Tools and Methods
Software Quality
Motivation - The Bad News ...
Software bugs cost the U.S. economy an
estimated $59.5 billion annually, or about 0.6%
of the gross domestic product.
Sw users shoulder more than half of the costs
Sw developers and vendors bear the remainder
of the costs.
Source: The Economic Impacts of Inadequate Infrastructure for
Software Testing, Technical Report, National Institute of
Standards and Technology, USA, May 2002
http://www.nist.gov/director/prog-ofc/report02-3.pdf
Motivation - The GOOD News!
According to the same report:
More than 1/3 of the costs (an estimated $22.2
billion) can be eliminated with earlier and more
effective identification and removal of software
defects.
Savings can mainly occur in the development
stage, when errors are introduced.
More than half of these errors aren't detected until
later in the development process or during post-sale
software use.
Motivation
Reliability is one of the most important software
quality characteristics
Reliability has a strong financial impact:
better image of producer
reduction of maintenance costs
signing or renewal of maintenance contracts,
new developments, etc.
The quest for Reliability is the aim of V&V !
Verification and Validation (V&V)
Verification - product correctness and
consistency in a given development phase, with
respect to the products and standards used as
input to that phase - "Do the Job Right"
Validation - product conformity with specified
requirements - "Do the Right Job"
Basically, there are two complementary V&V techniques:
Reviews (Walkthroughs, Inspections, ...)
Tests
Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
Testing is …
… an activity performed for evaluating product quality, and for improving it, by identifying defects and problems.
… the dynamic verification of the behavior of a program on a finite set of test cases, suitably selected from the usually infinite executions domain, against the expected behavior.
Dynamic versus static verification
Testing always implies executing the program on
(valued) inputs; therefore it is a dynamic technique
The input value alone is not always sufficient to determine a
test, since a complex, nondeterministic system might react to
the same input with different behaviors, depending on its state
Different from testing and complementary to it are static
techniques (described in the Software Quality KA)
Terminology issues
Error
the human cause of a defect's existence (although bugs walk …)
Fault or defect (aka bug)
an incorrectness, omission or undesirable characteristic in a deliverable
the cause of a failure
Failure
an undesired effect (malfunction) observed in the system's delivered service
an incorrectness in the functioning of a system
See: IEEE Standard for SE Terminology (IEEE 610.12-1990)
Testing views
Testing for defect identification
A successful test is one which causes the system to fail
Testing can reveal failures, but it is the faults (defects) that must be removed
Testing to demonstrate (that the software meets its specifications or other desired properties)
A successful test is one where no failures are observed
Fault detection (e.g. in code) through failure exposure is often hard
Identifying all failure-causing input sets (i.e. those sets of inputs that cause a failure to appear) may not be feasible
Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
Test Levels – Objectives of testing
Testing can be aimed at verifying different properties:
Checking if functional specifications are implemented right
aka conformance testing, correctness testing, or functional testing
Checking nonfunctional properties
e.g. performance, reliability evaluation, reliability measurement, usability evaluation, etc.
Stating the objective in precise, quantitative terms allows control to be established over the test process
Often objectives are qualitative or not even stated explicitly
Test Levels – Objectives of testing
Acceptance / Qualification testing
Installation testing
Alpha and beta testing
Conformance / Functional / Correctness testing
Reliability achievement and evaluation
Regression testing
Performance testing
Stress testing
Back-to-back testing
Recovery testing
Configuration testing
Usability testing
Test Levels – Objectives of testing Acceptance / Qualification testing
Checks the system behavior against the
customer’s requirements
The customer may not exist yet, so someone has to
forecast his intended requirements
This testing activity may or may not involve the
developers of the system
Test Levels – Objectives of testing Installation testing
Installation testing can be viewed as system
testing conducted once again according to
hardware configuration requirements
Usually performed in the target environment at the
customer’s premises
Installation procedures may also be verified
e.g. is the customer local expert able to add a new
user in the developed system?
Test Levels – Objectives of testing Alpha and beta testing
Before the software is released, it is sometimes
given to a small, representative set of potential
users for trial use. Those users may be:
in-house (alpha testing)
external (beta testing)
These users report problems with the product
Alpha and beta use is often uncontrolled, and is not
always referred to in a test plan
Test Levels – Objectives of testing Conformance / Functional / Correctness testing
Conformance testing is aimed at validating
whether or not the observed behavior of the
tested software conforms to its specifications
Test Levels – Objectives of testing Reliability achievement and evaluation
Testing is a means to improve reliability
By randomly generating test cases according to
the operational profile, statistical measures of
reliability can be derived
Reliability growth models allow this process to be
modeled
Reliability growth models
Provide a prediction of reliability based on the failures observed under reliability achievement and evaluation
They assume, in general, that:
a growing number of successful tests increases our confidence in the system's reliability
the faults that caused the observed failures are fixed after being found (thus, on average, the product's reliability has an increasing trend)
Reliability growth models
Many models have been published, which are divided
into:
failure-count models
time-between-failures models
Test Levels – Objectives of testing Regression testing (1/2)
Regression testing is:
The “selective retesting of a system or component to verify
that modifications have not caused unintended effects.”
(IEEE610.12-90)
Any repetition of tests intended to show that the software’s
behavior is unchanged, except insofar as required
A technique to combat side-effects!
In practice, the idea is to show that software which
previously passed the tests still does
Test Levels – Objectives of testing Regression testing (2/2)
A trade-off must be made between:
the assurance given by regression testing every time a change is made
… and the resources required to do that
To allow regression testing we must build, incrementally, a test battery
Regression testing is more feasible if we have tools to record and play back test cases
Several commercial user-interface event-capture tools (black-box testing) exist
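Such a battery can be as simple as a list of recorded input/expected-output pairs replayed after every change; a minimal sketch (the discount function and its cases are hypothetical):

```python
def discount(price, is_member):
    """Hypothetical function under test."""
    return round(price * (0.9 if is_member else 1.0), 2)

# The battery grows incrementally; each case added in the past stays in,
# so that old behavior is re-verified after every modification.
battery = [
    ((100.0, True), 90.0),
    ((100.0, False), 100.0),
    ((19.99, True), 17.99),
]

def run_battery():
    """Replay all recorded cases; return the list of regressions found."""
    return [(args, expected, discount(*args))
            for args, expected in battery
            if discount(*args) != expected]
```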
Test Levels – Objectives of testing Performance testing / Stress testing
Aimed at verifying that the software meets the
specified performance requirements:
e.g. volume testing and response time
The performance degradation under increasingly
demanding scenarios should be plotted
If we exercise software at the maximum design
load (or beyond it), we call it stress testing
Test Levels – Objectives of testing Back-to-back testing
A single test set is performed on two
implemented versions of a software product
The results are compared
Whenever a mismatch occurs, at least one of the two
versions is probably exhibiting a failure
Test Levels – Objectives of testing Recovery testing
Aimed at verifying software restart capabilities
after a “disaster”
Recovery testing is a fundamental step in
building a contingency plan
Test Levels – Objectives of testing Configuration testing
When software is built to serve different users,
configuration testing analyzes the software under
the various specified configurations
The problem is similar when the hardware or software
platform varies somehow (e.g. different mobile phone
versions, different browsers)
This is one of the main issues in software
product lines development
See: http://www.sei.cmu.edu/plp/framework.html
Test Levels – Objectives of testing Usability testing
This process evaluates how easy it is for end-users
to use and learn the software, including:
user documentation
initial installation and extension through add-ons
effective support for user tasks
…
Test Levels – The target of the test
Unit testing
the target is a single module
Integration testing
the target is a group of modules (related by purpose, use, behavior, or structure)
System testing
the target is a whole system
Test Levels – The target of the test Unit testing
Verifies the functioning in isolation of software pieces
which are separately testable
Depending on the context, they can be individual subprograms
or a larger component made of tightly related units
Typically, unit testing occurs with:
access to the code being tested
support of debugging tools
the programmers who wrote the code
Test Levels – The target of the test Integration testing
Integration testing is the process of verifying the interaction between software components
Classical integration testing strategies
top-down or bottom-up, used with hierarchically structured software
Modern systematic integration strategies
architecture-driven, which implies integrating the software components or subsystems based on identified functional threads
Except for small, simple software, systematic, incremental integration strategies are usually preferred to putting all the components together at once
The latter is called "big bang" testing
Test Levels – The target of the test System testing
The majority of functional failures should already have
been identified during unit and integration testing
Main concerns:
Assessing if the system complies with the non-functional requirements, such as security, speed, accuracy, and reliability
Assessing if the external interfaces to other applications, utilities, hardware devices, or the operating environment work well
Test Levels Identifying the test set
Test adequacy criteria
Is the test set sufficient?
How much testing is enough?
How many test cases should be selected?
Test selection criteria
How is the test set composed?
Which test cases should be selected?
Test case selection
Proposed test techniques differ essentially in
how they select the test set, which may yield
vastly different degrees of effectiveness
In practice, risk analysis techniques and test
engineering expertise are applied to identify the
most suitable selection criterion under given
conditions
How large should a test battery be?
Even in simple programs, so many test cases are
theoretically possible that exhaustive testing
could require months or years to execute
In practice the whole test set can generally be
considered infinite
Testing always implies a trade-off:
limited resources and schedules on the one hand
inherently unlimited test requirements on the other
After testing …
Even after successful completion of extensive
testing, the software could still contain faults
The remedy for sw failures found after delivery is
provided by corrective maintenance actions
This will be covered in the Software Maintenance KA
Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
Test Techniques
Based on tester's intuition and experience
Specification-based
Code-based
Usage-based
Fault-based
Based on nature of application
Selecting and combining techniques
Functional Tests (Black-Box): actors
A relevant aspect of black-box testing is that it is not
compulsory to use
programming experts to
produce a test battery
Extensive invalid input
characterization heavily
relies on tester experience
Case study: Tool to capture GUI events
(functional test cases)
Functional Test Tools - Visual Test
Grouping of test cases
Test cases
Test battery (test suite)
Reusable test code
Functional Test Tools
Integration with other Rational tools
Reported failures
Test cases to execute in this suite
Assessing Functional Test Coverage
The ReModeler tool from the
QUASAR team takes an
innovative model-based
approach to represent this
kind of testing coverage
The color represents the
percentage of the scenarios of
each use case that were
executed by a given test suite
Test Techniques Based on tester's intuition and experience
Ad hoc testing
Perhaps the most widely practiced technique remains
ad hoc testing
Tests are derived relying on the software engineer’s
skill, intuition, and experience with similar programs
Ad hoc testing might be useful for identifying special
tests, those not easily captured by formalized
techniques
Test Techniques Based on tester's intuition and experience Exploratory testing
Simultaneous learning, test design and execution
The tests are not defined in advance in an established test
plan, but are dynamically designed, executed, and modified
The effectiveness of this approach relies on the tester's
knowledge, which can be derived from many sources:
observed product behavior during previous version testing
familiarity with the application, platform, failure process
type of possible faults and failures
the risk associated with a particular product
…
Test Techniques Specification-based
Equivalence partitioning
Boundary-value analysis
Decision table
Finite-state machine-based
Testing from formal specifications
Random testing
Test Techniques – Specification-based Equivalence partitioning
The input domain is subdivided into a collection
of subsets, or equivalence classes, which are
deemed equivalent according to a specified
relation, and a representative set of tests
(sometimes only one) is taken from each class.
Test Techniques – Specification-based Boundary-value analysis
Test cases are chosen on and near the boundaries of
the input domain of variables, with the underlying
rationale that many faults tend to concentrate near the
extreme values of inputs
An extension of this technique is robustness testing,
wherein test cases are also chosen outside the input
domain of variables, to test program robustness to
unexpected or erroneous inputs
Case study: Equivalence partitioning and
boundary-value analysis
Triangle Classifier
Classic problem proposed in [Myers79] and
[Hetzel84]:
Distinct classification criteria:
side lengths - equilateral, isosceles or scalene
largest angle - acute, right or obtuse
Triangle Classifier: specification
Input:
dimensions of the three sides: three numbers,
separated by commas (or two angles instead).
Algorithm:
If the length of one side is greater than the sum
of the other two, then write "Not a triangle!"
If it is a valid triangle, then write its classification:
according to the largest angle - obtuse, right or
acute
according to the side lengths - scalene, isosceles or
equilateral
Triangle Classifier: exercise
Write a test case battery for the triangle
classifier
For each test case, specify:
input values (including invalid or unexpected
conditions)
corresponding expected output values
Example: 3,4,5 -> scalene, right
Triangle Classifier equivalence partitioning
For a complete test battery, we need to:
divide the solution space into partitions
identify typical cases for each partition
identify frontier cases
identify extreme cases
identify invalid cases
Now it is your turn to work …
Don't turn the page until you have finished!
Triangle Classifier partitions and typical cases
By side lengths:
          SCALENE    ISOSCELES   EQUILATERAL
OBTUSE    10, 6, 5   12, 7, 7    Impossible
RIGHT     5, 4, 3    √18, 3, 3   Impossible
ACUTE     6, 5, 4    7, 7, 4     6, 6, 6
By angles (third angle in parentheses):
          SCALENE          ISOSCELES        EQUILATERAL
OBTUSE    120º, 40º (20º)  120º, 30º (30º)  Impossible
RIGHT     90º, 40º (50º)   90º, 45º (45º)   Impossible
ACUTE     30º, 70º (80º)   30º, 75º (75º)   60º, 60º (60º)
Triangle Classifier boundary values
4.001, 4, 3.999 almost equilateral (scalene acute)
4.0001, 4, 4 almost equilateral (isosceles acute)
3, 4.9999, 5 almost isosceles (scalene acute)
9, 4.9999, 5 almost isosceles (scalene obtuse)
5, 4.0001, 3 almost right (scalene acute)
5.0001, 4, 3 almost right (scalene obtuse)
1, 1, 1.4141 almost right (isosceles acute)
1, 1, 1.4143 almost right (isosceles obtuse)
Triangle Classifier extreme cases
1, 2, 3 line segment!
0, 0, 0 point!
Note: extreme cases are not invalid!
Triangle Classifier: invalid cases
6, 4, 0 null side!
12, 4, 3 not a triangle!
5, 3, 2, 5 four sides!
2, 5 one side missing!
3.45 only one side!
No value!
3, , 4, 6 incorrect format
4A, 3, 7 invalid value
6, -1, 4 negative value
Triangle Classifier invalid cases
As we saw, apparently simple problems often
have subtleties that make testing more
complex than expected!
Frontier (boundary) values and invalid inputs are
the situations most likely to produce failures
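The whole case study fits in a short program; the sketch below implements the classifier specification and can be run against the battery above (treating zero or negative sides as invalid is a simplification of the slides' "point" case):

```python
def classify_triangle(a, b, c):
    """Classify a triangle by its three side lengths.

    Returns a (sides, angle) pair for valid triangles, or a string
    describing why the input is not an ordinary triangle.
    """
    if any(s <= 0 for s in (a, b, c)):
        return "invalid: non-positive side"
    x, y, z = sorted((a, b, c))          # z is the largest side
    if z > x + y:
        return "not a triangle"
    if z == x + y:
        return "degenerate (line segment)"
    sides = ("equilateral" if x == z
             else "isosceles" if x == y or y == z
             else "scalene")
    # Compare the square of the largest side with the sum of the other
    # two squares to classify the largest angle
    if z * z > x * x + y * y:
        angle = "obtuse"
    elif z * z < x * x + y * y:
        angle = "acute"
    else:
        angle = "right"
    return (sides, angle)
```

For example, classify_triangle(3, 4, 5) yields ("scalene", "right"), while classify_triangle(12, 4, 3) reports that the input is not a triangle.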
Test Techniques – Specification-based Decision table
Decision tables represent logical relationships between
conditions (roughly, inputs) and actions (roughly,
outputs)
Test cases are systematically derived by considering
every possible combination of conditions and actions
A related technique is cause-effect graphing
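A minimal sketch of the idea, using a hypothetical login policy: every combination of conditions is enumerated and paired with the action the specification prescribes, yielding one test case per table column:

```python
from itertools import product

def expected_action(valid_user, valid_password, account_locked):
    """Action prescribed by a hypothetical login decision table."""
    if not valid_user:
        return "reject"
    if account_locked:
        return "show lock message"
    return "grant access" if valid_password else "reject"

# Systematic derivation: one test case per combination of conditions
test_cases = [(conditions, expected_action(*conditions))
              for conditions in product([True, False], repeat=3)]
# Three boolean conditions give 2^3 = 8 test cases
```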
Test Techniques – Specification-based Finite-state machine-based
By modeling a program as a finite state machine,
tests can be selected in order to cover states and
transitions on it
Test Techniques – Specification-based Testing from formal specifications
Giving the specifications in a formal language
allows for automatic derivation of functional
test cases
At the same time, it provides a reference output, an
oracle, for checking test results
This is an active research topic
Test Techniques – Specification-based Random testing
Tests are generated in a stochastic (non-deterministic)
way
This form of testing falls under the heading of the
specification-based entry, since at least the input
domain must be known, to be able to pick random
points within it
Test Techniques – Specification-based Random testing
We simulate the data input by generating sequences
of values that may occur in practice
This process must be repeated over and over because, in the
long run, we can generate all possible input combinations
This approach is only feasible with a tool, a test case
generator - its input is some sort of description of the
possible input values, their sequence, and their
probability of occurrence
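A minimal sketch of this idea: inputs are drawn at random from a known input domain and each output is checked against an oracle (here a simple property; the function under test is hypothetical):

```python
import random

def absolute(x):
    """Hypothetical function under test."""
    return x if x >= 0 else -x

def random_test(trials=1000, seed=42):
    """Draw random inputs from the input domain and check an oracle."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-1e6, 1e6)       # the input domain must be known
        y = absolute(x)
        # Oracle: the result is non-negative and has the input's magnitude
        assert y >= 0 and y in (x, -x), f"failure-causing input: {x}"
    return trials
```

Repeating the process with different seeds explores more of the input space, as the slide notes.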
Test Techniques – Specification-based Random testing
Random tests are often used to test compilers,
through the generation of random programs
The description of possible input sequences can be made
with BNF (Backus Naur Form)
Random testing can also be used in testing
communications protocol software
The description of possible input sequences can be made
out of the state machines that describe each of the involved
parties
Test Techniques Code-based (aka white box)
Control-flow-based criteria
Data-flow-based criteria
Test Techniques – Code-based Control-flow-based criteria
Several testing tools allow the generation of
Control Flow Graphs from source code.
By instrumenting the source code, these tools make it
possible to verify graphically the execution of each
edge and node in the graph
Test Techniques – Code-based Control-flow-based criteria
The strongest control-flow-based criterion is path testing, which aims at executing all entry-to-exit control flow paths in the flowgraph
Full path testing is generally not feasible because of loops
Test Techniques – Code-based Control-flow-based criteria
Control-flow-based coverage criteria are aimed at covering all the statements or blocks of statements in a program
Several coverage criteria have been proposed, like condition/decision coverage
A test battery's coverage is the percentage of the total code (e.g. statements or branches/decisions) exercised by that battery
Code coverage is a much less stringent criterion than path coverage
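Statement coverage can be measured by instrumenting the program; the sketch below uses Python's tracing hook as a toy instrumenter (real tools such as coverage.py are far more complete):

```python
import sys

def executed_lines(func, *args):
    """Run func and record which of its source lines were executed
    (a toy statement-coverage tracer)."""
    lines = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return lines

def sign(x):          # illustrative function under test
    if x > 0:
        return "positive"
    return "non-positive"

# One test case exercises only one branch; a second input is needed
# before the battery's statement coverage reaches 100%
covered = executed_lines(sign, 5) | executed_lines(sign, -5)
```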
Case study: Graph-based control flow
testing techniques
Control flow graphs
Are a graphical representation of programs that
captures the ways they can be traversed
during execution
nodes represent decisions
directed edges represent sets of sequential
instructions
In more complex code segments, the graph looks like
spaghetti - more tests are needed
Example: tax calculation
Consider an IRS tax system that reads annual
income revenues and determines the
corresponding tax due:
If the total income is less than 25K euros, no tax is
deducted
If it is above that, but less than 100K euros, the tax
is 7%
otherwise it is 15%
Example: tax calculation
Function Calculates_Tax (Int n)
Array of Int income;
Int total, tax;
1. total, tax = 0;
2. for i = 1 to n
3. { read(income[i]);
4.   total = total + income[i] };
5. if total >= 25000 then
6.   tax = total * 0.07
7. else if total >= 100000 then
8.   tax = total * 0.15;
9. return(tax)
(Figure: complete and reduced control flow graphs of the function)
Note: the problem solution is wrong,
because the condition for the 100K
EURO limit should be tested first.
This defect would be caught by
structural testing.
Example: how many test cases?
Based on graph theory, Tom McCabe proposed
the cyclomatic complexity metric that
expresses the minimum number of test cases for
100% test coverage: v(G) = # edges - # nodes + # inputs and outputs
In the current case we obtain:
11 - 9 + 2 = 4 (complete graph)
6 - 4 + 2 = 4 (reduced graph)
Therefore we should be able to produce 4 test cases
that when applied would lead to a 100% coverage.
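The computation can be checked mechanically; the edge list below is our reading of the flowgraph for the numbered statements (an assumption, since the figure itself is not reproduced here):

```python
# v(G) = E - N + 2 for a connected flowgraph with one entry and one exit
edges = [
    (1, 2),                  # initialization to loop header
    (2, 3), (3, 4), (4, 2),  # loop body: read and accumulate
    (2, 5),                  # loop exit
    (5, 6), (5, 7),          # if total >= 25000 then ... else ...
    (7, 8), (7, 9),          # else-if total >= 100000
    (6, 9), (8, 9),          # both branches join at the return
]
nodes = {n for edge in edges for n in edge}

v = len(edges) - len(nodes) + 2   # 11 - 9 + 2
```

This reproduces the slide's result of 4 independent paths, hence 4 test cases.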
Call graphs
Are a graphical representation of the
dependencies of functions, procedures or
methods on each other
nodes (boxes) represent functions, methods, etc.
directed edges represent the invocations made
This kind of white-box testing is often used for
profiling execution snapshots
Call graph based testing
Colors are often used to
represent coverage
percentages
Assessing structural test coverage
The ReModeler tool from
the QUASAR team uses
a model-based approach
to represent this kind of
testing coverage
Each class or package is
colored according to the
percentage of executed
methods
Test Techniques – Code-based Data-flow-based criteria
In data-flow-based testing, the control flowgraph is
annotated with information about how the program
variables are defined, used, and killed (undefined)
The strongest criterion, all definition-use paths, requires
that, for each variable, every control flow path segment
from a definition of that variable to a use of that
definition is executed
In order to reduce the number of paths required, weaker
strategies such as all-definitions and all-uses are
employed
Test Techniques Fault-based
With different degrees of formalization, fault-based
testing techniques devise test cases
specifically aimed at revealing categories of likely
or predefined faults
Two main techniques exist:
Error guessing
Mutation testing
Test Techniques – Fault-based Error guessing
In error guessing, test cases are specifically
designed by software engineers trying to figure
out the most plausible faults in a given program
A good source of information is the history of
faults discovered in earlier projects, as well as
the software engineer’s expertise
Test Techniques – Fault-based Mutation testing
A mutant is a slightly modified version of the program under test, differing from it by a small, syntactic change
Every test case exercises both the original and all generated mutants: if a test case is successful in identifying the difference between the program and a mutant, the latter is said to be “killed”
Originally conceived as a technique to evaluate a test set, mutation testing is also a testing criterion in itself: either tests are randomly generated until enough mutants have been killed, or tests are specifically designed to kill surviving mutants In the latter case, mutation testing can also be categorized as a code-based technique
The underlying assumption of mutation testing, the coupling effect, is that by looking for simple syntactic faults, more complex but real faults will be found
For the technique to be effective, a large number of mutants must be automatically derived in a systematic way.
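A hand-made sketch of the idea (real tools derive mutants automatically and in volume): the mutant below differs from the original by one relational operator, and a test set kills it only if some case distinguishes the two:

```python
def original(x):
    return "big" if x >= 10 else "small"

def mutant(x):
    # Single syntactic change: >= mutated into >
    return "big" if x > 10 else "small"

def kills(test_inputs):
    """True if some test case tells the original and the mutant apart."""
    return any(original(x) != mutant(x) for x in test_inputs)

weak_battery = [0, 5, 100]     # misses the boundary: the mutant survives
strong_battery = [0, 10, 100]  # x == 10 kills the mutant
```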
Test Techniques Usage-based
Operational profile
Software Reliability Engineered Testing
Test Techniques – Usage-based Operational profile
In testing for reliability evaluation, the test
environment must reproduce the operational environment of the software as closely as possible
The idea is to infer, from the observed test results, the future reliability of the software when in actual use
To do this, inputs are assigned a probability distribution, or profile, according to their occurrence in actual operation
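A minimal sketch: test inputs are drawn according to a (hypothetical) operational profile, so that observed failure frequencies estimate operational reliability:

```python
import random

# Hypothetical operational profile: relative frequency of each operation
# as observed (or forecast) in actual use
profile = {"query": 0.70, "update": 0.25, "delete": 0.05}

def draw_operations(n, seed=1):
    """Draw n operations with probabilities given by the profile."""
    rng = random.Random(seed)
    ops, weights = zip(*profile.items())
    return rng.choices(ops, weights=weights, k=n)

sample = draw_operations(10000)
# The sample frequencies approximate the profile, so failures observed
# while replaying it reflect the reliability a real user would see
```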
Test Techniques – Usage-based Software Reliability Engineered Testing
Software Reliability Engineered Testing (SRET)
is a testing method encompassing the whole
development process, whereby testing is
“designed and guided by reliability objectives and
expected relative usage and criticality of different
functions in the field.”
Test Techniques Based on nature of application
Object-oriented testing
Component-based testing
Web-based testing
GUI testing
Testing of concurrent programs
Protocol conformance testing
Testing of real-time systems
Testing of safety-critical systems
Test Techniques Selecting and combining techniques
Specification-based and code-based test
techniques are often contrasted as functional vs.
structural testing
These two approaches to test selection are not to
be seen as alternatives but rather as complementary
in fact, they use different sources of information and
have proved to highlight different kinds of problems
they should be used in combination, depending on
budgetary considerations
Automatic Construction of Test Cases
Test generation is possible from:
model-based specifications
algebraic (formal) specifications
Segmentation (“slicing”) and ramification
(“branch analysis”) techniques are used to
identify partitions
Automatic Construction of Test Cases – TTCN (Tree and Tabular Combined Notation)
1983: ISO TC 97/SC 16, later ISO/IEC JTC 1/SC 21 and CCITT SG VII, as part of the work on OSI conformance testing methodology and framework
Has been widely used since then for describing protocol conformance test suites in standardization organizations such as ITU-T, ISO/IEC, ATM Forum, ETSI, and in industry
1998: TTCN-2, in ISO/IEC and in ITU-T
New features: concurrency mechanism, concepts of module and package, manipulation of ASN.1 encoding
TTCN-3
Automatic Construction of Test Cases – TTCN (Tree and Tabular Combined Notation)
TTCN is a standardized test case format
The main characteristics of TTCN are that:
its tabular notation allows its user to describe easily and naturally, in a tree form, all possible scenarios of stimulus and the various reactions to it between the tester and the target
its verdict system is designed to facilitate judging whether the test result agrees with the test purpose, and
it provides a mechanism to describe appropriate constraints on received messages, so that conformance of the received messages can be automatically evaluated against the test purpose
TTCN-3 example
The following is an example of an Abstract Test Suite (ATS) where we are trying to test a weather service
The tester sends a request consisting of a location, a date and a kind of report to some on-line weather service, and receives a response with confirmation of the location and date, along with the temperature, the wind velocity and the weather conditions at this location
TTCN-3 example
A TTCN-3 ATS is always composed of four sections:
1. type definitions: data structures like in C, but also an easy-to-use concept of lists and sets
2. template definitions: a TTCN-3 template consists of two separate concepts merged into one: test data definition and test data matching rules
3. test case definitions: specify the sequences, and alternatives of sequences, of messages sent to and received from the System Under Test (SUT)
4. test control definitions: define the order of execution of the various test cases
Sample TTCN-3 Abstract Test Suite
module SimpleWeather {
type record weatherRequest {
charstring location,
charstring date,
charstring kind
}
template weatherRequest
ParisWeekendWeatherRequest := {
location := "Paris",
date := "15/06/2006",
kind := "actual"
}
type record weatherResponse {
charstring location,
charstring date,
charstring kind,
integer temperature,
integer windVelocity,
charstring conditions
}
template weatherResponse ParisResponse := {
location := "Paris",
date := "15/06/2006",
kind := "actual",
temperature := (15..30),
windVelocity := (0..20),
conditions := "sunny"
}
Sample TTCN-3 Abstract Test Suite
type port weatherPort message {
in weatherResponse;
out weatherRequest;
}
type component MTCType {
port weatherPort weatherOffice;
}
testcase testWeather() runs on MTCType {
weatherOffice.send(ParisWeekendWeatherRequest);
alt {
[] weatherOffice.receive(ParisResponse) {
setverdict(pass)
}
[] weatherOffice.receive {
setverdict(fail)
}
}
}
control {
execute (testWeather())
}
}
Automatic Construction of Test Cases
Implies the resolution of several problems:
program decomposition (slicing)
classification of the partitions found
selection of test paths
test case generation to exercise those paths
validation of the generated cases
The last problem is solved by the construction of an oracle (software) whose function is to determine whether, for a given test, the program responds according to its specification.
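A minimal sketch of such an oracle, assuming an executable specification is available to check against. The program under test and the property checked here are invented for the example:

```python
# Hypothetical program under test: supposedly computes the
# integer square root of a non-negative integer n.
def program_under_test(n):
    return int(n ** 0.5)

# Executable specification: r is the integer square root of n
# iff r >= 0 and r*r <= n < (r+1)*(r+1).
def spec_holds(n, r):
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)

def oracle(test_inputs):
    """For each generated test input, decide pass/fail against the spec."""
    verdicts = {}
    for n in test_inputs:
        verdicts[n] = spec_holds(n, program_under_test(n))
    return verdicts

verdicts = oracle(range(100))
print(all(verdicts.values()))
```

When a full executable specification exists, the oracle problem reduces to evaluating it on each (input, output) pair; in practice, oracles are often only partial (checking properties rather than exact outputs).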
Automatic Construction of Test Cases – An example
Who? » Siemens + Swiss PTT
What? » SAMSTAG (Sdl And Msc baSed Test cAse Generation)
How to model system & tests?
Target system (SDL): SDL – Specification and Description Language [ITU Z.100]
Test scenarios (MSC): MSC – Message Sequence Chart [ITU Z.120]
TTCN (Tree and Tabular Combined Notation) [ISO/IEC JTC1/SC21]
Automatic Construction of Test Cases – Some tools
Validator (Aonix)
SoftTest (?)
ObjectGEODE TestComposer (Verilog)
TestFactory (Rational)
Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
Test-related Measures – Evaluation of the program under test
Program measurements to aid in planning and designing testing
To guide testing we may use measures based on:
program size, e.g. SLOC or function points
program structure, e.g. McCabe’s metrics or the frequency with which modules call each other
Test-related Measures – Evaluation of the program under test
Fault types, classification, and statistics
Testing literature is rich in classifications and taxonomies of faults
To make testing more effective, it is important to know:
which types of faults could be found in the software under test
the relative frequency with which these faults have occurred in the past
This information can be very useful in making quality predictions, as well as for process improvement
Test-related Measures – Evaluation of the program under test
Fault density
A program under test can be assessed by counting and
classifying the discovered faults by their types
For each fault class, fault density is measured as the
ratio between the number of faults found and the size of
the program
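Fault density as defined above is mechanical to compute. The fault classes and the program size below are made-up numbers:

```python
# Hypothetical counts of discovered faults, classified by type,
# for a program of a given size (in KSLOC).
faults_by_class = {"logic": 18, "interface": 7, "data handling": 5}
size_ksloc = 12.5

def fault_density(faults, size):
    """Faults found per unit of program size, for each fault class."""
    return {cls: count / size for cls, count in faults.items()}

densities = fault_density(faults_by_class, size_ksloc)
for cls, d in sorted(densities.items()):
    print(f"{cls}: {d:.2f} faults/KSLOC")
```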
Test-related Measures – Evaluation of the tests performed
Coverage/thoroughness measures
Several test adequacy criteria require that the test cases systematically exercise a set of elements identified in the program or in the specifications
To evaluate the thoroughness of the executed tests, testers can monitor the elements covered, so that they can dynamically measure the ratio between covered elements and their total number
For example, it is possible to measure the percentage of covered branches in the program flowgraph, or that of the functional requirements exercised among those listed in the specifications document
Code-based adequacy criteria require appropriate instrumentation of the program under test
Example: static and dynamic metrics used to guide white-box testing
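The covered/total ratio can be illustrated with a toy function instrumented by hand (real tools instrument automatically; the function and branch labels here are invented):

```python
covered = set()

def classify(x):
    """Toy function under test, instrumented with branch markers."""
    if x < 0:
        covered.add("neg")      # branch 1
        return "negative"
    if x == 0:
        covered.add("zero")     # branch 2
        return "zero"
    covered.add("pos")          # branch 3
    return "positive"

ALL_BRANCHES = {"neg", "zero", "pos"}

def branch_coverage():
    """Ratio of covered branches to all branches."""
    return len(covered & ALL_BRANCHES) / len(ALL_BRANCHES)

# Two test cases exercise only two of the three branches:
classify(5)
classify(-3)
print(f"branch coverage: {branch_coverage():.0%}")
```

The uncovered "zero" branch points directly at the missing test case, which is exactly how coverage measures guide test design.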
Static Metrics Collection
Some examples collected by white-box tools:
– Number of private, protected and public attributes
– Overloading, overriding and visibility of operations
– Comments density (e.g. JavaDoc comments per class)
– Inheritance metrics (e.g. depth, width, inherited features)
– MOOSE metrics (Chidamber and Kemerer)
– MOOD metrics (Brito e Abreu)
– QMOOD metrics (Jagdish Bansiya)
Static Metrics - e.g. Cantata++
Dynamic Metrics Collection
Class, operation, branch, and exception clause coverage
Example: Multiple Condition Coverage
Measures whether each combination of condition outcomes for a decision has been exercised; are f() and g() called in the following code extract?
if ((a == b || f()) && (c == d || g())) x(); else y();
Note that the expression can evaluate to true without calling f() or g().
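The short-circuit behaviour noted above can be demonstrated directly; Python's `and`/`or` short-circuit just like C's `&&`/`||`, and the call log below exists only for observation:

```python
calls = []

def f():
    calls.append("f")
    return True

def g():
    calls.append("g")
    return True

a = b = 1   # a == b is True, so f() is never evaluated
c = d = 2   # c == d is True, so g() is never evaluated

if (a == b or f()) and (c == d or g()):
    result = "x"
else:
    result = "y"

# The decision evaluates to true, yet neither f nor g was called:
# multiple condition coverage would flag these combinations as untested.
print(result, calls)
```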
Test-related Measures – Evaluation of the tests performed
Fault seeding
Some faults are artificially introduced into the program before testing
When the tests are executed, some of these seeded faults will be revealed, and possibly some faults which were already there will be as well
depending on which of the artificial faults are discovered, and how many, testing effectiveness can be evaluated, and the remaining number of genuine faults can be estimated
Problems:
distribution and representativeness of seeded faults relative to original ones
small sample size on which any extrapolations are based
inserting faults into software involves the obvious risk of leaving them there
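The estimation step rests on a capture–recapture argument: if tests find seeded and genuine faults with equal ease, then the fraction of seeded faults found estimates the fraction of genuine faults found. A sketch with invented figures:

```python
def estimate_genuine_faults(seeded_total, seeded_found, genuine_found):
    """Capture-recapture style estimate of genuine faults, assuming
    seeded and genuine faults are equally likely to be detected."""
    if seeded_found == 0:
        raise ValueError("no seeded faults found: cannot extrapolate")
    detection_rate = seeded_found / seeded_total
    total = genuine_found / detection_rate
    remaining = total - genuine_found
    return total, remaining

# Hypothetical campaign: 50 faults seeded, 40 of them found,
# along with 120 genuine faults.
total, remaining = estimate_genuine_faults(50, 40, 120)
print(total, remaining)  # 150.0 genuine faults estimated, 30.0 still latent
```

The "problems" listed above translate directly into threats to this estimate: unrepresentative seeds bias detection_rate, and a small seeded_total makes it a noisy ratio.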
Test-related Measures – Evaluation of the tests performed
Mutation score
In mutation testing, the ratio of killed mutants to the total number of generated mutants can be a measure of the effectiveness of the executed test set
Summary
Software Testing Fundamentals
Test Levels
Test Techniques
Test-related Measures
Test Process
Test Process – Practical Considerations
Attitudes / Egoless programming
A very important component of successful testing is a collaborative attitude towards testing and quality assurance activities
Managers have a key role in fostering a generally favorable reception towards failure discovery during development and maintenance for instance, by preventing a mindset of code ownership
among programmers, so that they will not feel responsible for failures revealed by their code
Test Process – Practical Considerations
Test guides
The testing phases can be guided by various aims, for example:
risk-based testing, which uses the product risks to prioritize and focus the test strategy
scenario-based testing, in which test cases are defined based on specified software scenarios
Test Process – Practical Considerations
Test documentation and work products
Documentation is an integral part of the formalization of the test process
Test documents may include:
Test Plan
Test Design Specification
Test Procedure Specification
Test Case Specification
Test Log
Test Incident or Problem Report
Test Process – Practical Considerations
Internal vs. independent test team
External members may bring an unbiased, independent perspective
The decision on an internal, external, or blended team should be based upon considerations of:
cost
schedule
maturity levels of the involved organizations
criticality of the application
Test Process – Practical Considerations
Cost/effort estimation and other process measures
Several measures related to the resources spent
on testing, as well as to the relative fault-finding
effectiveness of the various test phases, are
used by managers to control and improve the
test process, such as:
number of test cases specified
number of test cases executed
number of test cases passed
number of test cases failed
Test Process – Practical Considerations
Cost/effort estimation and other process measures
Evaluation of test phase reports can be combined with root cause analysis to evaluate test process effectiveness in finding faults as early as possible
Such an evaluation could be associated with the analysis of risks
Moreover, the resources that are worth spending on testing should be commensurate with the use/criticality of the application:
different techniques have different costs and yield different levels of confidence in product reliability
Test Process – Practical Considerations
Termination
A decision must be made as to how much testing is enough and when a test stage can be terminated
Thoroughness measures, such as … achieved code coverage
functional completeness
estimates of fault density or of operational reliability
… provide useful support, but are not sufficient in themselves
Test Process – Practical Considerations
Termination
The decision also involves considerations about the
costs and risks incurred by the potential for remaining
failures, as opposed to the costs implied by continuing
to test
There are two possible approaches to this problem
Termination based on test efficiency
Termination based on test effectiveness
Test efficiency-based termination
To decide on test termination, or to compare distinct V&V procedures and tools, we need to know their efficiency
walkthroughs, inspections, black-box, white-box?
Efficiency = work produced / resources spent
» Test efficiency = defects found / effort spent = benefit / cost
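With per-period effort and defect counts, the benefit/cost ratio and its decline over time are straightforward to compute. The weekly figures below are invented:

```python
# Hypothetical weekly testing data: effort spent (person-hours)
# and defects found in each week.
effort = [800, 1200, 2000, 2200, 2100, 1900]
defects = [120, 250, 280, 150, 60, 20]

def weekly_efficiency(defects, effort):
    """Benefit/cost ratio per week: defects found per unit of effort."""
    return [d / e for d, e in zip(defects, effort)]

eff = weekly_efficiency(defects, effort)
# Efficiency typically decreases as testing proceeds: more and more
# effort is needed to find each new defect, which can serve as a
# stopping signal once it falls below a planned threshold.
print([round(x, 3) for x in eff])
```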
Test efficiency-based termination
As testing proceeds …
defect density decreases
test efficiency decreases - more and more effort is spent (cost) to find new defects (benefit)
reliability grows - the probability that users experience defect effects (failures) reduces
Case Study – Testing Effort
[charts: testing effort spent per week, and cumulative effort (cost), over weeks 1–16]
Case Study – Defects Found
[charts: defects found per week, and cumulative defects (benefit), over weeks 1–16]
Case Study – Test Efficiency
[charts: benefit/cost ratio (test efficiency) and cost/benefit ratio, over weeks 1–16]
These ratios can be used to set test stopping thresholds
Test effectiveness-based termination
"Testing can only show the presence of bugs but never their absence"
Dijkstra
Is this statement correct?
Test effectiveness-based termination
Test effectiveness allows deciding when tests should be stopped
the test plan should indicate that level (e.g. 90%)
Effectiveness = achieved effect / desired effect
» Test effectiveness = percentage of total defects found
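Given weekly defect counts and an estimate of the total number of defects, cumulative effectiveness and the stopping check against the planned level can be sketched as follows (all numbers invented):

```python
from itertools import accumulate

# Hypothetical weekly defects found, and the estimated total
# number of defects in the product.
weekly_defects = [120, 250, 280, 150, 60, 40, 20, 10]
estimated_total = 1000

def cumulative_effectiveness(weekly, total):
    """Cumulative fraction of total defects found after each week."""
    return [found / total for found in accumulate(weekly)]

def stop_week(effectiveness, threshold=0.9):
    """First week at which the test plan's effectiveness level is reached."""
    for week, e in enumerate(effectiveness, start=1):
        if e >= threshold:
            return week
    return None  # threshold not yet reached: keep testing

eff = cumulative_effectiveness(weekly_defects, estimated_total)
print([f"{e:.0%}" for e in eff])
print(stop_week(eff))  # week at which the 90% level is first reached
```

The hard part in practice is estimated_total, which is exactly what the defect injection technique later in this section tries to supply.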
Test Effectiveness – Case Study
[charts: weekly % of defects found (weekly test effectiveness) and cumulative % of defects found (cumulative test effectiveness), over weeks 1–16]
Conclusion: it is not worth testing beyond a certain point; that point can be based on a given effectiveness threshold
Test Effectiveness
To calculate it we need to know:
the total number of defects, or the number of remaining defects (total = found + remaining)
Remaining defects can be known a posteriori
simply wait for user-reported failures (not a good choice ...)
Even then, we have to set an observation period
observation period = f(system complexity, transaction rate)
some defects may only cause failures after intensive use
Defect Injection Technique
This technique allows estimating remaining defects and therefore obtaining test effectiveness
1. A member of the development team (not necessarily the producer) deliberately inserts some defects into the target system, neither clustered together nor in a contrived way.
2. That person documents the location of the injected defects and delivers that information to the project leader.
3. The target system is passed on to the testing team.
4. Test process effectiveness is assessed through the percentage of injected defects that were found.
5. Remaining (non-injected) defects are then estimated
Defect Injection (continued)
Before the beginning of the test we have:
DOi Original defects (unknown!)
DIi Injected defects (known)
At any moment after the beginning of the test we have:
DOe Original defects found
DIe Injected defects found
DOr = DOi - DOe Original defects remaining (not found)
DIr = DIi - DIe Injected defects remaining (not found)
Defect Injection (continued)
Let:
ERO = DOe / DOi Effectiveness in Original Defects Removal (unknown!)
ERI = DIe / DIi Effectiveness in Injected Defects Removal (known!)
Considering ERO ≈ ERI, which will be close to the truth if the number of injected defects is sufficiently large:
DOi = DOe / ERO ≈ DOe / ERI
DOr = DOi ( 1 - ERO ) = DOe ( 1 / ERO - 1 ) ≈ DOe ( 1 / ERI - 1 )
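The estimate above is mechanical to implement once DIi, DIe and DOe are known; the campaign figures below are invented:

```python
def defect_injection_estimate(di_total, di_found, do_found):
    """Estimate the original defect population, assuming ERO ≈ ERI
    (removal effectiveness is the same for injected and original defects)."""
    eri = di_found / di_total                 # ERI = DIe / DIi (known)
    do_total = do_found / eri                 # DOi ≈ DOe / ERI
    do_remaining = do_found * (1 / eri - 1)   # DOr ≈ DOe (1/ERI - 1)
    return eri, do_total, do_remaining

# Hypothetical campaign: 25 defects injected, 20 of them found,
# alongside 160 original defects found so far.
eri, do_total, do_remaining = defect_injection_estimate(25, 20, 160)
print(eri, do_total, do_remaining)  # 0.8, 200.0 original defects, 40.0 remaining
```

Note that this inherits the same caveats as fault seeding: the injected defects must be representative, and DIi must be large enough for ERI to be a stable ratio.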
Test Process – Test activities
Defect tracking
Detected defects can be analyzed to determine:
when they were introduced into the software
what kind of error caused them to be created
e.g. poorly defined requirements, incorrect variable declaration, memory leak, programming syntax error, …
when they could have been first observed in the software
Test Process – Test activities
Defect tracking
Defect-tracking information is used to determine what aspects of software engineering need improvement and how effective previous analyses and testing have been
This causal analysis allows the introduction of prevention actions
Prevention is better than cure, and is a typical characteristic of higher levels of maturity in the software development process
Defect prevention in CMMI
Bibliography
[Bec02] K. Beck, Test-Driven Development by Example, Addison-Wesley, 2002.
[Bei90] B. Beizer, Software Testing Techniques, International Thomson Press, 1990, Chap. 1-3, 5, 7s4, 10s3, 11, 13.
[Jor02] P. C. Jorgensen, Software Testing: A Craftsman's Approach, second edition, CRC Press, 2004, Chap. 2, 5-10, 12-15, 17, 20.
[Kan99] C. Kaner, J. Falk, and H.Q. Nguyen, Testing Computer Software, 2nd ed., John Wiley & Sons, 1999, Chaps. 1, 2, 5-8, 11-13, 15.
[Kan01] C. Kaner, J. Bach, and B. Pettichord, Lessons Learned in Software Testing, Wiley Computer Publishing, 2001.
[Lyu96] M.R. Lyu, Handbook of Software Reliability Engineering, Mc-Graw-Hill/IEEE, 1996, Chap. 2s2.2, 5-7.
[Per95] W. Perry, Effective Methods for Software Testing, John Wiley & Sons, 1995, Chap. 1-4, 9, 10-12, 17, 19-21.
[Pfl01] S. L. Pfleeger, Software Engineering: Theory and Practice, 2nd ed., Prentice Hall, 2001, Chap. 8, 9.
[Zhu97] H. Zhu, P.A.V. Hall and J.H.R. May, “Software Unit Test Coverage and Adequacy,” ACM Computing Surveys, vol. 29, iss. 4 (Sections 1, 2.2, 3.2, 3.3), Dec. 1997, pp. 366-427.
Applicable standards
(IEEE610.12-90) IEEE Std 610.12-1990 (R2002), IEEE Standard Glossary of Software Engineering Terminology, IEEE, 1990.
(IEEE829-98) IEEE Std 829-1998, Standard for Software Test Documentation, IEEE, 1998.
(IEEE982.1-88) IEEE Std 982.1-1988, IEEE Standard Dictionary of Measures to Produce Reliable Software, IEEE, 1988.
(IEEE1008-87) IEEE Std 1008-1987 (R2003), IEEE Standard for Software Unit Testing, IEEE, 1987.
(IEEE1044-93) IEEE Std 1044-1993 (R2002), IEEE Standard for the Classification of Software Anomalies, IEEE, 1993.
(IEEE1228-94) IEEE Std 1228-1994, Standard for Software Safety Plans, IEEE, 1994.
(IEEE12207.0-96) IEEE/EIA 12207.0-1996 // ISO/IEC12207:1995, Industry Implementation of Int. Std. ISO/IEC 12207:95, Standard for Information Technology-Software Life Cycle Processes, IEEE, 1996.
Black-Box Tools - Web Links
JavaStar (http://www.sun.com/workshop/testingtools/javastar.html)
JavaLoad (http://www.sun.com/workshop/testingtools/javaload.html)
VisualTest, Scenario Recorder, Test Suite Manager (http://www.rational.com/)
SoftTest (http://www.softtest.com/pages/prod_st.htm)
AutoTester (http://www.autotester.com/)
WinRunner (http://www.merc-int.com/products/winrunguide.html)
LoadRunner (http://www.merc-int.com/products/loadrunguide.html)
QuickTest (http://www.mercury.com)
TestComplete (http://www.automatedqa.com)
S-Unit test framework (http://sunit.sourceforge.net)
eValid™ Automated Web Testing Suite (http://www.soft.com/eValid/)
White-Box Tools - Web Links
JavaScope (http://www.sun.com/workshop/testingtools/javascope.html)
JavaSpec ( http://www.sun.com/workshop/testingtools/javaspec.html )
Cantata++ ( http://www.iplbath.com/ )
PureCoverage, Quantify, Purify ( http://www.rational.com/ )
LDRA ( http://www.luna.co.uk/~elverex/ldratb.htm )
McCabe Test (http://www.mccabe.com/?file=./prod/test/data.html )
ATTOL Coverage ( http://www.attol-testware.com/coverage.htm )
Quality Works ( http://www.segue.com )
Panorama (http://www.softwareautomation.com/)