Software Testing
Rob Oshana
Southern Methodist University
Why do we Test ?
• Assess Reliability
• Detect Code Faults
Industry facts
Software testing accounts for 50% of pre-release costs, and 70% of post-release costs [Cigital Corporation]
30-40% of errors detected after deployment are run-time errors [U.C. Berkeley, IBM's TJ Watson Lab]
The amount of software in a typical device doubles every 18 months [Reme Bourguignon, VP of Philips Holland]
Defect densities have been stable over the last 20 years: 0.5 - 2.0 sw failures / 1000 lines [Cigital Corporation]
Critical SW Applications
Critical software applications which have failed:
• Mariner 1 (NASA, 1962): missing '-' in Fortran code; rocket bound for Venus destroyed
• Therac 25 (Atomic Energy of Canada Ltd, 1985-87): data conversion error; radiation therapy machine for cancer
• Long Distance Service (AT&T, 1990): a single line of bad code; service outages up to nine hours long
• Patriot Missiles (U.S. military, 1991): endurance errors in tracking system; 28 US soldiers killed in barracks
• Tax Calculation Program (Intuit, 1995): incorrect results; SW vendor paid tax penalties for users
Good and successful testing
• What is a good test case?
• A good test case has a high probability of finding an as-yet undiscovered error
• What is a successful test case?
• A successful test is one that uncovers an as-yet undiscovered error
Who tests the software better?
• The developer understands the system, but will test "gently" and is driven by "delivery"
• The independent tester must learn about the system, but will attempt to break it, and is driven by quality
Testability – can you develop a program for testability?
• Operability - “The better it works, the more efficiently it can be tested”
• Observability - the results are easy to see, distinct output is generated for each input, incorrect output is easily identified
• Controllability - processing can be controlled, tests can be automated & reproduced
• Decomposability - software modules can be tested independently
• Simplicity - no complex architecture and logic
• Stability - few changes are requested during testing
• Understandability - program is easy to understand
Did You Know...
• Testing/Debugging can worsen reliability?
• We often chase the wrong bugs?
• Testing cannot show the absence of faults, only the existence?
• The cost to develop software is directly proportional to the cost of testing?
– Y2K testing cost $600 billion
Did you also know...
• The most commonly applied software testing techniques (black box and white box) were developed back in the 1960’s
• Most Oracles are human (error prone)!!
• 70% of safety-critical code can be exception handling
– this is the last code written!
Testing Problems
• Time
• Faults hide from tests
• Test Management costs
• Training Personnel
• What techniques to use
• Books and education
“Errors are more common, more pervasive, and more troublesome in software than with other technologies”
David Parnas
What is testing?
• How does testing software compare with testing students?
What is testing?
• "Software testing is the process of comparing the invisible to the ambiguous so as to avoid the unthinkable." James Bach, Borland Corp.
What is testing?
• "Software testing is the process of predicting the behavior of a product and comparing that prediction to the actual results." R. Vanderwall
Purpose of testing
• Build confidence in the product
• Judge the quality of the product
• Find bugs
Finding bugs can be difficult
Mine field
[Figure: a field of scattered mines; a path through the mine field represents a use case]
Why is testing important?
• Therac25: Cost 6 lives
• Ariane 5 Rocket: Cost $500M
• Denver Airport: Cost $360M
• Mars missions, orbital explorer & polar lander: Cost $300M
Why is testing so hard?
Reasons for customer reported bugs
• User executed untested code
• Order in which statements were executed in actual use differed from that during testing
• User applied a combination of untested input values
• User’s operating environment was never tested
Interfaces to your software
• Human interfaces
• Software interfaces (APIs)
• File system interfaces
• Communication interfaces
– Physical devices (device drivers)
– Controllers
Selecting test scenarios
• Execution path criteria (control)
– Statement coverage
– Branch coverage
• Data flow
– Initialize each data structure
– Use each data structure
• Operational profile
• Statistical sampling…
What is a bug?
• Error: mistake made in translation or interpretation ( many taxonomies exist to describe errors)
• Fault: manifestation of the error in implementation (very nebulous)
• Failure: observable deviation in behavior of the system
Example
• Requirement: “print the speed, defined as distance divided by time”
• Code: s = d/t; print s
Example
• Error: I forgot to account for t = 0
• Fault: omission of code to catch t=0
• Failure: exception is thrown
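The error/fault/failure chain can be sketched in Python (`print_speed` is a hypothetical name; the fault is the omitted guard for t = 0):

```python
def print_speed(distance, time):
    # Requirement: print the speed, defined as distance divided by time.
    # Fault: the code to catch time == 0 was omitted
    # (the error was forgetting that case).
    s = distance / time
    print(s)

print_speed(100, 4)  # normal case: prints 25.0

# Failure: the fault becomes observable as a thrown exception
try:
    print_speed(100, 0)
except ZeroDivisionError:
    print("failure observed: division by zero for t = 0")
```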
Severity taxonomy
• Mild - trivial
• Annoying - minor
• Serious - major
• Catastrophic - Critical
• Infectious - run for the hills
What is your taxonomy ?
IEEE 1044-1993
Life cycle
[Figure: requirements, design, code, and testing stages; errors can be introduced at each of these stages]
[Figure: errors can also be introduced while classifying, isolating, and resolving defects]
The testing and repair process can be just as error prone as the development process (perhaps more so?)
OK, so let's just design our systems with "testability" in mind…
Testability
• How easily a computer program can be tested (Bach)
• We can relate this to “design for testability” techniques applied in hardware systems
JTAG
[Figure: a standard integrated circuit with a test access port controller, boundary scan cells, and a boundary scan path around the core IC logic and I/O pads; signals include Test Mode Select (TMS), Test Clock (TCK), Test Data In (TDI), and Test Data Out (TDO)]
Operability
• "The better it works, the more efficiently it can be tested"
– System has few bugs (bugs add analysis and reporting overhead)
– No bugs block execution of tests
– Product evolves in functional stages (simultaneous development and testing)
Observability
• "What you see is what you get"
– Distinct output is generated for each input
– System states and variables are visible and queriable during execution
– Past system states are retained (transaction logs)
– All factors affecting output are visible
Observability
– Incorrect output is easily identified
– Internal errors are automatically detected through self-testing mechanisms
– Internal errors are automatically reported
– Source code is accessible
Visibility Spectrum
[Figure: a spectrum running from DSP visibility and GPP visibility to factory visibility and end-customer visibility]
Controllability
• "The better we can control the software, the more the testing can be automated and optimized"
– All possible outputs can be generated through some combination of input
– All code is executable through some combination of input
Controllability
– SW and HW states and variables can be controlled directly by the test engineer
– Input and output formats are consistent and structured
Decomposability
• "By controlling the scope of testing, we can more quickly isolate problems and perform smarter testing"
– The software system is built from independent modules
– Software modules can be tested independently
Simplicity
• "The less there is to test, the more quickly we can test it"
– Functional simplicity (feature set is the minimum necessary to meet requirements)
– Structural simplicity (architecture is modularized)
– Code simplicity (coding standards)
Stability
• "The fewer the changes, the fewer the disruptions to testing"
– Changes to the software are infrequent, controlled, and do not invalidate existing tests
– Software recovers well from failures
Understandability
• "The more information we have, the smarter we will test"
– Design is well understood
– Dependencies between external, internal, and shared components are well understood
– Technical documentation is accessible, well organized, specific and detailed, and accurate
“Bugs lurk in corners and congregate at boundaries”
Boris Beizer
Types of errors
• What is a testing error?
– Claiming behavior is erroneous when it is in fact correct
– 'Fixing' this type of error actually breaks the product
Errors in classification
• What is a classification error?
– Classifying the error into the wrong category
• Why is this bad?
– It puts you on the wrong path for a solution
Example Bug Report
• “Screen locks up for 10 seconds after ‘submit’ button is pressed”
• Classification 1: Usability error
– Solution may be to catch user events and present an hour-glass icon
• Classification 2: Performance error
– Solution may be a modification to a sort algorithm (or vice versa)
Isolation error
• Incorrectly isolating the erroneous modules
• Example: consider a client server architecture. An improperly formed client request results in an improperly formed server response
• The isolation determined (incorrectly) that the server was at fault and was changed
• Resulted in regression failure for other clients
Resolve errors
• Modifications to remediate the failure are themselves erroneous
• Example: Fixing one fault may introduce another
What is the ideal test case?
• Run one test whose output is "Modify line n of module i."
• Run one test whose output is "Input Vector v produces the wrong output"
• Run one test whose output is "The program has a bug" (Useless, we know this)
More realistic test case
• One input vector and expected output vector
– A collection of these makes up a Test Suite
• Typical (naïve) Test Case
– Type or select a few inputs and observe output
– Inputs not selected systematically
– Outputs not predicted in advance
Test case definition
• A test case consists of:
– an input vector
– a set of environmental conditions
– an expected output
• A test suite is a set of test cases chosen to meet some criteria (e.g. Regression)
• A test set is any set of test cases
Testing Software Intensive Systems
V&V
• Verification– are we building the product right?
• Validation– are we building the right product?– is the customer satisfied?
• How do we do it?
– Inspect and test
What do we inspect and test?
• All work products!
• Scenarios
• Requirements
• Designs
• Code
• Documentation
Defect Testing
A Testing Test
• Problem
– A program reads three integer values from the keyboard, separated by spaces. The three values are interpreted as representing the lengths of the sides of a triangle. The program prints a message that states whether the triangle is scalene, isosceles, or equilateral.
• Write a set of test cases to adequately test this program
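One way to work the exercise is to sketch the program and a candidate test suite in Python (`classify_triangle` is a hypothetical implementation; a thorough answer must also cover invalid inputs, which the problem statement leaves implicit):

```python
def classify_triangle(a, b, c):
    """Classify side lengths as scalene, isosceles, or equilateral;
    'not a triangle' is a hypothetical extra result for invalid input."""
    if a <= 0 or b <= 0 or c <= 0:
        return "not a triangle"          # non-positive side
    if a + b <= c or a + c <= b or b + c <= a:
        return "not a triangle"          # violates the triangle inequality
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# A few of the classic test cases for this exercise:
cases = [
    ((3, 4, 5), "scalene"),
    ((3, 3, 4), "isosceles"),
    ((5, 5, 5), "equilateral"),
    ((1, 2, 3), "not a triangle"),   # degenerate: a + b == c
    ((0, 4, 5), "not a triangle"),   # zero side
]
for args, expected in cases:
    assert classify_triangle(*args) == expected
```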
Static and Dynamic V&V
[Figure: static verification applies to the requirements specification, high-level design, detailed design, and program; dynamic verification applies to the program and prototype]
Techniques
• Static techniques
– Inspection
– Analysis
– Formal verification
• Dynamic techniques
– Testing
SE-CMM PA 07: Verify & Validate System
• Verification: perform comprehensive evaluations to ensure that all work products meet requirements
– Address all work products: from user needs and expectations through production and maintenance
• Validation - meeting customer needs - continues throughout the product lifecycle
V&V Base Practices
• Establish plans for V&V
– objectives, resources, facilities, special equipment
– come up with a master test plan
• Define the work products to be tested (requirements, design, code) and the methods (reviews, inspections, tests) that will be used to verify them
• Define verification methods
– test case input, expected results, criteria
– connect requirements to tests
V&V Base Practices...
• Define how to validate the system
– includes customer as user/operator
– test conditions
– test environment
– simulation conditions
• Perform V&V and capture results
– inspection results; test results; exception reports
• Assess success
– compare results against expected results
– success or failure?
Testing is...
• The process of executing a program with the intent of finding defects
• This definition implies that testing is a destructive process - often going against the grain of what software developers do, i.e., construct and build software
• A successful test run is NOT one in which no errors are found
Test Cases
• A successful test case finds an error
• An unsuccessful test case is one that causes the program to produce the correct result
• Analogy: feeling ill, going to the doctor, paying $300 for a lab test only to be told that you’re OK!
Testing demonstrates the presence not the absence of
faults
Iterative Testing Process
[Figure: unit testing, module testing, and sub-system testing (component testing) feed into system testing (integration testing), which feeds into acceptance testing (user testing)]
It is impossible to completely test a program
Testing and Time
• Exhaustive testing is impossible for any program of even low to moderate complexity
• Testing must focus on a subset of possible test cases
• Test cases should be systematically derived, not random
Testing Strategies
• Top Down testing
– use with top-down programming; stubs required; difficult to generate output
• Bottom Up testing
– requires driver programs; often combined with top-down testing
• Stress testing
– test system overload; often want the system to fail-soft rather than shut down
– often finds unexpected combinations of events
Test-Support Tools
• Scaffolding
– code created to help test the software
• Stubs
– a dummied-up low-level routine so it can be called by a higher-level routine
Stubs Can Vary in Complexity
• Return, no action taken
• Test the data fed to it
• Print/echo input data
• Get return values from interactive input
• Return standard answer
• Burn up clock cycles
• Function as a slow, fat version of the ultimate routine
Driver Programs
• Fake (testing) routine that calls other real routines
• Drivers can:
– call with a fixed set of inputs
– prompt for input and use it
– take arguments from command line
– read arguments from file
• main() can be a driver - then “remove” it with preprocessor statements. Code is unaffected
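A minimal sketch of a stub and a driver in Python (all names are hypothetical): the stub returns a standard answer in place of a real dependency, and the driver calls the real routine with a fixed set of inputs.

```python
# Stub: dummied-up low-level routine; returns a canned value
# instead of making a real network call
def fetch_rate_stub(currency):
    return 1.25

# Real routine under test, wired to accept the stub as a parameter
def convert(amount, currency, fetch_rate):
    return amount * fetch_rate(currency)

# Driver: fake routine that calls the real one with fixed inputs
def driver():
    for amount in (0, 1, 100):
        print(amount, "->", convert(amount, "EUR", fetch_rate_stub))

driver()
```

Passing the dependency as a parameter is one simple way to make the routine testable without preprocessor tricks.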
System Tests Should be Incremental
[Figure: modules A and B are integrated and exercised by tests 1 and 2; module C is added with test 3, then module D with test 4]
Not Big-Bang
Approaches to Testing
• White Box testing
– based on the implementation - the structure of the code; also called structural testing
• Black Box testing
– based on a view of the program as a function of input and output; also called functional testing
• Interface Testing
– derived from the program specification and knowledge of interfaces
White Box (Structural) Testing
• Testing based on the structure of the code
…
if x then j = 2
else k = 5
…
start with the actual program code
White Box (Structural) Testing
• Testing based on the structure of the code
…
if x then j = 2
else k = 5
…
[Figure: test data is fed to tests derived from the code, producing test output]
White Box Technique: Basis Path Testing
• Objective
– test every independent execution path through the program
• If every independent path has been executed, then every statement will be executed
• All conditional statements are tested for both true and false conditions
• The starting point for path testing is the flow graph
Flow Graphs
[Figure: flow-graph notations for if-then-else, while-loop, and case-of constructs]
1) j = 2;
2) k = 5;
3) read (a);
4) if a=2
5) then j = a
6) else j = a*k;
7) a = a + 1;
8) j = j + 1;
9) print (j);
[Flow graph: node {1,2,3} leads to decision node {4}, which branches to {5} and {6}, both rejoining at node {7,8,9}]
How many paths thru this program?
How Many Independent Paths?
• An independent path introduces at least one new statement or condition to the collection of already existing independent paths
• Cyclomatic Complexity (McCabe)
• For programs without GOTOs:
Cyclomatic Complexity = Number of decision nodes + 1
(decision nodes are also called predicate nodes)
The Number of Paths
• Cyclomatic Complexity gives an upper bound on the number of tests that must be executed in order to cover all statements
• To test each path requires– test data to trigger the path– expected results to compare against
1) j = 2;
2) k = 5;
3) read (a);
4) if a=2
5) then j = a
6) else j = a*k;
7) a = a + 1;
8) j = j + 1;
9) print (j);
[Flow graph: node {1,2,3} leads to decision node {4}, which branches to {5} and {6}, both rejoining at node {7,8,9}]
Test 1 - input: 2, expected output: 3
Test 2 - input: 10, expected output: 51
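The two basis-path tests can be checked mechanically; a sketch in Python (the 9-line example rendered as a hypothetical function `program` that returns j instead of printing it):

```python
def program(a):
    """Python rendering of the 9-line basis-path example."""
    j = 2
    k = 5
    if a == 2:
        j = a          # true branch (statement 5)
    else:
        j = a * k      # false branch (statement 6)
    a = a + 1
    j = j + 1
    return j

# Two tests cover both branches
# (cyclomatic complexity = 1 decision node + 1 = 2)
assert program(2) == 3     # path 1,2,3,4,5,7,8,9
assert program(10) == 51   # path 1,2,3,4,6,7,8,9
```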
What Does Statement Coverage Tell You?
• All statements have been executed at least once
• So what? Coverage testing may lead to the false illusion that the software has been comprehensively tested
The Downside of Statement Coverage
• Path testing results in the execution of every statement
• BUT, not all possible combinations of paths thru the program
• There are an infinite number of possible path combinations in programs with loops
The Downside of Statement Coverage
• The number of paths is usually proportional to program size, making path testing useful only at the unit test level
Black Box Testing
Forget the code details!
Treat the program as a black box.
[Figure: a black box with inputs going in and outputs coming out]
Black Box Testing
• Aim is to test all functional requirements
• Complementary, not a replacement for White Box Testing
• Black box testing typically occurs later in the test cycle than white box testing
Defect Testing Strategy
[Figure: test data is input to the system; outputs indicating defects are used to locate the inputs causing erroneous output]
Black Box Techniques
• Equivalence partitioning
• Boundary value testing
Equivalence Partitioning
• Data falls into categories
• Positive and Negative Numbers
• Strings with & without blanks
• Programs often behave in a comparable way for all values in a category -- also called an equivalence class
[Figure: the system's input domain is partitioned into invalid and valid classes; choose test cases from each partition]
Specification determines Equivalence Classes
• Program accepts 4 to 8 inputs
• Each is a 5-digit number greater than 10000
Partitions on the number of inputs: less than 4 | 4 thru 8 | more than 8
Partitions on each value: less than 10000 | 10000 thru 99999 | more than 99999
Boundary Value Analysis
• Complements equivalence partitioning
• Select test cases at the boundaries of a class
• Range boundary a..b
– test just below a and just above b
• Input specifies 4 values
– test 3 and 5
• Output that is limited should be tested above and below its limits
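The partitions and boundaries for the earlier specification (4 to 8 inputs, each 5 digits) can be turned into concrete test cases. A sketch in Python, assuming a hypothetical validator `accept` and treating the value range as the inclusive 10000..99999 shown in the partitions:

```python
def accept(values):
    """Hypothetical validator: 4 to 8 inputs, each 10000..99999."""
    if not (4 <= len(values) <= 8):
        return False
    return all(10000 <= v <= 99999 for v in values)

# Equivalence classes and boundary values as test cases
assert accept([12345] * 5)           # valid partition
assert not accept([12345] * 3)       # count just below lower bound
assert accept([12345] * 4)           # count on the lower boundary
assert accept([12345] * 8)           # count on the upper boundary
assert not accept([12345] * 9)       # count just above upper bound
assert not accept([9999] * 5)        # value just below 10000
assert not accept([100000] * 5)      # value just above 99999
```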
Other Testing Strategies
• Array testing
• Data flow testing
• GUI testing
• Real-time testing
• Documentation testing
Arrays
• Test software with arrays of one value
• Use different arrays of different sizes in different tests
• Derive tests so that the first, last and middle elements are tested
Data Flow testing
• Based on the idea that data usage is at least as error-prone as control flow
• Boris Beizer claims that at least half of all modern programs consist of data declarations and initializations
Data Can Exist in One of Three States
• Defined
– initialized, not used
– a = 2
• Used
– x = a * b + c;
– z = sin(a)
• Killed
– free (a)
– end of the for loop or block where it was defined
Entering & Exiting
• Terms describing the context of a routine before doing something to a variable
• Entered
– control flow enters the routine before the variable is acted upon
• Exited
– control flow leaves the routine immediately after the variable is acted upon
Data Usage Patterns
• Normal
– define variable; use one or more times; perhaps kill it
• Abnormal patterns
– Defined-Defined
– Defined-Exited
• if local, why?
– Defined-Killed
• wasteful if not strange
More Abnormal Patterns
• Entered-Killed
• Entered-Used
– should be defined before use
• Killed-Killed
– double kills are fatal for pointers
• Killed-Used
– what are you really using?
• Used-Defined
– what's its value?
Define-Use Testing
if (condition-1)
  x = a;
else
  x = b;
if (condition-2)
  y = x + 1;
else
  y = x - 1;
Path Testing
Test 1: condition-1 TRUE, condition-2 TRUE
Test 2: condition-1 FALSE, condition-2 FALSE
These WILL EXERCISE EVERY LINE OF CODE … BUT will NOT test the DEF-USE combinations:
x=a / y = x-1
x=b / y = x+1
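A Python rendering of this example (hypothetical function `f`): the two path tests execute every line but cover only two of the four def-use pairs, so two more tests are needed for full def-use coverage.

```python
def f(cond1, cond2, a, b):
    """Python version of the define-use example."""
    if cond1:
        x = a          # def of x (branch 1)
    else:
        x = b          # def of x (branch 2)
    if cond2:
        y = x + 1      # use of x (branch 1)
    else:
        y = x - 1      # use of x (branch 2)
    return y

# The slide's two path tests execute every line...
assert f(True, True, a=10, b=20) == 11    # pair x=a / y=x+1
assert f(False, False, a=10, b=20) == 19  # pair x=b / y=x-1
# ...but covering all four def-use pairs needs two more tests:
assert f(True, False, a=10, b=20) == 9    # pair x=a / y=x-1
assert f(False, True, a=10, b=20) == 21   # pair x=b / y=x+1
```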
GUIs
• Are complex to test because of their event-driven character
• Windows
– move, resize and scroll
– regenerate when overwritten and then recalled
– menu bars change when window is active
– multiple window functionality available?
GUI.. Menus
• Menu bars in context?
• Submenus - listed and working?
• Are names self explanatory?
• Is help context sensitive?
• Cursor changes with operations?
Testing Documentation
• Great software with lousy documentation can kill a product
• Documentation testing should be part of every test plan
• Two phases– review for clarity– review for correctness
Documentation Issues
• Focus on functional usage?
• Are descriptions of interaction sequences accurate?
• Examples should be used
• Is it easy to find how to do something?
• Is there a troubleshooting section?
• Easy to look up error codes?
• TOC and index?
Real-Time Testing
• Needs white box and black box PLUS
– consideration of states, events, interrupts and processes
• Events will often have different effects depending on state
• Looking at event sequences can uncover problems
Real-Time Issues
• Task testing
– test each task independently
• Event testing
– test each separately; then in the context of state diagrams
– scenario sequences and random sequences
• Intertask testing
– Ada rendezvous
– message queuing; buffer overflow
Other Testing Terms
• Statistical testing
– running the program against expected usage scenarios
• Regression testing
– retesting the program after modification
• Defect testing
– trying to find defects (aka bugs)
• Debugging
– the process of discovering and removing defects
Summary V&V
• Verification– Are we building the system right?
• Validation– Are we building the right system?
• Testing is part of V&V
• V&V is more than testing...
• V&V is plans, testing, reviews, methods, standards, and measurement
Testing Principles
• The necessary part of a test case is a definition of the expected output or result
– the eye often sees what it wants to see
• Programmers should avoid testing their own code
• Organizations should not test their own programs
• Thoroughly inspect the results of each test
Testing Principles
• Test invalid as well as valid conditions
• The probability of errors in a section of code is proportional to the number of errors already found there
Testing Principles
• Tests should be traceable to customer requirements
• Tests should be planned before testing begins
• The Pareto principle applies - 80% of all errors are in 20% of the code
• Begin small, scale up
• Exhaustive testing is not possible
• The best testing is done by a 3rd party
Guidelines
• Testing capabilities is more important than testing components
– users have a job to do; tests should focus on things that interfere with getting the job done, not minor irritations
• Testing old capabilities is more important than testing new features
• Testing typical situations is more important than testing boundary conditions
System Testing
Ian Sommerville
System Testing
• Testing the system as a whole to validate that it meets its specification and the objectives of its users
Development testing
• Hardware and software components should be tested:
– as they are developed
– as sub-systems are created
• These testing activities include:
– Unit testing
– Module testing
– Sub-system testing
Development testing
• These tests do not cover:
– Interactions between components or sub-systems where the interaction causes the system to behave in an unexpected way
– The emergent properties of the system
System testing
• Testing the system as a whole instead of individual system components
• Integration testing
– As the system is integrated, it is tested by the system developer for specification compliance
• Stress testing
– The behavior of the system is tested under conditions of load
System testing
• Acceptance testing
– The system is tested by the customer to check if it conforms to the terms of the development contract
• System testing reveals errors which were undiscovered during testing at the component level
System Test Flow
[Figure: a V model - the requirements specification maps to the acceptance test plan and service/acceptance test; the system specification and system design map to the system integration test plan and test; the detailed design maps to the sub-system integration test plan and test; unit code and test sits at the bottom]
Integration testing
• Concerned with testing the system as it is integrated from its components
• Integration testing is normally the most expensive activity in the systems integration process
Integration testing
• Should focus on:
– Interface testing, where the interactions between sub-systems and components are tested
– Property testing, where system properties such as reliability, performance and usability are tested
Integration Test Planning
• Integration testing is complex and time-consuming, and planning of the process is essential
• The larger the system, the earlier this planning must start and the more extensive it must be
• Integration test planning may be the responsibility of a separate IV&V (independent verification and validation) team
– or a group which is separate from the development team
Test planning activities
• Identify possible system tests using the requirements document
• Prepare test cases and test scenarios to run these system tests
• Plan the development, if required, of tools such as simulators to support system testing
• Prepare, if necessary, operational profiles for the system
• Schedule the testing activities and estimate testing costs
Interface Testing
• Within a system there may be literally hundreds of different interfaces of different types. Testing these is a major problem.
• Interface tests should not be concerned with the internal operation of the sub-system although they can highlight problems which were not discovered when the sub-system was tested as an independent entity.
Two levels of interface testing
• Interface testing during development when the developers test what they understand to be the sub-system interface
• Interface testing during integration where the interface, as understood by the users of the subsystem, is tested.
Two levels of interface testing
• What developers understand as the system interface and what users understand by this are not always the same thing.
Interface Testing
[Figure: test cases exercise the interfaces between sub-systems A, B, and C]
Interface Problems
• Interface problems often arise because of poor communications within the development team or because of poor change management procedures
• Typically, an interface definition is agreed but, for good reasons, it has to be changed during development
Interface Problems
• To allow other parts of the system to cope with this change, they must be informed of it
• It is very common for changes to be made and for potential users of the interface to be unaware of these changes
– problems then emerge during interface testing
What is an interface?
• An agreed mechanism for communication between different parts of the system
• System interface classes
– Hardware interfaces
• Involving communicating hardware units
– Hardware/software interfaces
• Involving the interaction between hardware and software
What is an interface?
– Software interfaces
• Involving communicating software components or sub-systems
– Human/computer interfaces
• Involving the interaction of people and the system
– Human interfaces
• Involving the interactions between people in the process
Hardware interfaces
• Physical-level interfaces
– Concerned with the physical connection of different parts of the system, e.g. plug/socket compatibility, physical space utilization, wiring correctness, etc.
• Electrical-level interfaces
– Concerned with the electrical/electronic compatibility of hardware units, i.e. can a signal produced by one unit be processed by another unit
Hardware interfaces
• Protocol-level interfaces
– Concerned with the format of the signals communicated between hardware units
Software interfaces
• Parameter interfaces
– Software units communicate by setting pre-defined parameters
• Shared memory interfaces
– Software units communicate through a shared area of memory
– Software/hardware interfaces are usually of this type
Software interfaces
• Procedural interfaces
– Software units communicate by calling pre-defined procedures
• Message passing interfaces
– Software units communicate by passing messages to each other
Parameter Interfaces
[Figure: subsystem 1 passes a parameter list to subsystem 2]
Shared Memory Interfaces
[Figure: subsystems SS1, SS2, and SS3 communicate through a shared memory area]
Procedural Interfaces
[Figure: subsystem 1 calls defined procedures (an API) in subsystem 2]
Message Passing Interfaces
[Figure: subsystems 1 and 2 exchange messages]
Interface errors
• Interface misuse
– A calling component calls another component and makes an error in its use of its interface, e.g. parameters in the wrong order
• Interface misunderstanding
– A calling component embeds incorrect assumptions about the behavior of the called component
Interface errors
• Timing errors
– The calling and called components operate at different speeds and out-of-date information is accessed
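Interface misuse can be sketched in Python (the `transfer` interface and its parameter checks are invented for illustration): an interface test that exercises the called component's checks catches arguments passed in the wrong order.

```python
def transfer(amount, account_id):
    # Hypothetical called component: amount must be numeric,
    # account_id must be a string
    assert isinstance(amount, (int, float)), "amount must be numeric"
    assert isinstance(account_id, str), "account_id must be a string"
    return f"moved {amount} to {account_id}"

# Correct use of the interface
print(transfer(100, "ACCT-42"))

# Interface misuse: parameters in the wrong order
try:
    transfer("ACCT-42", 100)
except AssertionError as e:
    print("interface test caught misuse:", e)
```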
Stress testing
• Exercises the system beyond its maximum design load
– The argument for stress testing is that system failures are most likely to show themselves at the extremes of the system's behavior
• Tests failure behavior
– When a system is overloaded, it should degrade gracefully rather than fail catastrophically
Stress testing
• Particularly relevant to distributed systems
– As the load on the system increases, so too does the network traffic. At some stage, the network is likely to become swamped and no useful work can be done
Acceptance testing
• The process of demonstrating to the customer that the system is acceptable
• Based on real data drawn from customer sources. The system must process this data as required by the customer if it is to be acceptable
Acceptance testing
• Generally carried out by customer and system developer together
• May be carried out before or after a system has been installed
Performance testing
• Concerned with checking that the system meets its performance requirements
– Number of transactions processed per second
– Response time to user interaction
– Time to complete specified operations
Performance testing
• Generally requires some logging software to be associated with the system to measure its performance
• May be carried out in conjunction with stress testing using simulators developed for stress testing
Reliability testing
• The system is presented with a large number of 'typical' inputs and its response to these inputs is observed
• The reliability of the system is based on the number of incorrect outputs which are generated in response to correct inputs
• The profile of the inputs (the operational profile) must match the real input probabilities if the reliability estimate is to be valid
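The estimate described above can be sketched in Python (the system under test and its failure rate are invented for illustration): draw inputs from an operational profile and count incorrect outputs against correct inputs.

```python
import random

def system_under_test(x):
    # Invented system: returns a wrong answer for roughly 1% of inputs
    return x * 2 if x % 100 != 7 else 0

def estimate_reliability(inputs):
    # Reliability = fraction of correct outputs over 'typical' inputs
    failures = sum(1 for x in inputs if system_under_test(x) != x * 2)
    return 1 - failures / len(inputs)

# Operational profile: inputs drawn with the probabilities seen in real use
random.seed(1)
profile = [random.randint(0, 999) for _ in range(10000)]
print(estimate_reliability(profile))  # close to 0.99, since ~1% of inputs fail
```

The estimate is only as good as the profile: if real usage is skewed toward the failing inputs, the same system would measure far less reliable.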
Security testing
• Security testing is concerned with checking that the system and its data are protected from accidental or malicious damage
• Unlike other types of testing, this cannot really be tested by planning system tests. The system must be secure against unanticipated as well as anticipated attacks
Security testing
• Security testing may be carried out by inviting people to try to penetrate the system through security loopholes
Some Costly and Famous Software Failures
Mariner 1 Venus probe loses its way: 1962
Mariner 1
• A probe launched from Cape Canaveral was set to go to Venus
• After takeoff, the unmanned rocket carrying the probe went off course, and NASA had to blow up the rocket to avoid endangering lives on earth
• NASA later attributed the error to a faulty line of Fortran code
Mariner 1
• “... a hyphen had been dropped from the guidance program loaded aboard the computer, allowing the flawed signals to command the rocket to veer left and nose down…
• The vehicle cost more than $80 million, prompting Arthur C. Clarke to refer to the mission as "the most expensive hyphen in history."
Therac 25 Radiation Machine
Radiation machine kills four: 1985 to 1987
• Faulty software in a Therac-25 radiation-treatment machine made by Atomic Energy of Canada Limited (AECL) resulted in several cancer patients receiving lethal overdoses of radiation
• Four patients died
Radiation machine kills four: 1985 to 1987
• “A lesson to be learned from the Therac-25 story is that focusing on particular software bugs is not the way to make a safe system.”
• “The basic mistakes here involved poor software engineering practices and building a machine that relies on the software for safe operation.”
AT&T long distance service fails
AT&T long distance service fails: 1990
• Switching errors in AT&T's call-handling computers caused the company's long-distance network to go down for nine hours, the worst of several telephone outages in the history of the system
• The meltdown affected thousands of services and was eventually traced to a single faulty line of code
Patriot missile
Patriot missile misses: 1991
• The U.S. Patriot missile's battery was designed to head off Iraqi Scuds during the Gulf War
• System also failed to track several incoming Scud missiles, including one that killed 28 U.S. soldiers in a barracks in Dhahran, Saudi Arabia
Patriot missile misses: 1991

• The problem stemmed from a software error that put the tracking system off by 0.34 of a second
• System was originally supposed to be operated for only 14 hours at a time
– In the Dhahran attack, the missile battery had been on for 100 hours
– Errors in the system's clock accumulated to the point that the tracking system no longer functioned
Pentium chip
Pentium chip fails math test: 1994
• Pentium chip gave incorrect answers to certain complex equations
– Bug occurred rarely and affected only a tiny percentage of Intel's customers
• Intel offered to replace the affected chips, which cost the company $450 million
• Intel then started publishing a list of known "errata," or bugs, for all of its chips
New Denver airport
New Denver airport misses its opening: 1995
• The Denver International Airport was intended to be a state-of-the-art airport, with a complex, computerized baggage-handling system and 5,300 miles of fiber-optic cabling
• Bugs in the baggage system caused suitcases to be chewed up and drove automated baggage carts into walls
New Denver airport misses its opening: 1995
• The airport eventually opened 16 months late, $3.2 billion over budget, and with a mainly manual baggage system
The millennium bug: 2000
• No need to discuss this!
Ariane 5 Rocket
Ariane 5
• The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of the main engine ignition sequence (30 seconds after lift-off)
• This loss of information was due to specification and design errors in the software of the inertial reference system
Ariane 5
• The extensive reviews and tests carried out during the Ariane 5 Development Programme did not include adequate analysis and testing of the inertial reference system or of the complete flight control system, which could have detected the potential failure
More on Testing
From Beatty – ESC 2002
Agenda
• Introduction
• Types of software errors
• Finding errors – methods and tools
• Embedded systems and RT issues
• Risk management and process
Introduction
• Testing is expensive
• Testing progress can be hard to predict
• Embedded systems have different needs
• Desire for best practices
Method
• Know what you are looking for
• Learn how to effectively locate problems
• Plan to succeed – manage risk
• Customize and optimize the process
Entomology
• What are we looking for ?
• How are bugs introduced?
• What are their consequences?
Entomology – Bug Frequency
• Rare
• Less common
• More common
• Common
Entomology – Bug severity
• Non-functional: doesn’t affect object code
• Low: correct problem when convenient
• High: correct as soon as possible
• Critical: change MUST be made
– Safety related or legal issue
Domain Specific !
Entomology - Sources
• Non-implementation error sources
– Specifications
– Design
– Hardware
– Compiler errors
• Frequency: common – 45 to 65%
• Severity: non-functional to critical
Entomology - Sources
• Poor specifications and designs are often:
– Missing
– Ambiguous
– Wrong
– Needlessly complex
– Contradictory

Testing can find these problems!
Entomology - Sources
• Implementation error sources:
– Algorithmic/processing bugs
– Data bugs
– Real-time bugs
– System bugs
– Other bugs

Bugs may fit in more than one category!
Entomology – Algorithm Bugs
• Parameter passing
– Common only in complex invocations
– Severity varies
• Return codes
– Common only in complex functions or libraries
• Reentrance problems
– Less common
– Critical
Entomology – Algorithm Bugs
• Incorrect control flow
– Common
– Severity varies
• Logic/math/processing errors
– Common
– High
• Off by “1”
– Common
– Varies, but typically high
Example of logic error
if (( this AND that ) OR ( that AND other )
    AND NOT ( this AND other )
    AND NOT ( other OR NOT another ))

Boolean operations and mathematical calculations can be easily misunderstood in complicated algorithms!
Example of off by 1
for ( x = 0; x <= 10; x++ )

This will execute 11 times, not 10!

for ( x = array_min; x <= array_max; x++ )

If the intention is to set x to array_max on the last pass through the loop, then this is in error!

Be careful when switching between 1-based languages (Pascal, Fortran) and zero-based languages (C)
Entomology – Algorithm bugs
• Math underflow/overflow
– Common with integer or fixed-point math
– High severity
– Be careful when switching between floating-point and fixed-point processors
Entomology – Data bugs
• Improper variable initialization– Less common– Varies; typically low
• Variable scope error– Less common– Low to high
Example - Uninitialized data

int some_function( int some_param )
{
    int j;
    if (some_param >= 0) {
        for (j = 0; j <= 3; j++) {
            /* iterate through some process */
        }
    } else {
        if (some_param <= -10) {
            some_param += j;   /* j is uninitialized */
        }
        return some_param;
    }
    return 0;
}
Entomology – Data bugs
• Data synchronization error– Less common– Varies; typically high
Example – synchronized data
struct state {              /* an interrupt will trigger       */
    GEAR_TYPE gear;         /* sending a snapshot in a message */
    U16 speed;
    U16 speed_limit;
    U8  last_error_code;
} snapshot;

snapshot.speed = new_speed;                     /* ...somewhere in code */

snapshot.gear = new_gear;                       /* somewhere else */
snapshot.speed_limit = speed_limit_tb[ gear ];

An interrupt splitting these last two updates would be bad
Entomology – Data bugs
• Improper data usage– Common– Varies
• Incorrect flag usage
– Common when hard-coded constants used
– Varies
Example – mixed math error
unsigned int a = 5;
int b = -10;

/* somewhere in code */
if ( a + b > 0 ) {

a + b is not evaluated as -5!
The signed int b is converted to an unsigned int
Entomology – Data bugs
• Data/range overflow/underflow
– Common in asm and 16-bit micros
– Low to critical
• Signed/unsigned data errors
– Common in asm and fixed-point math
– High to critical
• Incorrect conversion/type cast/scaling
– Common in complex programs
– Low to critical
Entomology – Data bugs
• Pointer error– Common– High to critical
• Indexing problem– Common– High to critical
Entomology – Real-time bugs
• Task synchronization
– Waiting, sequencing, scheduling, race conditions, priority inversion
– Less common
– Varies
• Interrupt handling
– Unexpected interrupts
– Improper return from interrupt
– Rare
– Critical
Entomology – Real-time bugs
• Interrupt suppression
– Critical sections
– Corruption of shared data
– Interrupt latency
– Less common
– Critical
Entomology – System bugs
• Stack overflow/underflow
– Pushing, pulling and nesting
– More common in asm and complex designs
– Critical
• Resource sharing problems
– Less common
– High to critical
– Example: Mars Pathfinder priority inversion
Entomology – System bugs
• Resource mapping
– Variable maps, register banks, development maps
– Less common
– Critical
• Instrumentation problems
– Less common
– Low
Entomology – System bugs
• Version control errors
– Common in complex or mismanaged projects
– High to critical
Entomology – other bugs
• Syntax/typing
– if (*ptr = NULL) instead of if (*ptr == NULL)
– Cut & paste errors
– More common
– Varies
• Interface
– Common
– High to critical
• Missing functionality
– Common
– High
Entomology – other bugs
• Peripheral register initialization– Less common– Critical
• Watchdog servicing– Less common– Critical
• Memory allocation/de-allocation– Common when using malloc(), free()– Low to critical
Entomology – Review
• What are you looking for ?
• How are bugs being introduced ?
• What are their consequences ?
Form your own target list!
Finding the hidden errors…
• All methods use these basic techniques;– Review; checking– Tests; demonstrating– Analysis; proving
These are all referred to as “testing” !
Testing
• “Organized process of identifying variances between actual and specified results”
• Goal: zero significant defects
Testing axioms
• All software has bugs
• Programs cannot be exhaustively tested
• Cannot prove the absence of all errors
• Complex systems often behave counter-intuitively
• Software systems are often brittle
Finding spec/design problems
• Reviews / Inspections / Walkthroughs
• CASE tools
• Simulation
• Prototypes
Still need consistently effective methods !
Testing – Spec/Design Reviews
• Can be formal or informal– Completeness– Consistency– Feasibility– Testability
Testing – Evaluating methods
• Relative costs– None– Low– Moderate– High
• General effectiveness– Low– Moderate– High– Very high
Testing – Code reviews
• Individual review
– Effectiveness: high
– Cost: time – low, material – none
• Group inspections
– Effectiveness: very high
– Cost: time – moderate, material – none
Testing – Code reviews
• Strengths– Early detection of errors– Logic problems– Math errors– Non-testable requirement or paths
• Weaknesses– Individual preparation and experience– Focus on details, not “big picture”– Timing and system issues
Step by step execution
• Exercise every line of code or every branch condition
• Look for errors
– Use simulator, ICE, logic analyzer
– Effectiveness: moderate – dependent on tester
– Cost: time is high, material is low or moderate
Functional (Black Box)
• Exercise inputs and examine outputs
• Test procedures describe expected behavior
• Subsystems tested and integrated– Effectiveness is moderate– Cost; time is moderate, material varies
Tip; where functional testing finds problemslook deeper in that area !
Functional (Black Box)
• Strengths– Requirements problems– Interfaces– Performance issues– Most critical/most used features
• Weaknesses– Poor coverage– Timing and other problems masked– Error conditions
Functional test process

• ID requirements to test
• Choose strategy
– 1 test per requirement
– Test small groups of requirements
– Scenario: broad sweep of many requirements
• Write test cases
– Environment
– Inputs
– Expected outputs
• Traceability
Structural (White box)
• Looks at how code works
• Test procedures
• Exercise paths using many data values
• Consistency between design and implementation
– Effectiveness: high
– Cost: time is high, material low to moderate
Structural (White box)
• Strengths– Coverage– Effectiveness– Logic and structure problems– Math and data errors
• Weaknesses– Interface and requirements– Focused; may miss “big picture”– Interaction with system– Timing problems
Structural (White box)
• Test rigor based on 3 levels of Risk (FAA)
• C – Reduced safety margins or functionality – Statement Coverage
– Invoke every statement at least once
Structural (White box)
• Test rigor based on 3 levels of Risk (FAA)
• B – Hazardous – Decision Coverage
– Invoke every statement at least once
– Invoke every entry and exit
– Every control statement takes all possible outcomes
– Every non-constant Boolean expression evaluated to both a True and a False result
Structural (White box)
• Test rigor based on 3 levels of Risk (FAA)
• A – Catastrophic – Modified Condition Decision Coverage
– Every statement has been invoked
– Every point of entry and exit has been invoked
Structural (White box)
– Every control statement has taken all possible outcomes
– Every Boolean expression has evaluated to both a True and a False result
– Every condition in a Boolean expression has evaluated to both True/False
– Every condition in a Boolean expression has been shown to independently affect that expression’s outcome
Unit test standards
• What is the white box testing plan?
• What do you test?
• When do you test it?
• How do you test it?
Structural test process
• ID all inputs
• ID all outputs
• ID all paths
• Set up test cases
– Decision coverage
– Boundary value analysis
– Checklist
– Weaknesses
Structural test process
• Measure worst case execution time
• Determine worst case stack depth
• Bottom up
Integration
• Combines elements of white and black box
– Unexpected return codes or acknowledgements
– Parameters – boundary values
– Assumed initial conditions/state
– Unobvious dependencies
– Aggregate functionality
Integration
• Should you do this…when?
– Depends on the complexity of the system
– Boundary values of parameters in functions
– Interaction between units
– “Interesting” paths
– Errors: most common
Verification
• Verify the structural integrity of the code
• Find errors hidden at other levels of examination
• Outside of requirements
• Conformance to standards
Verification
• Detailed inspection, analysis, and measurement of code to find common errors
• Examples– Stack depth analysis– Singular use of flags/variables– Adequate interrupt suppression– Maximum interrupt latency– Processor-specific constraints
Verification
• Strengths
– Finds problems that testing and inspection can’t
– Stack depth
– Resource sharing
– Timing
• Weaknesses
– Tedious
– Focused on certain types of errors
Verification
• Customize for your process/application– What should be checked– When– How– By whom
Stress/performance
• Load the system to maximum…and beyond!
• Helps determine “factor of safety”
• Performance to requirements
Stress/performance
• Examples– Processor utilization– Interrupt latency– Worst time to complete a task– Periodic interrupt frequency jitter– Number of messages per unit time– Failure recovery
Other techniques
• Fault injection
• Scenario testing
• Regression
– Critical functions
– Most functionality with the least tests
– Automation
– Risk of not re-testing is higher than the cost
• Boundary value testing
Tools
Capability         | ICE | Simulator | Logic analyzer
Step through code  |  X  |     X     |       X
Control execution  |  X  |     X     |
Modifying data     |  X  |     X     |
Coverage           |  X  |     X     |       X
Timing analysis    |  X  |     X     |       X
Code Inspection Checklist
Code Inspection Checklist
• Code correctly implements the document software design
• Code adheres to coding standards and guidelines
• Code is clear and understandable
• Code has been commented appropriately
• Code is within complexity guidelines
– Cyclomatic complexity < 12
Code Inspection Checklist
• Macro formal parameters should not have side effects (lint message 665)
• Use parentheses to enhance code robustness; use parentheses around all macro parameters (665, 773)
• Examine all typecasts for correct operation
• Examine effects of all implicit type conversions (910-919)
Code Inspection Checklist
• Look for off-by-one errors in loop counters, arrays, etc
• Assignment statements within condition expressions (use cchk)
• Guarantee that a pointer can never be Null when de-referencing it
• Cases within a switch should end in a break (616)
Code Inspection Checklist
• All switch statements should have a default case (744)
• Examine all arguments passed to functions for appropriate use of pass by value, pass by reference, and const
• Local variables must be initialized before use
• Equality test on floating point numbers may never be True (777)
Code Inspection Checklist
• Adding and subtracting floats of different magnitudes can result in lost precision
• Ensure that division by zero cannot occur
• Sequential multiplications and divisions may produce round-off errors
Code Inspection Checklist
• Subtracting nearly equal values can produce cancellation errors
• C rounds towards zero – is this appropriate here?
• Mathematical underflow/overflow potential
• Non-deterministic timing constructs
Unit test standards
Unit test standards
• 1. Each test case must be capable of independent execution, i.e. the setup and results of a test case shall not be used by subsequent test cases
• 2. All input variables shall be initialized for each test case. All output variables shall be given an expected value, which will be validated against the actual result for each test case
Unit test standards
• 3. Initialize variables to valid values taking into account any relationships among inputs. In other words, if the value of a variable A affects the domain of variable B, select values for A and B which satisfy the relationship
• 4. Verify that the minimum and maximum values can be obtained for each output variable (i.e. select input values that produce output values as close to the max/min as possible)
Unit test standards
• 5. Initialize output variables according to the following:
– If an output is expected to change, set its initial value to something other than the expected result
– If an output is not expected to change, set its initial value to its expected value
• 6. Verify loop entry and exit criteria
Unit test standards
• 7. Maximum loop iterations should be executed to provide worst case timing scenarios
• 8. Verify that the loss of precision due to multiplication or division is within acceptable tolerance
Unit test standards
• 9. The following apply to conditional expressions:
– “OR” expressions are evaluated by setting all predicates “FALSE” and then setting each one “TRUE” individually
– “AND” expressions are evaluated by setting all predicates “TRUE” and then setting each one “FALSE” individually
Unit test standards
• 10. Do not stub any functions that are simple enough to include within the unit test
• 11. Non-trivial tests should include an explanation of what is being tested
Unit test standards
• 12. Unit test case coverage is complete when the following criteria are satisfied (where applicable)– 100% function and exit coverage– 100% call coverage– 100% statement block coverage– 100% decision coverage– 100% loop coverage– 100% basic condition coverage– 100% modified condition coverage
Unit test checklist – Common coding error checks

Check                                                                        | Status         | Note(s)
Mathematical expression underflow/overflow                                   | <Pass/Fail/NA> |
Off-by-one errors in loop counters                                           | <Pass/Fail/NA> |
Assignment statements within conditional expressions                         | <Pass/Fail/NA> | May be detected by compiler, lint, cchk
Floats are not compared solely for equality                                  | <Pass/Fail/NA> | Lint message 777
Variables and calibrations use correct precision and ranges in calculations  | <Pass/Fail/NA> |
Unit test checklist – Common coding error checks

Check                                                  | Status         | Note(s)
Pointers initialized and de-referenced properly        | <Pass/Fail/NA> |
Intermediate calculations are not stored in global     | <Pass/Fail/NA> |
variables
All declared local variables are used in the function  | <Pass/Fail/NA> | May be detected by compiler or lint
Typecasting has been done correctly                    | <Pass/Fail/NA> |
Unreachable code has been removed                      | <Pass/Fail/NA> | Lint message 527
Common coding error checks

Check                                                                                                                                                  | Status         | Note(s)
All denominators are guaranteed to be non-zero (no divide by 0)                                                                                        | <Pass/Fail/NA> |
Switch statements handle every case of the control variable (have DEFAULT paths); any cases that “fall through” to the next case are intended to do so | <Pass/Fail/NA> | Lint messages 744, 787; fall-through 616
Static variables are used for only one purpose                                                                                                         | <Pass/Fail/NA> |
All variables have been properly initialized before being used; do not assume a value of “0” after power-up                                            | <Pass/Fail/NA> |