
Software Testing


Submitted to

Alfred Hussein

as part of the course requirements for

SENG 623 – Software Quality Management
University of Calgary

Winter, 2003

submitted by

Yuhang (Henry) Wang
Scott Thornton

April 10, 2003


ABSTRACT

Software Testing takes place immediately after the source code of the software has been generated. Within the field of Software Quality Management, Software Testing is an important approach to Software Quality Assurance. Compared to other techniques such as inspections, walkthroughs and other reviews, it represents the last line of defense for correcting deviations from specification and errors in design or code implementation.

Software testing is examined here from a quality management context. A brief history of the theory of testing is provided in order to frame the current approaches and techniques. Testing fundamentals are described, along with the four testing levels or stages of the software development cycle: unit, integration, system, and acceptance testing. Organizational issues associated with software testing are discussed, and several keys to and pitfalls of testing are presented in the concluding section.


Table of Contents

1 INTRODUCTION
 1.1 HISTORY OF SOFTWARE TESTING
2 TESTING FUNDAMENTALS
 2.1 TESTING OVERVIEW
 2.2 TESTING STAGES
 2.3 TESTING TECHNIQUES
  2.3.1 Static and Dynamic Testing
  2.3.2 Black Box and White (Glass) Box Testing
  2.3.3 Equivalence Partitioning
  2.3.4 Boundary Value Analysis
  2.3.5 Path Testing
 2.4 TESTING VERSUS INSPECTIONS
3 UNIT TESTING
 3.1 PROCEDURES
 3.2 METRICS
4 INTEGRATION TESTING
 4.1 PROCEDURES
 4.2 METRICS
5 SYSTEM TESTING
 5.1 PROCEDURES
 5.2 METRICS
6 VALIDATION TESTING
 6.1 PROCEDURES
 6.2 METRICS
7 TESTING ORGANIZATION ISSUES
 7.1 WHO TESTS?
 7.2 WHEN SHOULD TESTING STOP?
 7.3 ORGANIZATIONAL ISSUES
8 DISCUSSION AND CONCLUSIONS
9 REFERENCES


List of Figures

Figure 1 Verification and Validation Software Lifecycle Model
Figure 2 Unit Testing Infrastructure

List of Tables

Table 1 Testing Levels versus Quality Views
Table 2 Typical Tester versus Testing Level


1 Introduction

Software Testing takes place immediately after the source code of the software has been generated. It is performed to uncover and correct as many of the potential errors as possible before delivery to the customer. Within the field of Software Quality Management, Software Testing is an important approach to Software Quality Assurance. Compared to other techniques such as inspections, walkthroughs and other reviews, it represents the last line of defense for correcting deviations from specification and errors in design or code implementation.

This paper examines software testing from a quality management context. It begins with a brief history of the theory of testing in order to frame the current approaches and techniques. Section Two provides some testing fundamentals, describing what constitutes a valid test, various testing techniques, and the four testing levels or stages of the software development cycle: unit, integration, system, and acceptance testing. Sections Three through Six delve into the details of each of these stages. Section Seven addresses organizational issues associated with software testing, and a discussion and conclusions wrap up the paper in Section Eight.

1.1 History of Software Testing

Throughout the history of software development, there have been many divergent definitions of Software Testing. In the 1950s, testing was defined as "what programmers did to find bugs in their programs" [Hetzel 1988]. Today, this definition is much too restrictive: software testing has been extended to include not only determining that a program functions correctly, but also that the functions themselves are correct.

As the science of software engineering matured through the 1960s and 1970s, the definition of testing underwent a revision. Consideration was given to exhaustive testing of the software in terms of the possible paths through the code, or by enumerating the possible input datasets. Even with the complexity of the software systems being developed at that time, this was impractical, if not theoretically impossible. The 1950s concepts were extended to include "what is done to demonstrate correctness of a program" [Goodenough 1975], or to define testing as "the process of establishing confidence that a program or system does what it is supposed to do" [Hetzel 1973]. Although this concept is valid in theory, in practice it is insufficient. If only simple, straightforward tests are performed, it is easy to show that the software "works". Since these tests may not exercise a significant portion of the software, a large number of defects may remain to be discovered during actual operational use. It was therefore concluded that correctness demonstrations are an ineffective method of testing during software development. There is still a need for correctness demonstrations (acceptance testing, for example), as will be seen later in this paper.

The 1980s saw the definition of testing extended to include defect prevention. According to Boris Beizer [Beizer 1983], the act of designing tests is one of the most effective bug preventers known. As well, with the growing costs and effort dedicated to testing, it was recognized that a testing methodology was required; specifically, that testing must include reviews and that it should be a managed process.


The power of early test design was recognized at the beginning of the 1990s. Testing was redefined as "the planning, designing, building, maintaining and executing of tests and test environments" [Hetzel 1991]. This incorporated all of the ideas to date: good testing is a managed process and a total life cycle concern, with testability in mind [Beizer 1990].


2 Testing Fundamentals

This section provides the basis for the remainder of the paper. The first subsection gives some basic definitions and identifies the purposes and objectives of testing. Subsection Two shows how testing is applied to software in stages, starting at the smallest testable unit and ending with a complete demonstration of a system's functionality and capabilities. The various levels are introduced, with further specifics provided in later sections. The last subsection looks at common testing techniques and the types of errors they are likely to find.

2.1 Testing Overview

As the name suggests, Software Testing involves running a series of dynamic executions of the software product. For a test to be a test (sometimes called a test case), it must have the following properties:

1. A controlled/observed environment that allows the test to be reproduced, so as to verify that a defect was corrected,
2. A set of sample inputs that is (generally) a small subset of the possible values,
3. Predicted results that are expected from the sample inputs, so as to verify correct or incorrect operation. The predicted results must be available before the execution of the test case, and
4. An analyzed outcome that compares the predicted and actual results for each execution of the test.
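As an illustration, the following minimal sketch shows a test case with these properties, written with Python's standard unittest framework; the divide() function and its expected behaviour are hypothetical, introduced only for this example.

    import unittest

    def divide(numerator, denominator):
        """Hypothetical unit under test: truncating integer division."""
        if denominator == 0:
            raise ValueError("denominator must be non-zero")
        return numerator // denominator

    class DivideTestCase(unittest.TestCase):
        def test_typical_division(self):
            # Sample input drawn from the (much larger) input domain;
            # the predicted result (6) is fixed before the test is executed.
            self.assertEqual(divide(13, 2), 6)

        def test_zero_denominator(self):
            # Predicted outcome: the unit rejects a zero denominator.
            with self.assertRaises(ValueError):
                divide(13, 0)

    if __name__ == "__main__":
        # The test runner provides a controlled, reproducible environment and
        # an analyzed outcome (pass/fail) for every execution of each test.
        unittest.main()

Re-running the same suite after a fix provides the reproducibility called for in the first property.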

Tests are run to assess the level of quality in a software product. They determine what a system actually does, and to what level of performance. As well, testing helps to achieve software quality by finding defects, along with other activities in the software development process. Finally, testing helps to preserve quality by enabling a modified system to be retested, ensuring that what worked before still works.

Glen Myers [Myers 1979] states that

1. Testing is a process of executing a program with the intent of finding an error.
2. A good test case is one that has a high probability of finding an as-yet-undiscovered error.
3. A successful test is one that uncovers an as-yet-undiscovered error.

By uncovering errors in the software, their cause can be determined and a fix implemented. This results in a software product that contains fewer defects, achieving a greater level of quality. Additional tests may or may not uncover errors, which leads to the fundamental limitation of testing: testing can only show the presence of a defect; it can never prove the absence of all errors.

As is well known, producing error-free software is extremely difficult. The developer requires an assessment of the quality of the software before releasing it to customers. This assessment is derived from testing. If no or few errors are found, then the quality of the software is assumed to be good, providing some confidence that the product is working well.


From this it can be seen that software testing has two main objectives:

1. To uncover errors in the software product before its delivery to the customer (or to the next stage of development), and
2. To give confidence that the product is working well.

It is interesting to note that the two testing objectives result in what is known as the "Testing Paradox": if the first objective is to uncover errors in the software product, how can there be confidence that the product is working well, when it has just been shown that it is, in fact, not working?

It must also be noted that test cases that pass (i.e. produce the expected results) do not, in and of themselves, improve the quality of the software product. Nothing has changed in the software under test, so given that there is no change there can be no improvement or degradation. Rather, these tests only improve our confidence in the software product.

2.2 Testing Stages

Overall testing of a software system can be divided into essentially four levels or stages. Each stage parallels the level of complexity found within a software product during development.

At the lowest and simplest level is Unit Testing. Here, the basic units of the software are tested in isolation. (A unit is defined to be the smallest testable piece of software [Beizer 1990].) The objective is to find errors in these units, in either the logic or the data.

Units are assembled into larger aggregates called components. When two or more tested components or units are combined, the testing done on the aggregate is called Component Integration Testing, or just Integration Testing. The tests done at this level look for errors in the interfaces between the components. As well, the functions that can now be performed by the aggregate, but that were not testable individually, must also be examined.

After all the components have been assembled, the entire system can be tested as a whole; this is called System Testing. It might be argued that System Testing is just the final stage of component integration testing. However, at this stage the functional and/or requirements specification is used to generate the test cases. System Testing looks for errors in the end-to-end functionality of the system as well as errors in the non-functional requirements such as performance, reliability, and security.

The last stage of testing is Validation or Acceptance Testing. Here the system is handed over to the end-users or customers. The purpose of this testing is to give confidence that the system is ready for operational use, rather than to find errors. Thus, it is more correctly called a demonstration rather than a test.


Each level of testing addresses a different view of the quality of the software product, as shown in Table 1.

Software Testing Level      Quality View
Unit Testing                Manufacturing View of Quality
Integration Testing         Manufacturing View of Quality
System Testing              Product View of Quality
Validation Testing          User View of Quality

Table 1 Testing Levels versus Quality Views

A "Verification and Validation" software life cycle model is usually used to demonstrate the goals of the different testing stages. In this model, Verification and Validation refer to the testing activities. Verification comprises the testing activities that ensure a specification or function is correctly implemented. The activities that ensure the software that has been built is traceable to the original requirements specification are known as Validation. Boehm [Boehm 1981] states this in two simple sentences:

Verification: "Are we building the product right?"
Validation: "Are we building the right product?"

Figure 1 Verification and Validation Software Lifecycle Model


2.3 Testing Techniques

Software Testing Techniques provide systematic approaches for designing tests that exercise the internal logic of software components and the input and output domains of the program, in order to uncover errors in program functions, behaviour and performance. Testing techniques are therefore applied not only to the functional areas but also to the non-functional areas of the software. The next few sections examine several types of testing techniques.

2.3.1 Static and Dynamic Testing

The primary difference between static and dynamic testing is that static techniques do not exercise the software in its execution environment. Dynamic testing, on the other hand, involves the operation of the software with a set of test inputs, and results in a test output.

Static analysis testing developed from compiler technology, and it can uncover a significant number of errors. Such analysis can examine the control and data flow; check for dead or unreachable code; and identify infinite loops, uninitialized or unused variables, and standards violations. Certain measures, such as McCabe's Cyclomatic Complexity, are calculated statically and provide an assessment of the testability of the software entity.

Dynamic test techniques can be classified into functional or structural techniques. These are often referred to as Black Box and White (Glass) Box testing, respectively, and are described in the next section.

2.3.2 Black Box and White (Glass) Box Testing

Black-box tests are executed to validate the functional requirements without analysis of the internal logic of the program. In other words, the tester does not consider how a software component performs its function, but rather just that it does. The ideal black-box tests are executed based on a set of test cases that satisfy the following criteria [Myers 1979]:

1. test cases that reduce, by a count that is greater than one, the number of additional test cases that must be designed to achieve reasonable testing, and
2. test cases that tell us something about the presence or absence of classes of errors, rather than an error associated only with the specific test at hand.

The goal of black box testing is to find errors such as

1. incorrect or missing functions,
2. interface errors,
3. behaviour or performance errors, and
4. initialization and termination errors.

White box tests are executed to verify that the program runs correctly by considering its internal logical structure. Test items are chosen to exercise the required parts of the structure. The test cases designed for white box testing should satisfy the following criteria:

1. guarantee that all independent paths within a module are covered at least once,
2. exercise all logical decisions on both their true and false sides,


3. exercise all loops at their boundaries and within their operational bounds, and
4. ensure internal data structures are validated.

One concrete white box testing technique is Basis Path Testing, as proposed by [McCabe 1976]. There are also other techniques, such as Condition Testing [Tai 1989], Data Flow Testing, and Loop Testing.

White box testing and black box testing are complementary techniques. Using only one of them is impractical in real-world software testing. They are typically combined to provide an approach that validates both the interface and the internal workings of the software.

2.3.3 Equivalence Partitioning

Testing is more effective if the tests are distributed over the complete range of possibilities rather than being drawn from just a few. Input values that are processed in an equivalent fashion can be regarded as belonging to the same equivalence class. If the input domain is partitioned into such equivalent subsets, then a good set of test cases draws an input value from each subset rather than concentrating on only a few of them.
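As an illustration, consider the following minimal sketch; the ticket_price() function and its fare rules are hypothetical. Its input domain falls into three equivalence partitions (invalid ages, child fares, and adult fares), and one representative value is tested from each.

    def ticket_price(age):
        """Hypothetical function under test: child fare up to age 12, adult fare above."""
        if age < 0:
            raise ValueError("age cannot be negative")
        return 5 if age <= 12 else 10

    # One representative input per equivalence partition; any other value in
    # the same partition is assumed to be processed in an equivalent fashion.
    representatives = {"invalid": -3, "child": 7, "adult": 40}

    assert ticket_price(representatives["child"]) == 5
    assert ticket_price(representatives["adult"]) == 10
    try:
        ticket_price(representatives["invalid"])
    except ValueError:
        pass  # expected: the invalid partition is rejected
    else:
        raise AssertionError("negative age should have been rejected")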

2.3.4 Boundary Value Analysis

Boundary Value Analysis is related to equivalence partitioning: the values of interest lie on the boundaries of the equivalence partitions. As was suggested above, test cases should be drawn from each equivalence partition. Values on each side of a partition boundary should also be tested, addressing one of the most common coding errors, that of being "off by one". It might be argued that since any value within a particular equivalence set is as good as another, boundary testing is superfluous. It must be noted, however, that the equivalence partitions are defined by these boundary values.

Boundaries can exist on output data ranges as well. Tests should be designed to produce valid outputs and to (attempt to) produce invalid outputs.

Hidden boundaries, such as maximum string lengths or data set sizes, also need to be determined, and it must be verified that the software under test generates appropriate responses at them.
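Continuing the hypothetical ticket_price() example, a minimal boundary value sketch tests the values at and on either side of the partition boundaries (the child/adult boundary at 12/13 and the valid/invalid boundary at -1/0) to catch off-by-one errors in the comparisons.

    def ticket_price(age):
        """Hypothetical function under test (as in the previous sketch)."""
        if age < 0:
            raise ValueError("age cannot be negative")
        return 5 if age <= 12 else 10

    # Values that sit on the boundaries of the equivalence partitions.
    boundary_cases = [
        (0, 5),    # lowest valid age
        (12, 5),   # last value in the child partition
        (13, 10),  # first value in the adult partition
    ]

    for age, expected in boundary_cases:
        actual = ticket_price(age)
        assert actual == expected, f"age {age}: expected {expected}, got {actual}"

    try:
        ticket_price(-1)  # just outside the valid input range
    except ValueError:
        pass  # expected
    else:
        raise AssertionError("age -1 should have been rejected")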

2.3.5 Path Testing

A path is defined to be a sequence of program statements that are executed by the software under test in response to a specific input. In most software units there is a potentially (near) infinite number of different paths through the code, so complete path coverage is impractical. Notwithstanding that, a number of structural techniques that involve the paths through the code can lead to a reasonable test result.

2.3.5.1 Branch Testing

The structure of the software under test can be shown using a control flow diagram, illustrating statements and decision points. The number of linearly independent paths from the start to the end of the code is equal to the number of decision points plus one; this number is the Cyclomatic Complexity of the program [McCabe 1976]. Using the control flow diagram, test cases can be designed such that each exercises at least one new segment of the control flow graph. In theory the number of test cases required should be equal to the cyclomatic complexity; in practice, however, it is extremely difficult to achieve this operationally.
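The following minimal sketch illustrates branch testing on a hypothetical classify() function: it contains two decision points, so its cyclomatic complexity is 2 + 1 = 3, and three test cases cover a set of linearly independent paths through its control flow graph.

    def classify(value, limit):
        """Hypothetical function under test with two decision points."""
        if value < 0:              # decision point 1
            return "negative"
        if value > limit:          # decision point 2
            return "over-limit"
        return "in-range"

    # Each test case drives execution down a different branch of the graph.
    branch_tests = [
        ((-5, 10), "negative"),    # decision 1 taken true
        ((50, 10), "over-limit"),  # decision 1 false, decision 2 true
        ((3, 10), "in-range"),     # both decisions false
    ]

    for (value, limit), expected in branch_tests:
        assert classify(value, limit) == expected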

2.3.5.2 Condition Testing

In branch testing, only the value of the complete boolean expression at each decision point is taken into consideration, regardless of whether that expression is simple or compound. In Condition Testing, a test case is designed for each component of the boolean expression involved in a decision point.
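A minimal sketch with a hypothetical can_withdraw() check: the decision is a compound boolean, so separate test cases exercise each of its components rather than only the overall true/false outcome.

    def can_withdraw(balance, amount, frozen):
        """Hypothetical decision with a compound boolean expression."""
        return balance >= amount and not frozen

    condition_tests = [
        # (balance, amount, frozen) -> expected outcome
        ((100, 50, False), True),   # both components true
        ((10, 50, False), False),   # first component false
        ((100, 50, True), False),   # second component false
    ]

    for (balance, amount, frozen), expected in condition_tests:
        assert can_withdraw(balance, amount, frozen) == expected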

2.4 Testing Versus Inspections

There is much debate around the idea that software inspections at the code level can replace testing. Experience has shown inspections to be a powerful defect prevention methodology. It is often said that the best way to come to understand a topic is to have to teach or explain it to another; inspections encourage just that philosophy. However, inspections are a static analysis of the software being examined. Given the complexity of software systems being developed today, it is unlikely that this static analysis will be capable of detecting defects that involve significant interactions between multiple components, or of assessing performance characteristics. Such assessments can only come from a dynamic form of testing.

This leads to the potential conclusion that inspections may effectively address software quality issues at the unit test level and at some portion of the integration test level. However, as the complexity of the system under test increases, there is a threshold where static analysis must give way to the dynamic nature of testing. In other words, inspections and testing are complementary techniques for assessing and improving the quality of the software product. Each should be employed where its return on investment is greatest, or where the alternate technique is incapable of being completely successful.


3 Unit Testing

Unit testing is performed on the smallest unit of software, the software module. The module-level design descriptions are normally used as the guideline for unit testing. The internal paths are tested to uncover the errors within the boundary of the module. The interface, local data structures, boundary conditions, independent paths and error handling paths are examined at this testing level.

3.1 Procedures

For unit testing, driver and/or stub software is developed to execute the tests. A driver is a "main program" that accepts test case data as input and reports the corresponding results. Stubs are constructed to provide the module interfaces that are called by the component under test. The overall infrastructure for unit testing is shown in Figure 2.

Figure 2 Unit Testing Infrastructure

The Unit Test has a dynamic, white-box orientation.
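The sketch below illustrates the driver/stub arrangement of Figure 2 in Python; the report_total() module under test, its database dependency and the test data are all hypothetical.

    class DatabaseStub:
        """Stub: presents the interface the module under test expects,
        returning canned data instead of querying a real database."""
        def fetch_amounts(self, account_id):
            return [10.0, 20.0, 12.5]

    def report_total(account_id, database):
        """Hypothetical module under test: sums the amounts for an account."""
        return sum(database.fetch_amounts(account_id))

    def driver():
        """Test driver: a 'main program' that feeds test case data to the
        module under test and reports the analyzed outcome."""
        predicted = 42.5
        actual = report_total("ACC-1", DatabaseStub())
        print("PASS" if actual == predicted else f"FAIL: expected {predicted}, got {actual}")

    if __name__ == "__main__":
        driver()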

3.2 Metrics

Metrics that are typically applied at the Unit Test level are focused on Product Quality Assessment or Test Quality Assessment.

Code complexity metrics such as McCabe's [McCabe 1976] are used to help design test cases. Test Effectiveness Ratios such as

• Statement Coverage (TER 1),
• Branch and Decision Coverage (TER 2),
• Decision Condition Coverage, and
• Linear Code Sequence and Jump (LCSAJ) Coverage (TER 3)

are commonly used to assess the effectiveness of a given test set.

It is believed that a thorough test of a module should exercise every statement at least once, so 100% statement coverage (TER 1) should be achieved.



Complete statement coverage does not necessarily mean that all branches in a unit have been exercised. For example, if a decision has no "else" part, test cases that exercise only the "true" branch of the decision can achieve 100% statement coverage even though the "false" branch is never taken.
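A minimal sketch of this gap, using a hypothetical apply_discount() function: a single test with a premium customer executes every statement (100% statement coverage) yet never takes the false branch of the decision, so branch coverage remains incomplete until a second case is added.

    def apply_discount(price, is_premium):
        """Hypothetical function under test; the decision has no 'else' part."""
        discount = 0.0
        if is_premium:
            discount = 0.10
        return price * (1 - discount)

    # This single case achieves full statement coverage...
    assert abs(apply_discount(100.0, True) - 90.0) < 1e-9

    # ...but only this second case takes the false branch of the decision,
    # completing branch coverage (and exercising the default discount).
    assert abs(apply_discount(100.0, False) - 100.0) < 1e-9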

Defect tracking metrics such as

• Defect Arrival Rate,
• Defect Densities,
• Cumulative Defects by Severity, and
• Defect Closure Rates

are an important assessment of the quality of the product and of the rework/repair process.

Test Completeness metrics determine the progress of the testing effort. This is required by both the developer and the project manager in assessing the level of resources required to achieve the desired level of quality.

Time/cost metrics for test design, test debugging, test execution and analysis, and defect resolution are important in determining the true costs of testing and hence the true cost of quality.


4 Integration Testing

The testing done on the combination of two or more tested components or units is called Integration Testing. It is a systematic technique for conducting tests to uncover the errors associated with the interfaces between software units. As well, it exposes errors related to the larger functionality of the combined components under test. Integration testing follows unit testing, in that the components being integrated need to be unit tested first.

4.1 Procedures

There are two common ways to conduct integration testing. One is non-incremental integration, which uses a "big bang" approach: all the software units are assembled into the entire program, and this assembly is then tested as a whole from the beginning. This usually results in a chaotic situation, as the causes of defects are not easily isolated and corrected.

Alternatively, a much superior way to conduct integration testing is to follow an incremental approach. The program is constructed and tested in small increments by adding a minimum number of components at each interval. Therefore, the errors are easier to isolate and correct, and the interfaces are more likely to be tested completely.

Two different approaches have been identified as means of performing incremental integration testing: Top-Down integration and Bottom-Up integration. In Top-Down integration, modules are integrated from the main module (main program) down to the subordinate modules, in either a depth-first or breadth-first manner. Bottom-Up integration, as the name suggests, has the lowest-level sub-modules integrated and tested first; successively higher-level components are then added and tested, traversing the hierarchy from the bottom upwards.

When conducting integration testing, the new functionality of the integrated set must be confirmed. In addition, previously confirmed functionality must be tested again, because the newly integrated components may have broken the previously tested set. Such testing is known as Regression Testing. (Some authors suggest that it should really be known as "Anti-Regression Testing", i.e. confirming that the integrated components have not regressed in their level of quality.)
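A minimal sketch of how a regression suite might be kept and re-run after each integration increment; the authenticate() function stands in for the components already integrated, and the registered tests are hypothetical ones that passed on earlier builds.

    def authenticate(user, password):
        """Stand-in for functionality confirmed on earlier integration builds."""
        return user == "alice" and password == "secret"

    regression_suite = []

    def regression_test(func):
        """Register a previously passing test so it is re-run on every new build."""
        regression_suite.append(func)
        return func

    @regression_test
    def test_login_still_works():
        assert authenticate("alice", "secret") is True

    @regression_test
    def test_bad_password_rejected():
        assert authenticate("alice", "wrong") is False

    def run_regression_suite():
        """Re-run every registered test after a new component is integrated."""
        failures = []
        for test in regression_suite:
            try:
                test()
            except AssertionError:
                failures.append(test.__name__)
        print("regression failures:", failures or "none")

    run_regression_suite()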

Integration Testing is typically dynamic, and is usually black box.

4.2 Metrics

Metrics that can be collected during Integration Testing include

• Error Rates in design and implementation

As well, several of the metrics described in Section 3.2, Unit Test Metrics, may also be collected during integration testing. Specifically, these are the Test Execution Progress, Time/Cost, and Defect Tracking metrics.


5 System Testing

After all the components have been assembled, the entire system can be tested as a whole; this is called System Testing. The software must be tested in a context (for example, with certain hardware, users, or environment) that is as similar to the end-use context as possible. System Testing looks for errors in the end-to-end functionality of the system as well as errors in the non-functional requirements. At this stage, the functional and/or requirements specifications are used to generate the test cases. These should form a series of different tests whose primary purpose is to fully exercise the computer-based system.

5.1 Procedures

As indicated above, the test cases developed for system testing are derived from the functional and/or requirements specifications. The procedures used for functional system testing are essentially equivalent to those used in Integration Testing (Section 4.1).

Depending on the type of software system being tested, many non-functional tests may be required. These include items such as

1. Recovery Testing
2. Security Testing
3. Stress Testing
4. Performance Testing (a minimal sketch follows this list)
5. Maintainability
6. Usability
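The performance item above is illustrated by the following minimal sketch; the handle_request() entry point and the 200 ms budget for the 95th-percentile latency are assumptions introduced only for illustration, and real performance testing would normally be driven by a dedicated load-testing tool.

    import time

    def handle_request(payload):
        """Stand-in for the system's end-to-end request handling."""
        return {"echo": payload}

    def percentile(samples, fraction):
        ordered = sorted(samples)
        index = min(int(len(ordered) * fraction), len(ordered) - 1)
        return ordered[index]

    def run_performance_check(iterations=500, budget_seconds=0.2):
        latencies = []
        for i in range(iterations):
            start = time.perf_counter()
            handle_request({"id": i})
            latencies.append(time.perf_counter() - start)
        p95 = percentile(latencies, 0.95)
        print(f"95th percentile latency: {p95 * 1000:.3f} ms")
        assert p95 <= budget_seconds, "performance budget exceeded"

    run_performance_check()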

5.2 Metrics

Process metrics at this level of testing are those described in Section 4.2, Integration Testing Metrics.


6 Validation Testing

Once System Testing has been completed, the software product is handed over to the customers or users. Validation Testing marks the transition of ownership of the software from the developers to these users. As such, this level of testing differs from the previous three in the following ways:

1. Typically, Validation Testing is the responsibility of the users or customer rather than the developers. (Having stated that, it is often the case that the development organization will write the Validation Test procedures for approval and then execution by the customer.)

2. The intent of Validation Testing is to give confidence to the customer that the delivered system is working as intended, rather than to find errors in the implementation. As such, it is more correctly called a demonstration rather than a test.

3. Validation Testing can often include the testing of the customer's work practices, to ensure that the software is compatible with the organization's internal procedures and processes.

A simple definition for Validation Testing is that validation succeeds when the software functions in a manner that can reasonably be expected by the customer.

6.1 Procedures

There are three common forms of Validation Testing. The first is "Acceptance Testing", where an extensive series of formal tests, with specific procedures and well-documented pass/fail criteria, is planned and then executed.

A second form of Validation Testing is known as an "Alpha Test". Here the customer is placed in a controlled environment, usually at the developers' site. In the Alpha Test, any errors found are recorded and corrected promptly, since the developers and the customer work together. Typically, there are no formal test procedures; rather, the customers use the software in a manner that represents their intended use in their own environment. As such, if errors are found, the exact procedure that elicits an error may be difficult to repeat.

"Beta Testing", the third form of Validation Testing, is conducted at one or more customer sites by the end-users. Since the developers are not on site when the software is used, errors are reported back and then fixed. As with Alpha Tests, Beta Tests involve the execution of the software in the customer's environment, with typical use and without formal test procedures. Beta Testing is most often used with mass-market software. It is usually the final test performed before the software is released.

6.2 Metrics

During validation testing, the Defect Tracking metrics can be critical. Coupled with the time/cost metrics, they provide an ongoing evaluation of the Cost of Quality due to Software Testing.


7 Testing Organization Issues

7.1 Who Tests?

There are several possibilities as to which part of the software development organization actually performs the testing. It is important to take an independent view of the software under test, in order to ensure that no personal stake exists in not finding defects. For developers, this is quite difficult for two reasons. First, software engineers create the programs, documentation and related artifacts. They are proud of what they have built and look askance at anyone who attempts to find fault with it; this includes themselves. If they have produced what they consider their very best work, they will not want to immediately destroy it. Secondly, it is human nature to see what was intended rather than what is actually there. Studies have shown that an individual can only find 25 to 30% of their own errors [Beizer 1988].

Notwithstanding the above, the developer of a software unit is likely the one with the greatest understanding of its internal structure. As such, the developer is best suited to design the test cases based on that structure to ensure complete path coverage. Thus, it is often the case that the developers are responsible for testing the individual units of the program. In many cases, the developers may also conduct integration testing to a certain level.

An alternative to having developers perform all the testing activities is to create an Independent Test Group (ITG). This has several advantages: a separate team has true independence, and testing expertise is developed in the group that can be applied across multiple products. An ITG can also be a resource to the developers, ensuring testability and performing inspections during design and coding.

The software developer does not just turn over the software units to the test team and walk away. The two must work closely together throughout the development to ensure that thorough tests will be conducted. Further, while testing is being conducted, the developer must be available to correct defects as they are uncovered.

Issues can arise with an ITG, however. The test group is expected to "break" the thing that the developer has built, since they are paid to find the errors. This can lead to animosity between developers and the test team. Social qualities that the individuals on the test team must have to help minimize this animosity include being in control, competent, critical, comprehensive and considerate [Hetzel 1988].

Finally, customers make valuable testers. It is well known that as soon as customers start to use a new system, they have the "skill" to find large numbers of errors. This derives from the fact that they often exercise the system in ways that were never conceived of by the developers. The customers do not have the psychological bias present in the developers, because they have no preconceived notion of the architecture of the software. The customer is always involved in validation testing.


Table 2 shows, for each of the testing levels, who conducts the major portion of the testing effort.

                           Unit      Integration  System    | ----- Validation Testing -----
                           Testing   Testing      Testing   | Acceptance   Alpha     Beta
                                                            | Testing      Testing   Testing
Software Developers          X           X           X      |                 X
Independent Testing Group                X           X      |     X           X         X
Customers                                                   |     X           X         X

Table 2 Typical Tester versus Testing Level

7.2 When Should Testing Stop?

As was stated in Section 2.1, testing can only establish the presence of errors; it can never establish that the software is error free. The question therefore arises as to when sufficient testing has been completed. Beizer notes that "There is no single, valid rational criterion for stopping [testing]. Furthermore, given any set of applicable criteria, how each is weighted depends very much on the product, the environment, the culture and the attitude to risk." [Beizer 1990]

Metrics that might be employed in the determination of when to stop testing include

1. Defect Discovery Rate
If the discovery rate is decreasing and passes a particular threshold, it may mean that the product is ready to ship. However, it must be understood why the rate is decreasing. If it were due to less testing effort, or because no new test cases remain to be run, the decreased rate would not be representative of the software quality.

2. Trends in Severity Levels of Defects Found
If the trend in severity level is not downwards, then the product is not ready to ship.

3. Remaining Defects Estimation Criteria (based on past, historical data)

4. Percentage Coverage Measurements

5. Reliability Growth Models (Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR))

6. Running out of Testing Resources (budget or schedule).

Unfortunately, the last item, running out of resources, is one of the most common reasons for thetermination of testing.
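Returning to the first criterion above, the following is a minimal sketch of tracking the defect discovery rate; the weekly defect counts and the release threshold of three new defects per week are assumed values introduced only for illustration.

    weekly_defects = [14, 11, 9, 7, 4, 3, 2]   # defects found in each week of testing
    release_threshold = 3                      # assumed "ready to ship" discovery rate

    def week_over_week_change(counts):
        """Change in the number of defects discovered from one week to the next."""
        return [later - earlier for earlier, later in zip(counts, counts[1:])]

    trend = week_over_week_change(weekly_defects)
    rate_is_falling = all(change <= 0 for change in trend)
    below_threshold = weekly_defects[-1] <= release_threshold

    # A falling rate only suggests readiness if testing effort has stayed
    # constant and new test cases are still being run (see the caveat above).
    print("week-over-week change:", trend)
    print("candidate for release:", rate_is_falling and below_threshold)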

If testing is halted too soon, many defects may still be left in the software, some at the critical level. This results in a reactive environment in the development organization, with programmers "fire-fighting" the defects in the product rather than working on new features or new products. Turnover rates for key employees will increase because of this. In addition, the customers will be frustrated in their attempts to use the software and/or have it fixed.


If testing is continued beyond the ideal point, both the team and the end-users will be confident of the quality of the product; however, there are negative ramifications. There will be a significant increase in the product's cost, as testing is an expensive activity. Either margins will be smaller or the price of the product will be increased. Either way, the return on the software investment will be reduced, possibly to the point where the product is no longer economically viable. Late delivery may cost the organization revenue and market share, and may lead to project cancellations and frustrated customers.

Recall that software testing does not strive for perfection but rather for an acceptable level of risk. Sometimes the risk of shipping a flawed product is outweighed by the business risks arising from competition or time-to-market pressure.

7.3 Organizational Issues

As has been indicated previously, software testing needs to be carefully managed. To accomplish this, there are responsibilities at levels higher than the individual software developer that have an impact on the effectiveness of the testing.

At the project level, the project manager, together with the independent testing group, should make the decisions regarding:

• Assignment of resources for testing
• The testing plan (before development)
• Test case design (before or in parallel with development)
• Test execution (after development)
• Problem resolution (whenever needed)

At the organization level, executive senior management makes the following decisions about organizational policy:

• Set the testing policy, strategy and objectives for the company
• Make policy to ensure that metrics for test effort and results are collected and used
• Decide how much effort and resource will be invested in tool support
• Commit to improving the test process

Metrics that should be monitored across the organization include

• Error rates in design, inspection, test design, test execution, and operation
• Error severity and cost to correct
• Time and costs in test design, test execution, defect densities, defect repair, inspections and reviews

The testing policy should apply to the whole development organization, to ensure consistent quality across multiple products. This policy should include

• The objectives of testing
• Economic constraints for testing
• The quality and acceptance criteria for test documentation
• What tools will be provided to support the testing activities
• The evaluation of test effectiveness ratios


8 Discussion and Conclusions

Software Testing is part of a managed software development process. Testing itself needs to be managed carefully, as a significant portion of the development dollars is spent during this phase. There are several keys to successful testing.

• In order to be cost effective, the testing must be concentrated on areas where it will be most effective. In other words, the testing effort does not have to be uniformly distributed over the system being tested. It is most important to test the parts of the system that are most critical to the users.

• The testing should be concentrated on the components that are the most difficult, the most complex, the hardest to write, or the least liked. It is in these components that the greatest number of defects is historically found. Defects (bugs) are social creatures; they tend to congregate in the same "locations" regardless of the software product they are found in.

• The testing should be planned such that, when testing is stopped for whatever reason, the most effective testing in the time allotted has already been done.

Just as there are keys to success in testing, there are also many pitfalls.

• The absence of an organizational testing policy may result in too much effort and money being spent on testing, attempting to achieve a level of quality that is impossible or unnecessary. Alternatively, without a policy, insufficient time may be allocated to the testing phase. Either way, the testing is neither effective nor efficient.

• The actual testing must be planned. Doing the planning early in the development will identify what is needed in terms of test data, test drivers or harnesses. The need for test tools may also be determined in a timely fashion.

Uneven testing, where functions that are important to the customer are missed entirely or unimportant functions are tested to an unreasonable level, is also a result of unplanned testing. Priorities for all tests should be decided in advance.

• If the software is very difficult to test, then the job of the tester will also be difficult. Early test design helps to identify these issues and will help ensure testability. If the software is of very poor quality, a vast number of errors will be generated and testing will essentially stop because many tests cannot be completed. This should not be considered a testing issue but rather one of software quality.

Regardless of how good our testing and test management become, we will always be caught by the "Pesticide Paradox":

Every method used to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffectual. [Beizer 1983]


9 References

Boehm, B. Software Engineering Economics, Prentice-Hall, 1981.

Beizer, B. Software System Testing and Quality Assurance, Van Nostrand Reinhold, New York, 1984.

Beizer, B. Software Testing Techniques, Van Nostrand Reinhold, New York, 1983, and 1990 (2nd edition).

Craig, R. & Jaskiel, S. System Software Testing, Artech House Publishers, Boston, 2002.

Goodenough, J. & Gerhart, S. "Toward a Theory of Test Data Selection", IEEE Transactions on Software Engineering, SE-1(2), June 1975.

Hetzel, B., ed. Program Test Methods, Prentice-Hall, Englewood Cliffs, NJ, 1973.

Hetzel, B. The Complete Guide to Software Testing, QED Information Sciences, Wellesley, Mass., 1988.

Hetzel, B. "Software Testing: Some Troubling Issues and Opportunities", BCS Special Interest Group in Software Testing, Dec 6, 1991.

Hetzel, B. Making Software Measurement Work: Building an Effective Measurement Program, John Wiley and Sons, Inc., New York, 1993.

Malevris, N. & Petrova, E. "On the Determination of an Appropriate Time for Ending the Software Testing Process", Proceedings of the First Asia-Pacific Conference on Quality Software, Hong Kong, October 2000.

Marciniak, J., ed. Encyclopaedia of Software Engineering, John Wiley & Sons, 1994.

McCabe, T. "A Complexity Measure", IEEE Transactions on Software Engineering, SE-2(4), 1976.

Myers, G. The Art of Software Testing, John Wiley & Sons, New York, 1979.

Tai, K.C. "What to Do Beyond Branch Testing", ACM Software Engineering Notes, vol. 14, no. 2, April 1989, pp. 58-61.

Tennant, R. "Creating Five-Star Test Metrics on a One-Star Budget", International Conference on Software Testing, Analysis and Review, Anaheim, CA, 2002.

Vaughan, D. & Elledge, J. "Test Metrics Without Tears", International Conference on Software Testing, Analysis and Review, Orlando, FL, 2000.