Design of Stress Testing Focused on Resource

Yasuharu Nishi and Yoshinori Iizuka

Department of Chemical System Engineering, Graduate School of Engineering,

The University of Tokyo, Tokyo, 113-8656 Japan

SUMMARY

A stress test is a vital element of software system

testing. Although experiential methods have already been

proposed, no systematic design method using models has

been proposed. In this paper the authors first survey the

viewpoint of experienced engineers, then clarify the impor-

tance of focusing on resources. Next they expand their

concept of resources, then perform a causal analysis of

problems using a resource state model. Based on these

causes, the authors propose a design method and design

process for stress test items. In addition, the authors provide

an applied example of their proposed method. © 2002

Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 85(10):

51–68, 2002; Published online in Wiley InterScience

(www.interscience.wiley.com). DOI 10.1002/ecjc.10044

Key words: software test; system test; stress; load;

resource.

1. Introduction

Software testing can be divided into unit tests, com-

bined tests, function tests, and system tests [1]. The stress

test is one of the important tests to be performed in a system

test. Stress tests are becoming extremely important as par-

allel processing becomes more common and data storage

capacity rises.

Research related to stress tests includes Myers's conceptual categories [2] and Beizer's experiential method [3]. In addition, Kaner and colleagues have suggested using the boundary value analysis method [2] for stress tests [4].

When an actual stress test is designed, efficient test designs are created by combining the methods described above. This is possible only because an experienced engineer establishes the viewpoint on which the design of the stress test should focus.

On the other hand, Beizer described the need to simplify the concept of the program as a way to test a program efficiently [5]. In other words, test design requires a model, just as product design does. By introducing a model, test design can be performed as systematically as product design, and reviews and other design evaluations can be made simpler.

In prior test design, a data flow model and various

other models used for unit testing have been proposed, but

a model specific to stress test design has not been proposed.

Thus, in this paper the authors propose a model

particular to stress test design and a method for stress test

design based on the perspective of an experienced engineer.

Section 2 describes the problems found in existing methods.

Section 3 investigates and analyzes the perspective to be

focused on in stress test design as seen by an experienced

engineer, then proposes a resource state model for stress test

design. In Section 4 the conditions which result in problems

in the model are clarified through causal analysis using FT (fault tree)

diagrams, and stress is discussed. Section 5 explains the

methods of applying stress, then proposes a design method

for stress tests which focuses on resources. Section 6 offers

applied examples.


Electronics and Communications in Japan, Part 3, Vol. 85, No. 10, 2002

Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J83-D-I, No. 10, October 2000, pp. 1070–1086


2. Existing Methods

2.1. Myers's conceptual categories

Myers described the concepts necessary for stress

tests using the following four categories [2]:

• High volume tests (high volumes of data)

• Storage region tests (necessary storage region not

given)

• Stress tests (large amounts of data in brief times)

• Reliability tests (execution over the long term)

Myers goes no further than describing concepts, and so

specific test items cannot be designed.

2.2. Beizer's experiential method

Beizer defined a stress test as a test in which all

resources are filled at the same time by using high back-

ground loads [3]. A resource refers to a storage region in

memory or on a disk. In contrast to Myers's conceptual

categories, storage region tests and reliability tests are not

performed.

Beizer describes a design policy in which tests are

performed with consideration for the threshold values of

system resources. However, because no examples are given,

experience is required to design the specific test items.

2.3. The boundary value analysis method

Kaner's group suggests using the boundary value analysis method [2] for the design of stress tests [4]. In the boundary value analysis method, values at and near the boundaries of the input space and result space defined in the specifications are given as parameters.
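As an illustration of the boundary value analysis method, the sketch below generates the values at and immediately around both ends of a specified range. The function name and the 1-to-255 range are hypothetical and are not taken from the cited works.

```python
def boundary_values(lower: int, upper: int) -> list[int]:
    """Return test values at and next to both boundaries of [lower, upper]."""
    candidates = [lower - 1, lower, lower + 1, upper - 1, upper, upper + 1]
    # Remove duplicates (they appear for very narrow ranges) and sort.
    return sorted(set(candidates))


# Hypothetical specification: a field accepts 1 to 255 characters.
field_length_cases = boundary_values(1, 255)
```

Here `field_length_cases` contains the values just outside, on, and just inside each boundary.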

3. Modeling Programs Focused on

Resources

In this section surveys and analysis are performed of

the perspectives of an experienced engineer designing

stress tests. In accordance with these perspectives, the

authors propose a resource state model used to systemati-

cally design stress tests.

3.1. Investigation of perspectives needed for

stress test design

The authors surveyed and analyzed the perspectives on which experienced engineers focus when designing stress tests. In order to minimize bias, the engineers surveyed had over 3 years of experience at testing organizations which are asked by various large vendors to perform tests.

In the survey, the engineers were shown the test targets and asked to provide system test items as design results. There were nine types of test targets, including groupware, DBMS, operating systems, and device drivers. In order to simplify the responses, a brief explanation of Myers's conceptual categories* was given.

The survey results yielded 411 system test items in

12 categories. First, 144 items which could be used in

stress tests were selected from the items obtained; sec-

ond, 4 items in which the definition of a stress test itself,

that is, a test which imparts high levels of stress, was

included. Next, 138 items for which designs could be

created without the perspective of experience but instead

using the boundary value analysis method were ex-

cluded. In this fashion the perspectives focused on when

designing the 13 items obtained are thought to represent

the perspectives focused on by experienced engineers

when designing stress tests.

Further detailed analysis was performed on these 13

items. The boundary value analysis method was then ap-

plied to the storage capacity of 8 of these items after setting

up an exchange of memory and other information; in addi-

tion, the remaining 5 items became test items used to

increase the exchange time after setting up an exchange of

memory and other information. In other words, all of the test

designs were created after setting up an exchange of memory

and other information.

The majority of what is used for exchanging infor-

mation is memory, a disk or other storage region referred

to as a resource by Beizer.

On the other hand, some items used things such as the bandwidth of a network, which do not represent a storage region.

In addition, analysis results expressed at various lev-

els of abstraction were obtained for the presumed resource,

from abstract expressions like memory to concrete expres-

sions like 32-KB segments.

Based on the above, we can conclude that the perspectives necessary for the design of stress tests include "resources" at various levels of abstraction in the broad sense given by Beizer's definition.

*No investigation of efficiency tests, setup tests, or service type tests was made.

3.2. Outline of the resource state model

A model specific to stress tests is necessary for performing stress test design systematically. In this section the authors model the operation of a program in accordance with the resource perspective developed in the previous section. This model will be referred to as a resource state model. Figure 1 shows an outline of the resource state model.

In order to derive stress test items, this research uses the resource state model to describe the mechanisms by which stress generates problems. Here a problem refers to the occurrence of an illegal condition or of output results that do not match expectations [6].

For instance, a competitive state (race condition) which arises when a CPU becomes heavily loaded by multiprocess processing may leave the content of a database inconsistent. Likewise, when a large number of tasks issue WRITE operations to a resource continuously over very short times, a content inconsistency may occur in the state value of the resource.

In Section 3.3 a resource is modeled, and in Section

3.4 the resource state value and the related exchanges are

modeled. In Section 3.5 the concept of a layer is described

as a model for the level of abstraction of the resource, and

in Section 3.6 a task is described as a model of program

operation.

3.3. Resources

Although Beizer has represented memory, a disk, or

other storage region as a resource [3], based on the analyti-

cal results in Section 3.1, an experienced engineer would

also include network bandwidth as a resource.

Thus, in this paper the definition of a resource will be expanded to "an object which exchanges information held by itself in order to execute a program." As a result, network bandwidth can also represent a resource. Here the information held in a resource will be referred to as the resource "state value."

By using this definition the execution of a program can be taken to be "a chain of exchanges of the respective state values of several resources."

3.4. Resource state values and exchanging

them

The state values exchanged by resources include not

only the content of a resource, such as the data being input

or output or interim computational results, but also various

resource attributes such as pointers and mutexes.

For the sake of generality, the authors' model includes only the state values that are absolutely necessary when exchanges are occurring among several resources.

When one resource is working with several resources, the resource to be used must first be identified and indicated from among the other resources. If a large amount of data is passed as a form of stress, however, buffer overflow may result, making it impossible to identify the resource. In such cases a general protection fault or similar problem will occur.

When the resource is identified, the content of the

data it holds is passed to the designated resource. The

designated resource then holds the content of the passed

data thereafter.

When data are passed from several resources at the same time, exclusive control must be implemented in order to avoid a competitive state (race condition), and the resource must be designated as dominant (occupied). In particular, when a lot of data are passed as stress at the same time, competitive states can readily occur.

The resource state value can thus take three forms, which are expressed in the authors' model as shown below.

[Identification state value (ID)]

The identification state value is the state value held

by the resource in order to identify and indicate the resource

to be used from among the various resources. Pointers,

variable names, and sector numbers are examples of

identification state values.

[Content state value (DATA)]

The content state value is the content of the data to be

exchanged with another resource. The value of the variable,

and the data in memory or on disk represent the content

state value.

[Dominant state value (LOCK)]

The dominant state value is the state value held by the resource in order to indicate that the exchange is being performed exclusively under conditions in which exchanges with other resources are occurring at the same time. A mutex, or an idle transmission path in Ethernet, represents the dominant state value.

Fig. 1. Resource state model.

The exchange of state values occurring among re-

sources can be taken as representing two operations, those

of referencing and updating state values, when seen from

the viewpoint of one resource.

In order to reference a state value, the ID for the resource to be referenced is required. In order to update a state value, the ID and DATA for the resource to be updated are required. In the authors' model, the ID and other information needed for such referencing or updating are referred to as a "parameter." A parameter can be passed by a reference to another resource, as with a pointer; it can be written in the program, as with a variable name; or it can be passed from the outside, as with input from the user.

In other words, in order to model the exchange of state values between resources, "READ" and "WRITE" for the three types of state values as well as their respective parameters must be considered.*

Hereafter READ and WRITE of a state value are written R. and W., followed by an uppercase letter (a subscript in the original typesetting) which indicates the state value type (ID/DATA/LOCK). Similarly, P. indicates a state value passed as a parameter. For instance, RI refers to a READ of an identification state value (ID), WD to a WRITE of a content state value (DATA), and PI to an ID passed as a parameter. In addition, IDs which cannot be identified are called "NULL IDs."

3.5. Level of abstraction for resources

3.5.1. Layers and resource pools

Based on the analysis results obtained in Section 3.1,

an experienced engineer will represent resources using

various levels of abstraction from abstract expressions such

as memory to concrete expressions such as 32-KB seg-

ments, then will perform boundary value analysis.

As an example let us consider acquiring a memory

block from an OS heap region using an operation like

malloc() in the C language. If a test designer implements

the test design by referring only to the boundary value

representing the maximum size of the heap region, then

there is no need to consider a resource which represents a

memory block. Conversely, if the designer implements the

test by also considering the internal boundary value which

represents the maximum size of the memory block which

can be acquired, then a more concrete resource, such as a

memory block, must also be considered.

Thus, when a test design is implemented by also considering a more internal boundary value, a more concrete resource, that is, one which has a lower level of abstraction, must be used. In the authors' model, the level of abstraction of a resource is called a "layer," and a resource in a particular layer is defined as a subdivision of the resource in the layer above it. In other words, expressing the same resource as a resource in a lower layer means implementing test designs by considering the boundary values which occur at the subdivision. Here, in order to make the distinction between an upper and lower layer, the resource immediately above a particular resource will be referred to as the "resource pool." Figure 2 shows a diagram of a resource and a resource pool.

3.5.2. Allocating and freeing resources

In the malloc() example above, because there is a limit to the size of the heap region, memory blocks which are no longer used must be returned. If they are not returned, memory insufficiency and other problems resulting from memory leaks may occur when the resource acquisition is performed many times as a form of stress.

In the authors' model, an operation which involves acquiring a memory block is called "allocating" a resource, and an operation which involves returning a memory block is referred to as "freeing" a resource. Also, the total size of the resources which can be allocated will be referred to as the "size" of the resource pool. The size can be set externally, and is an important element for test design. Below, the expressions A and F will refer to allocating a resource and freeing a resource, respectively.
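The relationship between allocating, freeing, and the pool size can be illustrated with a minimal sketch. The class and function names are hypothetical, and the pool counts abstract units rather than bytes.

```python
class ResourceExhaustion(Exception):
    """The pool cannot satisfy an allocation request."""


class ResourcePool:
    def __init__(self, size: int) -> None:
        self.size = size      # total amount that can be allocated
        self.allocated = 0    # amount currently allocated

    def allocate(self, amount: int = 1) -> None:  # operation A
        if self.allocated + amount > self.size:
            raise ResourceExhaustion("pool size exceeded")
        self.allocated += amount

    def free(self, amount: int = 1) -> None:      # operation F
        self.allocated = max(0, self.allocated - amount)


def leaky_operation(pool: ResourcePool) -> None:
    pool.allocate()  # defect: the matching free() is missing


def correct_operation(pool: ResourcePool) -> None:
    pool.allocate()
    pool.free()


def runs_until_exhaustion(pool, operation, max_runs):
    """Repeat the operation as stress; return the run that fails, or None."""
    for run in range(1, max_runs + 1):
        try:
            operation(pool)
        except ResourceExhaustion:
            return run
    return None
```

With a pool of size 3, the leaky operation fails on its fourth run, while the correct operation never fails however often it is repeated.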

3.5.3. Overflow and insufficiency

As discussed in Section 3.5.1, when implementing a

test design by considering a resource with a more internal

boundary value, the resource operation must consider more

internal elements.

Fig. 2. Resource and resource pool.

*For convenience, a write to a dominant state value will represent a

dominant operation or a dominant clear operation.


As an example, let us consider an array with five

elements. In this instance, the array is the resource pool, and

the array elements are the resources. If the test designer only

considers the layer that is the array, array substitutions can

only be performed all at once. On the other hand, if the test

designer considers the layer that is the array elements,

then substitutions in the array can be considered as sets of

substitutions into the individual array elements.

If substitutions into the array elements are performed six times here, then problems will occur. This problem is referred to as buffer overflow. It arises when substitutions into an array are performed repeatedly and checks on boundary conditions are not performed appropriately. In other words, because the boundary condition is not checked, the sixth substitution exceeds the array boundary and is performed in an illegal region, and problems such as a general protection error or an update to a variable that should not be updated occur. In particular, if large amounts of data are passed as stress, these conditions can readily occur.

In this fashion operations must be subdivided in accordance with the breakdown of resources. If the parameter PS gives the number of subdivided operations, the array above would experience problems when PS is six for a resource pool whose size is five. This is called "overflow" and occurs when PS is larger than the size.

A separate problem occurs when PS is 4. If a complete substitution is performed with PS = 4 and later operations are then performed with PS = 5, an array element which should have been substituted has not been, and so the output will be irregular. This is called "shortage" and occurs when PS is smaller than the size.

Overflow and shortage are shown in Fig. 3. The

assignment of PS can be performed directly using a value

input by software, or it can be performed indirectly using

the data size.
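The overflow and shortage conditions reduce to a comparison between PS and the pool size. The sketch below uses hypothetical function names and reproduces the five-element array example.

```python
def classify_ps(ps: int, pool_size: int) -> str:
    """Compare the subdivision parameter PS with the size of the resource pool."""
    if ps > pool_size:
        return "overflow"
    if ps < pool_size:
        return "shortage"
    return "ok"


def unwritten_indices(ps_written: int, ps_read: int) -> list[int]:
    """Elements a later operation touches that the earlier write never filled."""
    return list(range(ps_written, ps_read))


# Five-element array from the text: PS = 6 overflows, PS = 4 falls short.
ARRAY_SIZE = 5
results = {ps: classify_ps(ps, ARRAY_SIZE) for ps in (4, 5, 6)}
stale = unwritten_indices(4, 5)  # index 4 keeps its old content
```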

3.6. Program operation

Based on the definitions above, program operation under the authors' model can be considered to consist of allocating the individual resources, repeating reads and writes, and then freeing the resources in succession.

In general, program operation is implemented using

parallel processing. In parallel processing, when a lot of

processes are passed as stress, deadlock or a starvation state

can occur.

The execution unit during parallel processing under the authors' model is referred to as a "task" and is defined as follows. A task consists of operations on one or more resources. When a resource to be operated on is controlled by another task, or when the conditions defined in the execution environment are met, a task transfers execution to another task and then "waits."

When a task waits, one of the waiting tasks is selected based on priority and then executed. This is called a "task switch" in the authors' model, and the priority for a task switch is called the "execution priority." When a particular task's execution priority is sufficiently low, if a lot of processes are passed as stress, a starvation state will result, with the task stopped in the wait state.

The determination of task selection and the number

of tasks being executed, as well as the number of tasks to

be executed in succession represent important elements of

test design. In addition, the determination of the speed of

resources which affect the wait time is also a part of test

design.
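How a low execution priority combined with stress leads to starvation can be seen in a toy simulation. The scheduler below is a deliberately simplified assumption: each task runs for exactly one tick, a lower number means a higher priority, and waits on resources are ignored. The task names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Arrival = Tuple[int, str]  # (execution priority, task name); lower runs first


def run_scheduler(arrivals: Dict[int, List[Arrival]],
                  ticks: int) -> Tuple[List[str], List[str]]:
    """Execute one waiting task per tick; return (executed, still waiting)."""
    waiting: List[Tuple[int, int, str]] = []
    executed: List[str] = []
    sequence = 0  # tie-breaker so equal priorities run in arrival order
    for tick in range(ticks):
        for priority, name in arrivals.get(tick, []):
            heapq.heappush(waiting, (priority, sequence, name))
            sequence += 1
        if waiting:  # task switch: select the waiting task with the best priority
            _, _, name = heapq.heappop(waiting)
            executed.append(name)
    still_waiting = [name for _, _, name in sorted(waiting)]
    return executed, still_waiting


# Stress: one high-priority request arrives on every tick.
loaded_arrivals = {tick: [(1, f"request-{tick}")] for tick in range(5)}
loaded_arrivals[0].append((9, "logger"))  # task with low execution priority
executed, still_waiting = run_scheduler(loaded_arrivals, ticks=5)
```

Under this load the `logger` task is never selected and remains in the wait state, whereas without the competing requests it would run on the first tick.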

4. Causal Analysis of Problems Using FT

Diagrams

In Section 4.1 the problems which can occur in

programs which are expressed using the resource state

model are described. The causes of these problems are

analyzed using FT diagrams in Section 4.2, and stress is

discussed in Section 4.3.

4.1. Problems in the resource state model

A problem represents a situation in which the output results do not match what was expected or in which an illegal phenomenon occurs [6]. An illegal phenomenon refers either to something that is deemed illegal regardless of the program specifications, such as the occurrence of an error or the program stopping, or to something for which the definition of illegality varies depending on the program specifications, such as a drop in speed. For the latter, the relationship with specifications is discussed in detail in a separate publication, and so in this paper only the former will be dealt with. As a result, this section clarifies the problems which can occur in the resource state model when output results do not match expectations or when a phenomenon is deemed illegal regardless of the program specifications.

Fig. 3. Overflow and shortage.

As described in Section 3.3, the execution of a program can be taken to be a chain of exchanges of resource state values. The output results can be considered the final link in the chain, and so when the output results do not match expectations, this means that the results written to a resource which makes up the chain do not match expectations.

On the other hand, an illegal phenomenon can be

considered to occur during individual operations in a task

or between operations.

An illegal phenomenon which occurs during an operation can be considered as occurring when the conditions necessary for the operation to succeed are not satisfied. These conditions can be broken down into two categories: those for acquire and those for write/read/free. The only condition necessary to successfully acquire a resource is that the size of the resource to be acquired is smaller than the size of the resource pool. If a large resource is to be acquired as stress, problems such as a memory insufficiency may occur due to memory leaks.

In contrast, the conditions necessary for a write/read/free for a resource are that the ID can be discriminated and that the resource is not controlled by another task. If a large amount of data is passed as stress, buffer overflow may occur, the ID can no longer be discriminated, and a problem such as a general protection error may occur. On the other hand, if a resource is controlled continuously as a result of stress, then problems such as deadlock, in which a task is not executed, may occur.

An illegal phenomenon which occurs between two

operations is related to task switching. Because a task

switch only involves selecting the tasks to be executed, a

problem such as the starvation state in which a task is not

selected or executed may occur.

In this manner there can be said to be four types of problems in the resource state model. These problems are represented in the following fashion in the authors' model.

[Internal Inconsistency]

An internal inconsistency occurs when the content state value (DATA) of a resource to be read does not match what is expected. A problem due to a competitive state is an internal inconsistency.

[Resource Exhaustion]

Resource exhaustion occurs when the size of the

resource pool is smaller than the size of the resource to be

acquired, and the resource cannot be acquired. Memory insufficiency caused by a memory leak represents resource exhaustion.

[Protection Exception]

Protection exception occurs when an ID which can-

not be discriminated (null ID) is passed as a parameter. A

general protection fault is a protection exception.

[Wait Stop]

A wait stop occurs when a task stop occurs while

waiting and not executing. Deadlock and the starvation

state represent a wait stop.

4.2. Analysis of causes of problems using FT

diagrams

In this section the causes for each of the problems

described in the previous section are analyzed. The analyses

use FT diagrams [e.g., 7].

An FT (fault tree) diagram is a tree which represents hierarchically the causes of a problem. An element in a branch is normally called an "event," the element at the root the "top event," and an element at a leaf a "basic event." A basic event is an event which cannot be analyzed further using the information given. Also, conditions called an "AND gate" and an "OR gate" are provided in order to express whether or not several causes must occur at the same time.
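The structure of an FT diagram can be sketched as a small recursive data type. The example tree below is a hypothetical illustration assembled from basic events of the kind listed in Table 1; it is not one of the diagrams in the Appendix.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Event:
    name: str
    gate: str = "BASIC"  # "AND", "OR", or "BASIC" for a leaf
    children: List["Event"] = field(default_factory=list)


def occurs(event: Event, active: Set[str]) -> bool:
    """Decide whether an event occurs given the set of active basic events."""
    if event.gate == "BASIC":
        return event.name in active
    results = [occurs(child, active) for child in event.children]
    return all(results) if event.gate == "AND" else any(results)


def basic_events(event: Event) -> List[str]:
    """Collect the leaves of the tree, left to right."""
    if event.gate == "BASIC":
        return [event.name]
    names: List[str] = []
    for child in event.children:
        names.extend(basic_events(child))
    return names


# Hypothetical tree: the defect alone is not enough; stress must also be present.
missing_free = Event("problem in the code: missing free")
overrun = Event("execution in excess of the size")
exhaustion = Event("resource exhaustion", gate="AND",
                   children=[missing_free, overrun])
```

The AND gate expresses that the top event needs both the trigger and the stress condition; replacing it with an OR gate would mean either cause suffices.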

In order to keep the FT diagrams from becoming confusing in this analysis, the authors created supplementary FT diagrams in which "A illegal," "A extra," "R./W./F. illegal," "R./W./F. extra," and "timing off" represent top events. In addition, in cases in which the cause of an internal inconsistency is an internal inconsistency in another resource or parameter, the FT diagram was made to be iterative. Note also that in an FT diagram, WO represents "occupy" and WR "release." The main FT diagrams and supplementary FT diagrams the authors created are in the Appendix.

In the authors' research the next step was to classify the derived basic events. This resulted in the basic events being broken down into seven categories. The basic events and their categories are shown in Table 1.

4.3. Stress in the resource state model

A problem occurs when the portion of a program that

is the trigger is executed. There are many causes of prob-

lems, such as specification errors and program defects due

to omissions, or hardware that fails after operational chaos.

Regardless, a problem will not necessarily occur even if the

trigger portion is executed.

For instance, a defect such as the lack of the code to

free a memory block which is acquired using malloc() will

not cause a problem merely by executing the defective part.

In order to cause a problem, stress must be applied in a form

which causes a number of executions in excess of the size

of the heap region.
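The amount of stress needed to expose such a leak can be estimated with simple arithmetic. The sketch below assumes, hypothetically, that every execution leaks the same number of bytes and that an allocation succeeds as long as the total still fits in the heap; the simulation is included only to cross-check the closed form.

```python
def executions_to_fail(heap_size: int, leak_per_run: int) -> int:
    """Number of the execution whose allocation is the first to fail."""
    return heap_size // leak_per_run + 1


def simulate_leak(heap_size: int, leak_per_run: int) -> int:
    """Step through the executions one by one and report the failing one."""
    used = 0
    run = 0
    while True:
        run += 1
        if used + leak_per_run > heap_size:
            return run
        used += leak_per_run


# Hypothetical figures: a 1024-byte heap and a 100-byte leak per execution.
needed = executions_to_fail(1024, 100)
```

Under these assumptions ten executions pass unnoticed and the problem appears only on the eleventh, which is why the defective code must be executed repeatedly as stress.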


In order to cause these kinds of problems, conditions in addition to the part which represents the trigger are at times necessary. When representing a program using the resource state model, the basic events in Table 1, derived using the FT diagrams in Section 4.2, represent the problem triggers and the conditions with respect to the problems described in Section 4.1.

In the authors' research the condition which causes a problem when executing the part which represents the problem trigger is referred to as "stress." In other words, stress in the authors' research can be taken to represent the conditions which expose a defect or failure that is not normally visible. Therefore, items such as the program problems or the hardware problems that appear in Table 1 indicate the parts which represent the problem triggers, and items such as size, frequency, speed, timing, and illegal input can be interpreted as classifications of stress. These items will be called "stress items." In the next section the characteristics and procedures necessary to control stress items are discussed.

5. Proposal of a Stress Test Design Method

Focused on Resources

Section 5.1 first explains the method for applying stress, in other words the characteristics used to control stress items. Section 5.2 describes the design of stress test items. In Section 5.3 the process for stress test design focusing on resources is proposed, and Section 5.4 explains the estimation of resources themselves.

5.1. Characteristics for controlling stress items

When designing a test, two categories of charac-

teristics must be delineated. The first is the qualitative

characteristics. Here qualitative refers to characteristics

such as what functions are executed, what features and

functions are combined and executed, what sequence is

used for the execution, and what data are used.

The second category is quantitative characteristics.

Here quantitative refers to characteristics such as how many

functions are executed at the same time, how many func-

tions are executed in succession, the size of the data, the

size of the resources, and the speed of the resources.

Because qualitative characteristics vary with software, deriving a general design policy is problematic. As a result, this section discusses quantitative characteristics which can control stress items. Such a characteristic will be referred to as a "stress driver." The remainder of this section gives examples of stress drivers. As shown in Table 2, the examples cover three categories of characteristics which can be controlled from outside in the resource state model: characteristics related to tasks and operations, characteristics related to parameters, and characteristics related to resources.

Table 2. Stress drivers

Model characteristics                             Stress driver
------------------------------------------------  ----------------
Characteristics related to tasks and operations   Burst stress
                                                  Long-run stress
Characteristics related to parameters             Limit stress
                                                  Volume stress
                                                  Omission stress
Characteristics related to resources              Speed stress

Table 1. Categorized basic events

Category           Basic event
-----------------  ---------------------------------------------------
Program problems   Problems in the code
                   Problems in the design
                   Problems with tasks when performing task switching
                   Low execution priority for a particular task
Hardware problems  Problems with hardware
Size               Execution in excess of the size
                   Input in excess of the size
                   Input below the size
Frequency          Too many tasks to be controlled
                   Too many tasks to be executed
                   The frequency of A is greater than that of F
Speed              Operation speed is fast/slow
Timing             Occurs between W. and R.
                   Occurs before the original F
                   Occurs after the original F
                   Occurs before even WO
                   Occurs before even WR
                   Occurs during a task switch
Illegal input      Input of null ID
                   Input of other resource ID
                   Input of ID that is the same as . . .
                   Input of strange DATA

5.1.1. Characteristics related to tasks

Quantitative characteristics related to tasks include

how many tasks are being executed at the same time and

how many tasks are being executed in succession.

Consequently, "burst stress," in which many tasks are executed at the same time, and "long-run stress," in which many tasks are executed in succession, can be mentioned as stress drivers. The number of tasks executed under burst stress or long-run stress is derived from the size of the resources when resource exhaustion or content discrepancies are targeted, and from the size of the resource which manages the tasks subject to task switching when a wait stop is targeted. A test item to reduce the size of the resource can also be designed when necessary.
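One way to derive task counts from a resource size, and to distinguish the two drivers, is sketched below. The assumption that every task consumes a fixed number of units, and all names used, are hypothetical.

```python
def task_counts(resource_size: int, units_per_task: int) -> dict:
    """Task counts that just fit in the resource and that just exceed it."""
    at_limit = resource_size // units_per_task
    return {"at_limit": at_limit, "over_limit": at_limit + 1}


def start_times(n_tasks: int, mode: str, interval: float = 1.0) -> list[float]:
    """Burst stress starts all tasks together; long-run stress starts them in succession."""
    if mode == "burst":
        return [0.0] * n_tasks
    if mode == "long-run":
        return [i * interval for i in range(n_tasks)]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical figures: a 100-unit resource, 8 units consumed per task.
counts = task_counts(100, 8)
burst = start_times(counts["over_limit"], "burst")
long_run = start_times(counts["over_limit"], "long-run", interval=2.0)
```

Reducing `resource_size` instead of raising the task count reaches the same boundary with a smaller test, which corresponds to the option of designing a test item that reduces the size of the resource.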

5.1.2. Characteristics related to parameters

Quantitative characteristics related to parameters can be designated directly or indirectly, as discussed in Section 3.5.3. Parameters which are designated directly include PS greater or less than the size, the null ID, other resource IDs, IDs which are the same as other resources, and strange DATA. The PS greater or less than the size and the null ID can be designed using the boundary value analysis method with the resource size as the boundary. This is referred to as "limit stress." The remaining characteristics, however, are not addressed here, because they are qualitative characteristics for which design cannot be performed without knowing internal information such as the resource ID or DATA attributes.

By contrast, parameters which designate indirectly include a PS greater or less than the size. A PS greater than the size can be designed either by setting the test data size greater than the resource size or by setting the resource size lower than the test data size. This is called "volume stress." Overflow can be caused by volume stress. Likewise, a PS less than the size can be designed either by setting the test data size lower than the resource size or by setting the resource size greater than the test data size. This is called "omission stress." Omission stress causes operational overflow because the size of the data is smaller than the size of the resource to be operated on; in other words, it takes effect when the program does not check for insufficiencies. As a result, it is vital to identify the functions which may cause insufficiencies. For instance, there are cases in which the data size is set above specifications, but less data than the specifications indicate is actually passed.
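The difference between volume stress and omission stress can be sketched as test data generators. This is only an illustration under stated assumptions: the function names, the `declared_size` field, and the byte-string payload are hypothetical, and the omission case models the situation in which less data is passed than was declared.

```python
def volume_stress_data(resource_size, excess=1, fill=b"A"):
    """Test data larger than the resource, intended to provoke overflow."""
    return fill * (resource_size + excess)


def omission_stress_case(declared_size, shortfall=1, fill=b"A"):
    """Test case whose payload is shorter than the size it declares.

    A program that trusts declared_size without checking the payload
    length would operate on more data than it actually received.
    """
    actual = max(declared_size - shortfall, 0)
    return {"declared_size": declared_size, "payload": fill * actual}
```

The resource size itself could be lowered instead of enlarging the data; the sketch only shows the data-side variant.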

5.1.3. Characteristics related to resources

Quantitative characteristics related to resources include how large to make the resource size and how high to set the resource speed. The resource size is not treated as an independent stress driver, because it is determined relative to the size of the data given and the number of tasks being executed. The resource speed, on the other hand, is a characteristic associated with timing problems, and it affects the next operation and operations being executed at the same time. Thus, the speed of the resource in question is derived from the speed of the resource handled by the next operation or by the operations of tasks being executed at the same time. The speed can be changed by updating the resource settings or by exchanging the resources themselves. This is called "speed stress."

5.2. Stress test item design

In this section, the method of applying stress, that is, the design of stress test items, is explained by describing the relationship between each stress item and the stress drivers.

The size stress item can be applied using volume, limit, burst, or long-run stress to cause a data overflow with respect to a resource, or using omission stress to cause an operation leak with respect to the resource (see Table 3).

The frequency stress item can be applied by burst stress. Also, because operations are subdivided under volume and limit stress and therefore occur in large numbers, these drivers can apply frequency stress as well.

The speed stress item can be applied by speed stress. Also, if the resource in question is linked to virtual memory and has a trade-off relationship between size and speed, then thrashing due to heavy swapping resulting from volume, limit, burst, or long-run stress occurs, and stress which reduces speed can be created.

For the timing stress item, the designer analyzes which software functions correspond to which operations on which resources, then controls the operational sequence by combining functions and controls the execution start time of the functions precisely. Because precisely controlling the execution start time of functions is difficult in practice, the probability of execution with the desired timing can be raised by burst and long-run stress. Also, lowering the speed should make the execution start timing easier to control.

The illegal input stress item can be applied using overflow resulting from volume, burst, or long-run stress, or using input which exceeds its range as a result of limit stress.

Unlike problems in the program, problems with hardware occur probabilistically, and occur more frequently with wear. As a result, stress can be created by burst and long-run stress, as well as by volume and limit stress.

The relationship between the stress items and stress

drivers described above is shown in Table 3.
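For tool support, the relation in Table 3 can be held as a simple lookup structure. The following is a minimal sketch: the dictionary mirrors the rows of Table 3 (with the swap-induced slowdown labeled "thrashing"), while the names `STRESS_TABLE`, `drivers_for`, and `items_for` are hypothetical.

```python
# Stress item -> {stress to be applied -> stress drivers}, after Table 3.
STRESS_TABLE = {
    "size": {
        "size is large": ["volume", "limit", "burst", "long-run"],
        "size is small": ["omission"],
    },
    "frequency": {
        "frequency is high": ["burst"],
        "subdivided": ["volume", "limit"],
    },
    "speed": {
        "speed varies": ["speed"],
        "thrashing": ["volume", "limit", "burst", "long-run"],
    },
    "timing": {
        "probability is high": ["burst", "long-run"],
        "control is easy": ["speed", "volume", "limit", "burst", "long-run"],
    },
    "illegal input": {
        "input is direct": ["limit"],
        "input is indirect": ["volume", "burst", "long-run"],
    },
    "problems with hardware": {
        "probability is high": ["burst", "long-run", "volume", "limit"],
        "wear": ["long-run"],
    },
}


def drivers_for(item):
    """All stress drivers that can apply the given stress item."""
    rows = STRESS_TABLE[item]
    return sorted({d for drivers in rows.values() for d in drivers})


def items_for(driver):
    """All stress items that the given stress driver can apply."""
    return sorted(
        item
        for item, rows in STRESS_TABLE.items()
        if any(driver in drivers for drivers in rows.values())
    )
```

A lookup in either direction then answers which drivers to design for a chosen stress item, or which stress items a single driver exercises at once.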

5.3. Stress test design process for resources

In this section, the process for designing and implementing the stress test items derived in the previous section is explained.


Using the stress test items derived in the previous section on real software first requires a phase in which the resources of the software are analyzed. In this phase, all of the resources must first be described. If the description of a resource is missing, none of the problems associated with that resource can be detected. Resources which are not described must therefore be estimated. Such estimation is touched on in Section 5.4. In addition, it must be decided down to which layer the resources are to be described.

Next, a phase for the analysis of functions is necessary. In this phase, the functions of the software are applied to the resource state model. First, all of the functions of the software are described. Next, it must be determined which resource operations the described functions correspond to, and which tasks, performing what kinds of operations, they can be taken to represent. Then, the before-and-after relationship as well as the possibility of simultaneous execution can be determined based on the interdependence between operations for each resource.

Once the analysis of functions has been performed, the stress design phase begins. In this phase, the stress items derived from the resource state model using the FT diagrams are turned into real test cases. First, the stress items to be applied to each resource are analyzed. Next, the stress drivers to be used are analyzed, and the characteristic values needed to apply the stress are calculated by applying the boundary value analysis method to each characteristic. In accordance with the calculated characteristic values, the functions to be tested, as well as their combinations, sequence, and timing, are determined, and then test data are designed. Finally, the remaining necessary items are determined, and the test cases are created.

The final phase of test design is the phase in which the test case priority is determined. First, the importance of each resource is decided, taking into account the severity of the consequences when problems occur. Resources which are central to the architecture, or resources in which problems result in security holes, are deemed to be very important. In addition, the importance of functions based on product characteristics is also considered, and the test case priority is determined.
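The prioritization phase can be sketched as a sort over test cases. This is a hypothetical illustration: the class `StressTestCase`, the integer importance scales, and the rule that resource importance is compared before function importance are all assumptions, since the paper does not prescribe a scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class StressTestCase:
    name: str
    # Hypothetical scale: 3 = architectural core or security hole, 1 = minor.
    resource_importance: int
    # Hypothetical scale based on product characteristics.
    function_importance: int


def prioritize(cases):
    """Order test cases by resource importance, then function importance."""
    return sorted(
        cases,
        key=lambda c: (c.resource_importance, c.function_importance),
        reverse=True,
    )
```

For example, a case that targets a buffer whose overflow would open a security hole is scheduled ahead of a case for a cosmetic function, regardless of how the latter ranks among functions.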

The test design process also requires efforts to prevent errors and omissions in test items. For this reason, a design review is implemented during each of the phases described above. In addition, for functions that cannot be mapped to a task on a resource because resource estimation was insufficient, test omissions must be prevented as far as possible by using stress applied to similar functions or by applying the boundary value analysis method to the data passed to the function.

The test execution phase comes after test design. Figure 4 outlines the procedure above. The design method proposed in Sections 5.2 and 5.3 will be referred to as "resource-oriented stress testing."

Under this proposed method, stress test items can be

designed systematically while bearing in mind the cause-

and-effect relationship with problems by using the resource

state model. This is quite different from the Myers category

method and the Beizer example method.

In addition, the boundary value analysis method typically focuses only on the input and output values for its specifications. As a result, even if detailed information related to the design and implementation of tests is available, it cannot be used effectively to discover the cause-and-effect relationship with problems.

Table 3. Relation between stress items and stress drivers

Stress item             Stress applied       Stress driver
Size                    Size is large        Volume, limit, burst, long-run
                        Size is small        Omission
Frequency               Frequency is high    Burst
                        Subdivided           Volume, limit
Speed                   Speed varies         Speed
                        Thrashing            Volume, limit, burst, long-run
Timing                  Probability is high  Burst, long-run
                        Control is easy      Speed, volume, limit, burst, long-run
Illegal input           Input is direct      Limit
                        Input is indirect    Volume, burst, long-run
Problems with hardware  Probability is high  Burst, long-run, volume, limit
                        Wear                 Long-run

5.4. Resource estimation

The method proposed in the authors' research is effective when the test designers have sufficient information about resources. However, information about resources that must be considered after the detailed design of the software is often not given to the test designers. In such cases, the test designers must extract implicit, unknown resource information by performing a task known as "estimation." In particular, the specifications of recent software are fluid, and so the task of estimation is becoming increasingly important.

That said, estimating the presence and the characteristics of resources depends on experience, as does test design. Thus, providing the test designers with perspectives for estimation beforehand makes precise estimation possible. A full enumeration of such perspectives goes beyond the scope of this research, so only examples of perspectives for estimating the resources themselves are given here.

• Functions which use prior execution results

• Multiple functions which use the same type of data

• Software behavior

• Knowledge accumulated or acquired beforehand

6. Applied Example

This section provides an applied example of the proposed method using software which has been shipped to the marketplace. For reference, results of the experiential method are also given, together with a comparison and commentary. The software under test consists of two World Wide Web browsers.

6.1. The tests

For these tests, two commercial World Wide Web browsers were obtained. Together, these two products effectively split the market. They will be referred to as Browsers A and B. The OS was Windows 95.

The tests were implemented for the text functions and table functions activated by the tags shown in Table 4. The text functions were the 23 formatting tags classified as Font Style, Headings, and Phrase in HTML 3.2 [8]. The table functions were the table tags TABLE, TR, and TD.

6.2. Design of stress test items

Test design using the proposed method was com-

pleted in roughly 2 hours by the authors. For the conven-

tional test design, design results from engineers with over

8 years of experience in test organization were used. The

total time required for design was 4 hours.

6.2.1. Test design using the proposed method

In test design using the proposed method, resource estimation was performed first, in accordance with the design process proposed in the previous section. For the formatting tags, focusing on the nesting of tags from the standpoint of "functions which use prior execution results" led to the estimation of a resource which stores the depth of the nesting. This estimated resource will be referred to as the tag buffer. Table 4 shows the tags which use the tag buffer. For the table tags, a resource which stores the depth of the nesting was estimated in the same way as for the formatting tags, but the resource characteristics could not be estimated. This estimated resource will be referred to as the table buffer.

Fig. 4. Design process of resource-oriented stress testing.

Next, task analysis was performed. The following results were obtained: the buffer is acquired and written to each time the World Wide Web browser reads a tag. Then, the buffer is read whenever the read data are displayed, and is freed each time a World Wide Web page is displayed.
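The result of this task analysis can be pictured with a small model of the estimated buffer. This is a sketch under stated assumptions: the class `TagBuffer`, its method names, and the fixed `size` are hypothetical, and the real browsers' internal structure is not known. Each tag read acquires a slot and writes to it, display reads the slots, and completing a page frees them.

```python
class TagBuffer:
    """Hypothetical model of the estimated tag buffer."""

    def __init__(self, size):
        self.size = size
        self.slots = []

    def on_tag_read(self, tag):
        # Acquire a slot and write the tag into it.
        if len(self.slots) >= self.size:
            raise OverflowError("tag buffer exhausted")
        self.slots.append(tag)

    def on_data_displayed(self):
        # Read: the stored nesting determines how the data are formatted.
        return list(self.slots)

    def on_page_displayed(self):
        # Free everything that was acquired for this page.
        self.slots.clear()
```

Under this model, nesting tags more deeply than `size` before the page is displayed is what exhausts the buffer, which is why the nesting depth becomes the characteristic to stress.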

For the test cases, only volume stress was designed, targeting the estimated buffers. Volume stress was designed for the tags which use the estimated buffer, and for the tags which do not use it, the test items using volume stress were reused. Because the resource characteristics could not be estimated for the formatting tags, 46 items were designed using a nesting depth of 10,000 or 1000, chosen in view of the limitations of execution time. In order to keep the data file size at roughly the same level as in the formatting tag test cases, 6 items were designed for the table tags: a table with 1000 rows and columns, a nested table with 10 rows and 10 columns, a table with 10,000 rows and columns, and a table with 30 levels of nesting in 30 rows and 30 columns.
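Test data of this kind are easy to generate mechanically. The following sketch builds nested formatting tags and nested tables; the function names, the default cell text, and the choice to place each inner table in the first cell of its parent are assumptions made for illustration, since the paper does not describe how its data files were produced.

```python
def nested_tag(tag, depth, text="x"):
    """Nest the same formatting tag `depth` times around `text`."""
    return f"<{tag}>" * depth + text + f"</{tag}>" * depth


def nested_table(rows, cols, levels, text="1"):
    """Build a rows x cols table nested `levels` deep.

    Assumption: each inner table sits in the first cell of its parent.
    """
    content = text
    for _ in range(levels):
        body = []
        for r in range(rows):
            cells = []
            for c in range(cols):
                cell = content if (r == 0 and c == 0) else text
                cells.append(f"<TD>{cell}</TD>")
            body.append("<TR>" + "".join(cells) + "</TR>")
        content = "<TABLE>" + "".join(body) + "</TABLE>"
    return content
```

For instance, `nested_tag("B", 1000)` corresponds to a volume stress item with a nesting depth of 1000, and `nested_table(30, 30, 30)` to the table item with 30 rows, 30 columns, and 30 levels of nesting.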

As for the priority of the test cases, the test cases for tags for which a buffer could be estimated were given priority. This is because a buffer overflow in a World Wide Web browser results in a major security hole. No design review was implemented.

6.2.2. Test design using the experiential method

For formatting tags, three tests were designed.

First was a "same tag iteration test," in which the same tag was repeated 100 times without nesting and the amount of data in the tags increased in proportion to the number of repetitions.

Next, a "same tag nesting test" was designed with 100 levels of nesting for <BIG> and <SMALL>, the tags for which multiple designations have an effect. For the eight Phrase tags, a "distinct tag nesting test" was designed to take the other 15 tags besides Phrase and nest them cyclically.

Six types of test data were used: half-width and full-width numbers, half-width and full-width Katakana, full-width Hiragana, and Kanji characters.

The test items designed consisted of a total of 198

items: 138 items for the same tag iteration test, 12 items for

the same tag nesting test, and 48 items for the distinct tag

nesting test.

For the table tags, a test item was used in which tables from 1 row and 1 column up to n rows and n columns were stacked vertically. There were three items, with n = 30, 40, and 50. The test data consisted of half-width numbers.

6.3. Detection results

6.3.1. Detection results using the proposed method

In the proposed method, 26 problems were detected

using the 53 test cases*. Analysis of similar problems

showed that they could be classified into five categories.

Table 5 shows the detection results.

6.3.2. Detection results using the experiential method

In the experiential method, 92 problems were de-

tected using 198 test cases. Here as well, analysis of similar

problems showed they could be classified into four catego-

ries. Table 6 shows the detection results.

6.4. Evaluation and discussion

With the proposed method, problems which had not been detected by the test organization during development were readily detected. In particular, the general protection fault detected in the table tag test for Browser B represents a security hole for a World Wide Web browser, and so is a problem which must be detected and corrected properly. Detecting a problem like this is highly significant.

In addition, in the formatting tag test using the pro-

posed method, the problems of extra lines and new lines

being out of position detected under the experiential

method could not be detected. Also, priority for tags which

did not use the tag buffer was lowered.

Nevertheless, reviewing the detection results of the proposed method after test execution revealed design errors such as a resource estimation omission: a buffer for the size of the screen had not been estimated. Because design review in the proposed method focuses on resources, an appropriate design review would have identified this estimation omission and the problems that consequently could not be detected. By contrast, reviewing the detection results of the experiential method after test execution was complex and difficult because each test case had to be considered individually, and the causes of the detection omissions could not be identified. In the same fashion, a design review under the experiential method would have to be carried out for each test case, so it is expected to be complex and difficult.

*In the test item (30 rows, 30 columns, 30 levels of nesting) for the table tags in Browser B, the problem which occurred at the same time that Browser B abnormally ended could not be detected. As a result, a test case with 5 rows, 5 columns, and 5 levels of nesting was specially designed and implemented.

Table 4. Tags to be stress-tested and tags guessed to use buffers

Category    Tags to be tested                               Tags using buffers
Font style  B, BIG, I, SMALL, STRIKE, SUB, SUP, TT, U       B, I
Headings    H1, H2, H3, H4, H5, H6                          H1, H2, H3, H4, H5, H6
Phrase      CITE, CODE, DFN, EM, KBD, SAMP, STRONG, VAR     CITE, DFN, EM, STRONG
Tables      TABLE, TR, TD                                   TABLE

In this example, the number of functions to be tested

was limited, and the qualitative characteristics of the func-

tions were comparatively clear. When designing test items

in practice, the qualitative characteristics of the functions

must be extracted, and then the selection and combination

of functions must be found using an FT diagram. This work

requires interpretation of a resource state model and FT

diagrams, though the interpretation of the model and FT

diagrams, as well as the analysis of tasks may not be so easy,

depending on the test in question.

In addition, a comprehensive enumeration of resources is still difficult, even if the internal structure is known. As a result, the authors expect enumeration based on the importance of resources and on estimation viewpoints to become more important. In particular, establishing viewpoints for estimating resource characteristics, which cannot be estimated at present, represents a vital task.

7. Conclusion

In this paper the authors created a resource state

model, a model of the operation of programs which focuses

on resources. The authors then performed causal analysis

of problems in accordance with their model, and described

a design method for stress test items. In addition, the

authors proposed a stress test design process which focuses

on resources. Using the proposed method, systematic and

efficient stress test design becomes possible.

Table 5. Results of the resource-oriented stress testing

Browser    Test             Test items for which problems were detected  Detected problems
Browser A  Formatting tags  Tags using the tag buffer                    Nesting invalid
Browser A  Table tags       30 rows, 30 columns, 30 levels of nesting    Table frame was inverted, and upper right frame was lost
Browser B  Formatting tags  Tags not using the tag buffer                Upper right character lost
Browser B  Table tags       30 rows, 30 columns, 30 levels of nesting    General protection fault and memory insufficiency
Browser B  Table tags       5 rows, 5 columns, 5 levels of nesting       Upper right frame and lower frame lost

Table 6. Results of non-resource-oriented stress testing

Browser    Test             Test items for which problems were detected  Detected problems
Browser A  Formatting tags  Same tag iteration test                      New line position off
Browser A  Table tags       None                                         None
Browser B  Formatting tags  Same tag iteration test                      New line position off
Browser B  Formatting tags  Same tag iteration test                      Extra line present
Browser B  Table tags       All                                          Upper right frame and lower frame lost

Major problems which cannot be detected during the

development process can be detected by applying the pro-

posed method to real-world software products.

Future topics of research include a proposal for a

detailed analysis method for tasks and the interpretation of

the resource state model and FT diagrams, a proposal for a

viewpoint of estimation and examples based on the impor-

tance of resources, and large-scale use in real-world test

design.

Acknowledgments. The authors express their

gratitude to Professors Hitoshi Kubome and Musashi

Nakashino of the Science and Engineering Faculty of Chuo

University for their valuable input, Professor Shimon Ken

of Musashi Institute of Technology, and the members of the

System Evaluation Office of the Business Products First

Division of CSK Business Systems. In addition, they thank

Mercury Interactive Japan K.K. for providing their power-

ful tools. Also, the authors thank their readers for invaluable

input.

REFERENCES

1. Pressman RS. Software engineering: A practitioner's approach. McGraw-Hill; 1997.

2. Myers GJ. The art of software testing. John Wiley & Sons; 1979. Software testing methodology. Kindaikagakusha; 1980.

3. Beizer B. Software system testing and quality assurance. ITP; 1996.

4. Kaner C, Falk J, Nguyen HQ. Testing computer software. ITP; 1993.

5. Beizer B. Software testing techniques. Van Nostrand Reinhold; 1990. Software testing techniques. Nikkei BP Corp.; 1994.

6. Beizer B. Black-box testing. John Wiley & Sons; 1995. An introduction to real-world software testing. Nikkei BP Corp.; 1997.

7. Shiomi H, Shimaoka J, Ishiyama K. FMEA and FTA applications. Nikka Giren Publishing; 1984.

8. Raggett D. HTML 3.2 Reference Specification, W3C Recommendation, 1997. http://www.w3.org/TR/REC-html32.html

APPENDIX

1. Supplemental FT Diagrams Using Causal Analysis

Figure A-1 shows a supplemental FT diagram for A

illegal, A extra, and R./W./F illegal. Figure A-2 shows a

supplemental FT diagram for R./W./F extra and timing off.

2. Main FT Diagrams Using Causal Analysis

Figure A-3 shows a main FT diagram for an internal

inconsistency and resource exhaustion. Figure A-4 shows a

main FT diagram for a protection exception and wait stop.

Fig. A-1. Supplemental fault trees (1).

Fig. A-2. Supplemental fault trees (2).

Fig. A-3. Main fault trees (1).

Fig. A-4. Main fault trees (2).

AUTHORS (from left to right)

Yasuharu Nishi (student member) graduated from the Department of Chemical Systems in the School of Engineering at

the University of Tokyo in 1995 and completed his master's program in 1997. Currently he is working on his doctorate there.

He is pursuing research related to software quality, particularly software testing. He is a member of the Information Processing

Council, the Japan Quality Control Society, IEEE-CS, and ACM.

Yoshinori Iizuka graduated from the Department of Statistics in the School of Engineering at the University of Tokyo

in 1970 and completed his master's program in 1975. He then became a lecturer in the College of Electronic Communications.

He became a lecturer in reactive chemistry at the School of Engineering in 1976, an assistant professor in 1984, and in 1994

moved to the Department of Chemical Systems in the Graduate School due to a restructuring there. He has been a professor

there since 1997. He holds a D.Eng. degree, and is pursuing research related to quality control and statistical analysis. He is a

member of the Japan Quality Control Society, the Applied Statistics Association, the Japan Reliability Council, the Chemical

Engineering Association, and ASQ.
