
Character String Predicate Based Automatic Software Test Data Generation


Page 1: Character String Predicate Based Automatic Software Test Data Generation

Character String Predicate Based Automatic Software Test Data Generation

Ruilian Zhao

Computer Science Dept.

Beijing University of Chemical Technology

[email protected]

Michael R. Lyu

Computer Science Dept.

Chinese University of Hong Kong

[email protected]

Page 2: Character String Predicate Based Automatic Software Test Data Generation

Outline

Introduction

An overview of related work

Test data generation based on character string predicate

Experimental results

Conclusion

Page 3: Character String Predicate Based Automatic Software Test Data Generation

Introduction

Software testing is usually difficult, expensive and time-consuming.

If test data could be automatically generated, the cost of software testing would be significantly reduced.

Page 4: Character String Predicate Based Automatic Software Test Data Generation

Introduction

There are many automatic test data generation approaches.

However, little attention has been paid to test data generation for programs whose predicates contain character string variables.

Page 5: Character String Predicate Based Automatic Software Test Data Generation

Introduction

The character string is an important element in programming.

So, how to generate character string test data is a problem that needs further research.

Here, we present an approach to automatically generate test data for program paths that include character string predicates, and a corresponding test data generator is developed.

Page 6: Character String Predicate Based Automatic Software Test Data Generation

Introduction

The effectiveness of the test data generator is examined on a number of programs.

The experimental results illustrate that the test data generator is effective.

Page 7: Character String Predicate Based Automatic Software Test Data Generation

An overview of related work

1. Predicate-based testing

Predicate testing is a common approach to software testing, which requires each predicate in the program under test to be checked.

There are many predicate testing strategies.

However, they require the predicates in the tested program to be numerical predicates.

Page 8: Character String Predicate Based Automatic Software Test Data Generation

An overview of related work

2. Test data generation

There are many automatic test data generation approaches. For example:

Symbolic execution-based test data generation

Random test data generation

Dynamic test data generation

Page 9: Character String Predicate Based Automatic Software Test Data Generation

An overview of related work

Some systems have been developed that use these testing techniques to generate test data of integer, real or float types.

However, they do not generate character string test data.

Page 10: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

The goal of test data generation is to find a program input on which a chosen program path will be traversed.

This problem can be reduced to a sequence of subgoals, where each subgoal is solved by performing function minimization using gradient descent.

Page 11: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

A character string predicate is a predicate that consists of at least one character string variable and one character string comparison function.

We focus on how to automatically generate test data for program paths that include character string predicates.

Page 12: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

As with a numerical predicate, we can construct a branch function $F(x)$ for a character string predicate that does not take the required branch, such that its value is positive for the initial input $x_0$.

For example, for the predicate strcmp(str1,str2) > 0, let $F(x) = str1 - str2$ if $str1 - str2$ is positive for the initial input $x_0$; otherwise let $F(x) = str2 - str1$.

The current values of str1 and str2 in this predicate can be obtained by program instrumentation.

Page 13: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

The program input is adjusted gradually until $F(x)$ becomes negative.

At that point the required input has been found; that is, the predicate takes the required branch.

A problem that we must resolve is how to compare two character strings and how to evaluate the branch function $F(x)$.

Page 14: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

So, we first define a function $\phi$:

$$\phi(str) = \sum_{i=0}^{L-1} str[i] \times w^{L-i-1}$$

where str is a character string, L is its length, $w^{L-i-1}$ is a positive weighting factor representing a weighted value imposed upon each character element of the string, and w is equal to 128.
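The mapping from a string to an integer can be sketched in C as follows. This is a minimal illustration, not the authors' generator: the names `str_value` and `W` are hypothetical, the string is assumed to hold 7-bit ASCII characters, and a 64-bit result limits the sketch to strings of at most 9 characters.

```c
#include <stddef.h>

#define W 128ULL  /* weighting base from the slide: w = 128 */

/* Computes sum_{i=0}^{L-1} str[i] * W^(L-i-1) for a NUL-terminated string.
   Assumption: 7-bit ASCII input; the result overflows 64 bits for
   strings longer than 9 characters. */
unsigned long long str_value(const char *str)
{
    unsigned long long v = 0;
    for (size_t i = 0; str[i] != '\0'; i++)
        v = v * W + (unsigned char)str[i];  /* Horner's rule */
    return v;
}
```

For instance, `str_value("AB")` evaluates to 65 × 128 + 66 = 8386, since 'A' is 65 and 'B' is 66 in ASCII.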

Page 15: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

Theorem: Suppose $S$ is a set of character strings and $N$ is the set of nonnegative integers. Let $\phi(str)$ be defined as above. Then $\phi(str)$ is a one-to-one function from $S$ to $N$.

By the theorem, a character string can be transformed into a unique nonnegative integer.

Page 16: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

The distance between two strings can be defined as below:

$$dis(str_1, str_2) = \left| \sum_{i=0}^{L_1-1} str_1[i] \times w^{L-i-1} - \sum_{i=0}^{L_2-1} str_2[i] \times w^{L-i-1} \right|$$

where $L_1$ and $L_2$ are the lengths of the strings $str_1$ and $str_2$, and $L = \max(L_1, L_2)$.

The distance $dis(str_1, str_2)$ determines a nonnegative integer, and can be used to evaluate the branch function $F(x)$ with regard to a character string predicate.
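A possible C sketch of this distance is shown below. It assumes the distance is the absolute difference of the two weighted sums, with both strings aligned to the common length L = max(L1, L2) so that missing trailing characters contribute 0. The names `padded_value` and `str_dis` are hypothetical, and the 64-bit arithmetic again restricts the sketch to 7-bit ASCII strings of at most 9 characters.

```c
#include <string.h>

#define W 128ULL  /* weighting base: w = 128 */

/* Weighted value of str aligned to length L:
   sum_{i=0}^{len-1} str[i] * W^(L-i-1), positions past the end count as 0. */
static unsigned long long padded_value(const char *str, size_t L)
{
    size_t len = strlen(str);
    unsigned long long v = 0;
    for (size_t i = 0; i < L; i++)
        v = v * W + (i < len ? (unsigned char)str[i] : 0);
    return v;
}

/* Nonnegative distance between two strings. */
unsigned long long str_dis(const char *str1, const char *str2)
{
    size_t L1 = strlen(str1), L2 = strlen(str2);
    size_t L = L1 > L2 ? L1 : L2;
    unsigned long long a = padded_value(str1, L);
    unsigned long long b = padded_value(str2, L);
    return a > b ? a - b : b - a;  /* absolute difference */
}
```

With this alignment, `str_dis("abc", "abd")` is 1 and `str_dis("ab", "a")` is 98, the code of the extra character 'b'.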

Page 17: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

It is easy to see, from the proof of the above theorem, that when $str_1[0] \ne str_2[0]$

$$|str_1[0] - str_2[0]| \times w^{L-1} > \sum_{i=1}^{L-1} \max(str_1[i], str_2[i]) \times w^{L-i-1}$$

We search for an appropriate adjustment direction for the 0th character of an input variable, and adjust that character by gradient descent until $F_0 < 0$.

As a result, we can find an input that makes the string predicate take the required branch.
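The adjustment of the 0th character might look like the following C sketch for the required branch strcmp(input, other) > 0. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`aligned_value`, `branch_fn`, `adjust_first_char`) are hypothetical, the search uses unit steps of ±1 after one exploratory probe in each direction, and strings are 7-bit ASCII with at most 9 characters.

```c
#include <string.h>

#define W 128LL  /* weighting base: w = 128 */

/* Weighted value of str aligned to length L (positions past the end count as 0). */
static long long aligned_value(const char *str, size_t L)
{
    size_t len = strlen(str);
    long long v = 0;
    for (size_t i = 0; i < L; i++)
        v = v * W + (i < len ? (unsigned char)str[i] : 0);
    return v;
}

/* Branch function for the required branch strcmp(input, other) > 0:
   it is negative exactly when input is lexicographically greater. */
static long long branch_fn(const char *input, const char *other)
{
    size_t L1 = strlen(input), L2 = strlen(other);
    size_t L = L1 > L2 ? L1 : L2;
    return aligned_value(other, L) - aligned_value(input, L);
}

/* Adjusts input[0] in place until the branch function becomes negative.
   Returns 1 on success, 0 if adjusting the 0th character cannot succeed. */
int adjust_first_char(char *input, const char *other)
{
    if (input[0] == '\0')
        return 0;                          /* no 0th character to adjust */
    long long f = branch_fn(input, other);
    if (f < 0)
        return 1;                          /* branch already taken */

    int dir = 0;
    for (int d = 1; d >= -1; d -= 2) {     /* exploratory moves: +1, then -1 */
        int c = (unsigned char)input[0] + d;
        if (c < 1 || c > 127)
            continue;
        char saved = input[0];
        input[0] = (char)c;
        long long g = branch_fn(input, other);
        input[0] = saved;
        if (g < f) { dir = d; break; }     /* this direction improves F */
    }
    if (dir == 0)
        return 0;

    while (f >= 0) {                       /* keep stepping in that direction */
        int c = (unsigned char)input[0] + dir;
        if (c < 1 || c > 127)
            return 0;
        input[0] = (char)c;
        f = branch_fn(input, other);
    }
    return 1;
}
```

Starting from the input "apple" with the other operand "melon", the search raises the 0th character step by step and stops at "mpple", the first value for which the branch function is negative.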

Page 18: Character String Predicate Based Automatic Software Test Data Generation

Test data generation based on character string predicate

For an equality (=) or non-equality (≠) predicate, we need to construct a branch function $F_i$ for every unequal character such that $F_i > 0$, where $i \in [0, L]$ and $L = \max(L_1, L_2)$.

Then, we search for an adjustment direction to improve each branch function until $F_i \le 0$.
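For an equality predicate, the per-character idea can be sketched in C as below. The sketch assumes the branch function for position i is the absolute difference of the two characters, with positions past the end of a string treated as 0; the function name `satisfy_equality` and the `cap` parameter are hypothetical. For brevity the direction is read from the sign of the difference rather than found by exploratory moves.

```c
#include <stdlib.h>
#include <string.h>

/* Adjusts input (a writable buffer of cap bytes) character by character
   until it equals target. Returns 1 on success, 0 if the buffer is too small.
   Assumption: F_i = |input[i] - target[i]|, missing characters count as 0. */
int satisfy_equality(char *input, size_t cap, const char *target)
{
    size_t L1 = strlen(input), L2 = strlen(target);
    size_t L = L1 > L2 ? L1 : L2;
    if (cap < L + 1)
        return 0;
    memset(input + L1, 0, cap - L1);       /* zero the tail of the buffer */

    for (size_t i = 0; i <= L; i++) {      /* i in [0, L], terminator included */
        int t = i < L2 ? (unsigned char)target[i] : 0;
        int fi = abs((unsigned char)input[i] - t);
        while (fi > 0) {                   /* step until F_i <= 0 */
            int dir = (unsigned char)input[i] < t ? 1 : -1;
            input[i] = (char)((unsigned char)input[i] + dir);
            fi = abs((unsigned char)input[i] - t);
        }
    }
    return strcmp(input, target) == 0;
}
```

Called on a 16-byte buffer holding "cat" with target "dog", the function leaves "dog" in the buffer; a longer input such as "catalog" is shortened because its extra characters are driven down to 0.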

Page 19: Character String Predicate Based Automatic Software Test Data Generation

Experimental results

    #include <stdio.h>
    #include <string.h>

    /* BUFSIZE, ceiling and result are defined outside this excerpt. */
    int max(int argc, char **argv)
    {
        argc--;
        argv++;
        if ((argc > 0) && ('-' == **argv)) {
            if (!strcmp(argv[0], "-ceiling")) {
                strncpy(ceiling, argv[1], BUFSIZE);
                argv++; argv++;
                argc--; argc--;
            } else {
                fprintf(stderr, "Illegal option %s.\n", argv[0]);
                return (2);
            }
        }
        if (argc == 0) {
            fprintf(stderr, "Max requires at least one argument.\n");
            return (2);
        }
        for (; argc > 0; argc--, argv++) {
            /* keep the lexicographically largest argument seen so far */
            if (strcmp(argv[0], result) > 0)
                strncpy(result, argv[0], BUFSIZE);
        }
        if (strcmp(ceiling, result) <= 0)
            printf("\n max:%s", ceiling);
        else
            printf("\n max:%s", result);
        return (0);
    }

The specification:

Max prints the lexicographic maximum of its command-line arguments.

There is one option: -ceiling

This provides a ceiling: if the maximum would be larger than the specified ceiling, the ceiling is printed as the maximum.

Page 20: Character String Predicate Based Automatic Software Test Data Generation

Experimental results

The Max program with record() instrumentation calls inserted before each predicate:

    record(argc, 0, '>', "&&");
    record('-', **argv, '=');
    if ((argc > 0) && ('-' == **argv)) {
        record(argv[0], "-ceiling", '!');
        if (!strcmp(argv[0], "-ceiling")) …;
    }
    record(argc, 0, '=', "");
    if (argc == 0) …;
    record(argc, 0, '>', "");
    for (; argc > 0; argc--, argv++) {
        record(argv[0], result, '>', "");
        if (strcmp(argv[0], result) > 0) …;
        record(argc, 0, '>', "");
    }
    record(ceiling, result, '-', "");
    if (strcmp(ceiling, result) <= 0) …;

Page 21: Character String Predicate Based Automatic Software Test Data Generation

Experimental results

Considering executions in which the for loop runs 0, 1 and 2 times, there are 31 paths in the Max program.

We design 50 program inputs at random, which are used as the initial inputs to the test data generator.

As a result, 16 test inputs are generated by the test data generator.

Page 22: Character String Predicate Based Automatic Software Test Data Generation

Experimental results

We measure the coverage of generated test data using the ATAC coverage testing tool.

[Figure: coverage (vertical axis, scale 0 to 120) by path (horizontal axis), with series for block, decision, C-use and P-use coverage.]

Page 23: Character String Predicate Based Automatic Software Test Data Generation

Experimental results

We compare the number of branch function evaluations needed by the gradient descent, the gradual descent and the random-number test data generators under the same coverage.

[Figure: evaluation number (vertical axis, scale 0 to 2000) by path (horizontal axis, paths 1 to 16, plus an average), with series for gradient descent, gradual descent and random-number.]

The gradient descent test data generator is more economical than the gradual descent and the random generator.

Page 24: Character String Predicate Based Automatic Software Test Data Generation

Conclusion

To our knowledge, this is the first automatic test data generation approach

based on character string predicates.

The preliminary experimental results show that the methodology is effective.