
1

Today

More on random testing + symbolic constraint solving (“concolic” testing)
• Using summaries to explore fewer paths (SMART)
  • While preserving level of branch coverage
• Whitebox fuzz testing (SAGE)

Start on debugging
• Quick start: Zeller’s delta-debugging algorithm and tools

2

SMART: Using Summaries in Testing

Idea: complete coverage of all paths is too expensive
• Fix: limit the number of paths, but preserve the level of branch coverage achieved by trying all paths
• How? Use function boundaries – make the exploration compositional

The “Compositional Dynamic Test Generation” paper (Godefroid) is a good look at this kind of testing in general

3

The Compositional Approach

int G = 0;

void foo (int x, int y) {
    if (x < y) G++;
    if (x == 10) G++;
}

void bar (int x, int y) {
    if (x < 50) G++;
    if (y > 8) G++;
}

/* x and y are the test inputs */
int main () {
    foo(x, y);
    bar(x, y);
    if (x + y == 40) G++;
}

How many paths through this program?

Each of the five branches doubles the count: 2, 4, 8, 16, 32.

4




Traditional DART/CUTE will have to execute the program up to 32 times, exploring up to 32 paths (some combinations of branch outcomes, such as x == 10 with x >= 50, are infeasible).

One idea: treat each function independently. Don’t worry about paths that cross function boundaries, but still solve for all paths through each function.

When we reach a return from a function, use the basic algorithm to find all paths through that function. Afterwards, “don’t care” what path we take through it.

5




Now how many paths?

• foo: 2 branches, 4 paths
• bar: 2 branches, 4 paths
• main: 1 branch, 2 paths

Total = 4 + 4 + 2 = 10 (vs. 32)

6




Have we lost anything?

Not branch coverage – assuming complete exploration is possible, and all constraints are linear, we preserve level of branch coverage.

But we could miss bugs…

7


int a[4];

/* G, foo, and bar as before */

int main () {
    foo(x, y);
    bar(x, y);
    if (x + y == 40) G++;
    a[G] = 3;
}

Could handle this by adding an explicit “branch” for the array bounds check, however.

8



When solving the constraints, SMART makes use of “procedure summaries” to avoid re-analyzing functions in full detail – but these are a static analysis topic more than a testing topic.

9

Whitebox Fuzz Testing

SAGE
• Now we’ll take a quick run through Godefroid’s invited talk at the 2007 Workshop on Random Testing

10

Debugging

11

Debugging, in the Trenches

Rasala put his hands on his desk and buried his face in them.

It was just another routine day down at debugging headquarters.

In the back of Veres’s mind still lies a small suspicion that the problem might after all be noise. And now – much to Guyer’s delight, when he finds out later on – it is Veres himself who disconnects the I-cache. Then he runs the program past the point of failure, and everything works. He puts the I-cache back in and once again Gollum fails. This doesn’t prove the IP is to blame, but it does tend to eliminate noise as a suspect, once and for all. . .

from THE CASE OF THE MISSING NAND GATE (Chapter 10 of Kidder’s The Soul of a New Machine)

12

(The Soul of a New Machine) Kidder’s book tells the story of the development of a minicomputer at the end of the 1970s.

The book’s a classic – won the American Book Award for non-fiction. Anyone who cares how computers are made (or how people work) should read it.

Chapter 10 is a classic story of debugging a hardware problem. The ideas apply just as well to software, and this is the best description of heavy-duty debugging I’ve ever seen.

13

Debugging

Debugging is really hard -- even with a good failing test case in hand

One of the most time-consuming tasks in software development [Ball, Eick]

Locating the fault is the most time-consuming part of debugging [Vesey]

14

Debugging

Takes as much as 50% of development time on some projects

Arguably the most scientific part of “computer science” practice
• Even though it’s usually done in a totally ad hoc, haphazard way!

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” - Brian Kernighan

15

Debugging and Testing

What are test cases for, anyway?
• Often: so we can locate and fix a fault
• Or: so we can understand how serious the failure is, and triage / “flight rule” it away
  • If we have many bugs, and some may not be important enough to merit resources
  • Or if there is a reason we can’t change the code and have to work around the problem

In either case, we have a “debugging” task at hand – must at least understand the failure, even if only to triage

16

Scientific Debugging

Test cases (ones that fail and ones that succeed) can be the experiments we perform to verify our hypotheses
• The failing test case informs us that there is a phenomenon to explain (apple on the head)
• Generate (or examine) more test cases to find out more about what is going on in the program

17

Scientific Debugging

[Figure: the scientific debugging loop. From the code, the failing test, and more tests, form a Hypothesis; derive a Prediction; run an Experiment; record an Observation + Conclusion. If the hypothesis is supported, refine it; if it is rejected, create a new hypothesis. The loop ends with a Diagnosis.]

18

Testing for Debugging

Several ways to use test cases in debugging:
• Test case minimization (today)
  • Shrink the test case so we don’t have to look at lots of irrelevant or redundant operations
• Fault localization
  • Give suggestions about where the fault may be (based on test case executions)
• Error explanation
  • Give a “story” of causality
  • (A causes B; B causes C; C causes failure)

These attempt to automate part of scientific debugging.
