ISSTA 2010 Presentation

© 2010 IBM Corporation


Directed Test Generation for Effective Fault Localization

Shay Artzi, Julian Dolby, Frank Tip, Marco Pistoia

IBM T.J. Watson Research Center

ISSTA – July 14, 2010 – Trento, Italy

IBM Research


Overview

Automatically generate tests for Web applications

Synthesize test suite to maximize code coverage

Presented at [ISSTA'08]

Fault localization finds bugs given a set of tests

Statistical techniques based on executed statements

Fault localization can work over generated tests

Apply localization on the set of tests generated for coverage

Presented at [ICSE'10]

Can we tune test generation to aid fault localization?



Outline

Example

Enhanced Fault Localization

Directed Test Generation for Fault Localization

Evaluation

Related Work

Conclusion



Running PHP Example

 1 if (isset($_REQUEST['query'])) {
 2
 3   switch ($_REQUEST['query']) {
 4     case "model":
 5       $kf = "model_id"; break;
 6     case "make":
 7       $kf = "make_id"; break;
 8   }
 9   $v = $_REQUEST['key'];
10
11   $sql = "SELECT * FROM VEHICLES WHERE " . $kf . " = '" . $v . "'";
12   $result = mysql_query($sql);
13   $n = mysql_numrows($result);
14
15 } else if (isset($_REQUEST['update'])) { … }



Running Example – Success


Input: query → 'make', key → 'Ford'

'query' is set in request

$kf set to 'make_id'

SELECT * FROM VEHICLES WHERE make_id = 'Ford';

returns 2 result rows

$n is assigned 2



Running Example – Failure


'query' is set in request

switch falls through

$kf is never assigned

SELECT * FROM VEHICLES WHERE = 'Ford';

mysql_query returns null due to invalid SQL

mysql_numrows(): supplied argument is not a valid MySQL result resource

Input: query → 'company', key → 'Ford'



Fault Localization

Given a failure, find responsible bug

Statistical techniques on set of passing, failing tests

Correlate executed statements with manifestations of bug

Specific formula: Ochiai [Abreu et al, ISDC’06]

\[
\text{suspiciousness}(j) \equiv \frac{\#\,\text{failing tests executing statement } j}{\sqrt{\#\,\text{failing tests} \times \#\,\text{(failing + passing) tests executing } j}}
\]



Fault Localization Example

Apply Ochiai to the two example executions

Correctly find two blameless statements

But cannot isolate fault precisely

statement        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
passing          1       1           1   1       1       1   1   1
failing          1       1                       1       1   1   1
suspiciousness  .7      .7           0   0      .7      .7  .7  .7
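
The scores for the two example runs can be reproduced with a few lines of Python. This is only a sketch: the function name `ochiai` and the coverage sets are assumptions, with the sets read off the line numbers of the running example (blank lines and case labels that are not taken are left out).

```python
from math import sqrt

def ochiai(failing_cov, passing_cov, stmt):
    """Ochiai suspiciousness of one statement.

    failing_cov / passing_cov: one set of executed statements per test.
    """
    total_failing = len(failing_cov)
    fail_exec = sum(1 for cov in failing_cov if stmt in cov)
    pass_exec = sum(1 for cov in passing_cov if stmt in cov)
    denom = sqrt(total_failing * (fail_exec + pass_exec))
    return fail_exec / denom if denom else 0.0

# Assumed coverage of the two runs of the running example.
passing = [{1, 3, 6, 7, 9, 11, 12, 13}]   # query=make, key=Ford
failing = [{1, 3, 9, 11, 12, 13}]         # query=company, key=Ford

statements = sorted(passing[0] | failing[0])
scores = {s: round(ochiai(failing, passing, s), 1) for s in statements}
print(scores)
```

Statements 6 and 7 are executed only by the passing run and score 0; every statement shared by both runs gets the same score, which is why the fault cannot be isolated from these two tests alone.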



Enhanced Fault Localization

Some faults caused by absence of code

Not captured by statistics over statements

E.g. not handling all values returned by a function (new)

Extend “statement” for call to include abstraction of result

Refine blame to kind of result

Abstract value, e.g. 0, non-zero, null, non-null, etc

E.g. missing 'else' or switch case (ICSE 2010)

Extend conditional and switch with the outcome

Refine blame to specific control flow outcome

if statement s becomes (s, true) or (s, false)

switch statement s becomes (s, case number taken)



Enhanced Fault Localization Example

Enhanced technique isolates key issues

Fall through of switch leaves the key field unset

Return of null result on SQL error not checked

statement       1  (3,2)  (3,0)  6  7  9  11  (12,obj)  (12,null)  13
passing         1      1         1  1  1   1         1              1
failing         1             1        1   1                     1   1
suspiciousness .7      0      1  0  0 .7  .7         0          1  .7
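
A hedged Python sketch of the enhancement: raw trace events are turned into "enhanced statements" before the same Ochiai formula is applied. The event encoding, the helper names (`abstract`, `enhance`, `ochiai`), and the two-valued result abstraction are all assumptions made for illustration; the slides mention further abstract values such as zero and non-zero.

```python
from math import sqrt

def abstract(value):
    """Coarse abstraction of a call result: only null vs. object here."""
    return "null" if value is None else "obj"

def enhance(trace):
    """Map raw events to enhanced statements.

    Events: ("stmt", line), ("branch", line, outcome), ("call", line, result).
    """
    entities = set()
    for kind, line, *rest in trace:
        if kind == "stmt":
            entities.add(line)
        elif kind == "branch":
            entities.add((line, rest[0]))            # e.g. (3, 0): no case taken
        else:
            entities.add((line, abstract(rest[0])))  # e.g. (12, "null")
    return entities

def ochiai(failing, passing, entity):
    fail_exec = sum(1 for run in failing if entity in run)
    pass_exec = sum(1 for run in passing if entity in run)
    denom = sqrt(len(failing) * (fail_exec + pass_exec))
    return fail_exec / denom if denom else 0.0

# Assumed traces of the two example runs.
passing = [enhance([("stmt", 1), ("branch", 3, 2), ("stmt", 6), ("stmt", 7),
                    ("stmt", 9), ("stmt", 11), ("call", 12, object()), ("stmt", 13)])]
failing = [enhance([("stmt", 1), ("branch", 3, 0), ("stmt", 9),
                    ("stmt", 11), ("call", 12, None), ("stmt", 13)])]

scores = {e: ochiai(failing, passing, e) for e in passing[0] | failing[0]}
```

With these traces, `(3, 0)` and `(12, "null")` occur only in the failing run and score 1, while the entities seen only in the passing run score 0.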



Combined Concrete And Symbolic Execution

Executions that explore different paths

For new executions, find input that forces different path

Path constraint captured during execution

Based on conditionals true for the given execution

E.g. failing example run has path constraint:

isset query ∧ query ≠ 'make' ∧ query ≠ 'model' ∧ isset key

Negate portion of constraint, solve for new input

Multiple choices for what to negate

Direct choices to aid fault localization
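
The "multiple choices for what to negate" can be enumerated with a short Python sketch. It follows the usual concolic scheme of keeping a prefix of the path constraint and negating the next conjunct; conjuncts are plain strings here, and the function names are hypothetical. Solving each candidate for a concrete input would need a constraint solver, which is left out.

```python
def negate(cond):
    """Toggle a leading '¬' on a textual condition."""
    return cond[1:] if cond.startswith("¬") else "¬" + cond

def candidate_constraints(path_constraint):
    """For each position i, keep the conjuncts before i and negate conjunct i.

    Candidates are returned longest first, i.e. closest to the original run.
    """
    candidates = []
    for i in reversed(range(len(path_constraint))):
        candidates.append(path_constraint[:i] + [negate(path_constraint[i])])
    return candidates

failing_pc = ["isset query", "query ≠ 'make'", "query ≠ 'model'", "isset key"]
candidates = candidate_constraints(failing_pc)
for c in candidates:
    print(" ∧ ".join(c))
```

Each printed line is one candidate constraint; a directed strategy then decides which candidate to solve and execute first.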



Directed Test Generation

If failure occurs, how to localize the fault?

Assume no relevant test suite is available

How best to generate tests for fault-localization?

What kind of tests work best given a known failure?

Intuition: “similar” tests to localize fault

Minimize differences to narrow possible bug causes

Investigate “directed” test generation methods

Based on similarity metrics



Directed Test Generation Example

Original failing test:

isset query ∧ query ≠ 'make' ∧ query ≠ 'model' ∧ isset key

Negate a tail:

isset query ∧ query ≠ 'make' ∧ query ≠ 'model' ∧ ¬isset key

isset query ∧ ¬(query ≠ 'make') ∧ query ≠ 'model'

¬isset query

Each new input explores a different execution

Intuition: most 'similar' execution best to focus on bug

Note second path will yield successful run



Example Generated Executions

[Figures showing the generated executions were not recoverable; the only surviving label is the path constraint ¬isset query.]


Similarity Metrics

Intuitively, find similar but different executions

Ideally find a similar execution without the bug

Similar buggy executions also helpful to localize

Two heuristics:

Path constraint similarity (PCS)

Number of identical path constraint elements

Input similarity (IS)

Number of identical input values
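
Both heuristics fit in a few lines of Python. This is a sketch under assumptions: path constraints are lists of textual conjuncts compared position by position, inputs are dictionaries of request parameters, and the function names are made up for illustration. The slides do not spell out the exact matching rule, so the position-wise comparison is only one plausible reading.

```python
def path_constraint_similarity(pc_a, pc_b):
    """PCS: number of positions at which both constraints have the same element."""
    return sum(1 for a, b in zip(pc_a, pc_b) if a == b)

def input_similarity(in_a, in_b):
    """IS: number of parameters that are set to the same value in both inputs."""
    return sum(1 for k, v in in_a.items() if k in in_b and in_b[k] == v)

# Original failing run of the running example.
failing_pc = ["isset query", "query ≠ 'make'", "query ≠ 'model'", "isset key"]
failing_in = {"query": "company", "key": "Ford"}

# Two generated runs (the inputs are assumed solutions of the constraints).
no_key_pc, no_key_in = failing_pc[:3] + ["¬isset key"], {"query": "company"}
no_query_pc, no_query_in = ["¬isset query"], {}

pcs = [path_constraint_similarity(failing_pc, pc) for pc in (no_key_pc, no_query_pc)]
sim = [input_similarity(failing_in, i) for i in (no_key_in, no_query_in)]
```

Here the run without `key` shares three constraint elements and one input value with the failing run, while the run without `query` shares none, so a directed strategy would prefer the former.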



Directed Test Generation Example

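Original failing test:

isset query ∧ query ≠ 'make' ∧ query ≠ 'model' ∧ isset key

Generated subsequent executions:

isset query ∧ query ≠ 'make' ∧ query ≠ 'model' ∧ ¬isset key (PCS: 3, IS: 1)

isset query ∧ ¬(query ≠ 'make') ∧ query ≠ 'model' (PCS: 1, IS: 1)

¬isset query (PCS: 0, IS: 0)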



Evaluation of Directed Test Generation

35 manually localized execution faults in four programs

Localize each fault

100 tests generated using:

Base: tests generated by original run (maximize branch coverage)

Starting generation from the failing test

Coverage: tests generated using the original Apollo

PCS: Directed test generation using path-constraint similarity

IS: Directed test generation using input similarity

Measure effort needed to find the bug, assuming statements are examined in order of decreasing suspiciousness

Measure % of well-localized faults
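
The effort measure can be sketched in Python as the fraction of statements a programmer inspects before reaching the fault when walking down the suspiciousness ranking. The pessimistic tie handling and the function name are assumptions, as is the choice of which entity counts as "the fault" in the running example; the threshold for "well-localized" is not given on the slide and is not modeled here.

```python
def inspection_effort(scores, faulty):
    """Fraction of entities examined up to and including `faulty`.

    Entities are visited in order of decreasing suspiciousness; ties are
    counted pessimistically (all equally ranked entities are examined).
    """
    threshold = scores[faulty]
    examined = sum(1 for value in scores.values() if value >= threshold)
    return examined / len(scores)

# Suspiciousness values from the two example tables.
basic = {1: .7, 3: .7, 6: 0, 7: 0, 9: .7, 11: .7, 12: .7, 13: .7}
enhanced = {1: .7, (3, 2): 0, (3, 0): 1, 6: 0, 7: 0, 9: .7, 11: .7,
            (12, "obj"): 0, (12, "null"): 1, 13: .7}

effort_basic = inspection_effort(basic, 3)            # switch at line 3
effort_enhanced = inspection_effort(enhanced, (3, 0)) # switch with no case taken
```

On the running example the basic ranking forces inspection of 6 of 8 statements, whereas the enhanced ranking needs 2 of 10 entities.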



Directed Test Generation Results



Related Work

Foundations

Localization: Tarantula [Jones et al., 02], Ochiai [Abreu et al., 06]

Concolic testing [Sen et al., 05], DART [Godefroid et al., 05]

Other directed dynamic, e.g. [Burnim, Sen, 08]

Aims to improve coverage, not fault localization

Delta Debugging [Zeller, 99]

Focuses on minimizing inputs for single executions

Other fault localization, e.g. [Wang, Roychoudhury, 05]

Focus on localization for a single failing test input



Conclusions

Directed generation strategy can aid localization

Generate small, effective test suite

Reduce time and number of tests by more than 85%

Enhancements needed to fault localization
