Systematic Software Testing Using Test Abstractions
Slides by Darko Marinov
Presented at RIO Summer School, Rio Cuarto, Argentina, February 2011


Page 1: Systematic Software Testing Using Test Abstractions

Systematic Software Testing Using Test Abstractions

Slides by Darko Marinov

Presented at RIO Summer School, Rio Cuarto, Argentina

February 2011

Page 2: Systematic Software Testing Using Test Abstractions

Examples of Structurally Complex Data

<library>
  <book year="2009">
    <title>T1</title>
    <author>A1</author>
  </book>
  <book year="2010">
    <title>T2</title>
    <author>A2</author>
  </book>
</library>

/library/book[@year<2010]/title

XML document

web sites (figure: a link graph over sites s1 to s5)

red-black tree (figure: a small tree with keys 0 to 3)

Java program

class A { int f; }

class B extends A {
  void m() { super.f = 0; }
}

Page 3: Systematic Software Testing Using Test Abstractions

Running Example: Inheritance Graphs

Java program

interface I { void m(); }

interface J { void m(); }

class A implements I {
  void m() {}
}

class B extends A implements I, J {
  void m() {}
}


Inheritance graph (figure: nodes I, J, A, B with their supertype edges)

Properties

1. DAG

2. ValidJava

Page 4: Systematic Software Testing Using Test Abstractions

Testing Setup

Examples of code under test:
Compiler or refactoring engines, input: programs
Abstract data type, input/output: data structure
Web traversal code, input: web topology
XML processing code, input/output: XML document

Figure: test generation produces inputs (small graphs), the code under test maps inputs to outputs, and a test oracle checks each output as pass or fail.

Page 5: Systematic Software Testing Using Test Abstractions

Test Generation

Figure: three ways to obtain test inputs:
Fully automatic: a tool analyzes the code and produces the inputs.
Manual: the tester writes the inputs by hand.
Semi-automated: the tester writes an input set description and a tool generates the inputs.

Page 6: Systematic Software Testing Using Test Abstractions

Test Abstractions

Each test abstraction describes a set of (structurally complex) test inputs.
Example: inheritance graphs with up to N nodes.

Workflow:
The tester manually writes test abstractions.
A tool automatically generates test inputs.

Benefits:
No need to manually write a large number of test inputs.
Useful for test generation, maintenance, and reuse.

Page 7: Systematic Software Testing Using Test Abstractions

Key Questions for Test Abstractions

Figure: a tool takes a test abstraction and produces the test inputs.

1. Which test inputs to generate?
2. What language to use for test abstractions?
3. How to generate test inputs from abstractions?
4. How to check test outputs?
5. How to map failures to faults?

Page 8: Systematic Software Testing Using Test Abstractions

Which Inputs to Generate?

Bounded-exhaustive testing:
Generate all inputs within given (small) bounds.
Rationale: small-scope hypothesis [Jackson & Damon ISSTA’96]

Found bugs in both academia and industry:
Java compilers, model checkers… [Gligoric et al. ICSE’10]

Refactoring engines in Eclipse/NetBeans [Daniel et al. FSE’07]

Web traversal code from Google [Misailovic et al. FSE’07]

XPath compiler at Microsoft [Stobie ENTCS’05]

Fault-tree analyzer for NASA [Sullivan et al. ISSTA’04]

Constraint solver, network protocol [Khurshid & Marinov J-ASE’04]

Page 9: Systematic Software Testing Using Test Abstractions

How to Describe/Generate Inputs?

Filtering test abstractions [SOFTMC’01, ASE’01, FME’02, ISSTA’02, OOPSLA’02, SAT’03, MIT-TR’03, J-ASE’04, SAT’05, LDTA’06, ALLOY’06, ICSE-DEMO’07, STEP’07, FSE’07, ISSTA’08, ICST’09, CSTVA’10]

Figure: filters and bounds go into a search that produces the test inputs.

Testers describe what properties test inputs satisfy

Tool: search

Generating test abstractions [FSE’07, FASE’09]

Figure: generators and bounds are executed to produce the test inputs.

Testers describe how to generate test inputs

Tool: execution

Page 10: Systematic Software Testing Using Test Abstractions

Previously Explored Separately

                         filters   generators
Inheritance graph:
  DAG                                  X
  ValidJava                 X
Other Java properties:
  property 3                X
  …                         …          …
  property N                           X

Some properties easier/harder as filters/generators.
Key challenge for combining: search vs. execution.

Page 11: Systematic Software Testing Using Test Abstractions

UDITA: Combined Both [ICSE’10]

                         filters   generators
Inheritance graph:
  DAG                                  X
  ValidJava                 X
Other Java properties:
  property 3                X
  …                         …          …
  property N                           X

Extends Java with non-deterministic choices.
Found bugs in Eclipse/NetBeans/javac/JPF/UDITA.

Page 12: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 13: Systematic Software Testing Using Test Abstractions

Example: Inheritance Graphs

class IG {
  Node[] nodes;
  int size;

  static class Node {
    Node[] supertypes;
    boolean isClass;
  }
}

Properties:
DAG: the nodes in the graph should form no directed cycle.
ValidJava: each class has at most one superclass, and all supertypes of interfaces are interfaces.
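As a concrete illustration, a ValidJava filter for this IG model could be written along the following lines; this is a minimal sketch based only on the two properties stated above, not the code from the slides.

boolean isValidJava(IG ig) {
  if (ig.nodes == null) return false;
  for (IG.Node n : ig.nodes) {
    if (n == null || n.supertypes == null) return false;
    int superclasses = 0;
    for (IG.Node s : n.supertypes) {
      if (s == null) return false;
      if (s.isClass) {
        if (!n.isClass) return false;   // a supertype of an interface must be an interface
        superclasses++;
      }
    }
    if (superclasses > 1) return false; // each class has at most one superclass
  }
  return true;
}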

Page 14: Systematic Software Testing Using Test Abstractions

Example Valid Inheritance Graphs

Figure: five valid inheritance graphs, G1 through G5, over interfaces I, J and classes A, B, C, D.

Page 15: Systematic Software Testing Using Test Abstractions

Example Invalid Inheritance Graphs

Figure: five invalid inheritance graphs, G1 through G5, each violating a property: a cycle, a class that implements a class, a class that extends two classes, an interface that extends a class, and a class that extends an interface.

Page 16: Systematic Software Testing Using Test Abstractions

(In)valid Inputs

Valid inputs = desired inputs:
Inputs on which the tester wants to test the code.
The code may produce a regular output or an exception.

E.g., for a Java compiler:
Interesting (il)legal Java programs; no precondition, the input can be any text.

E.g., for abstract data types:
Encapsulated data structure representations; inputs need to satisfy invariants.

Invalid inputs = undesired inputs
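For the abstract-data-type case, such an invariant is typically checked by a repOk-style filter over the representation. The following is a minimal sketch for a hypothetical sorted singly linked list (not an example from the slides):

class SortedList {
  static class Entry { int value; Entry next; }
  Entry head;
  int size;

  boolean repOk() {
    java.util.Set<Entry> seen = new java.util.HashSet<Entry>();
    int count = 0;
    for (Entry e = head; e != null; e = e.next) {
      if (!seen.add(e)) return false;                             // cycle in the list
      if (e.next != null && e.value > e.next.value) return false; // elements not sorted
      count++;
    }
    return count == size;                                         // size field consistent
  }
}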

Page 17: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 18: Systematic Software Testing Using Test Abstractions

Filtering Test Abstractions

Tool requires:
Filters: encode what properties inputs satisfy
Bounds

Tool provides:
All inputs within bounds that satisfy the properties

Figure: filters and bounds go into a search that produces the test inputs.

Page 19: Systematic Software Testing Using Test Abstractions

Language for Filters

Each filter takes an input that can be valid or invalid and returns a boolean indicating validity.

Experience from academia and industry showed that using a new language makes adoption hard.

Write filters in a standard implementation language (Java, C#, C++…).
Advantages:
Familiar language
Existing development tools
Filters may already be present in the code

Challenge: generate inputs from imperative filters

Page 20: Systematic Software Testing Using Test Abstractions

Example Filter: DAG Property

boolean isDAG(IG ig) {
  Set<Node> visited = new HashSet<Node>();
  Set<Node> path = new HashSet<Node>();
  if (ig.nodes == null || ig.size != ig.nodes.length) return false;
  for (Node n : ig.nodes)
    if (!visited.contains(n))
      if (!isAcyclic(n, path, visited)) return false;
  return true;
}

boolean isAcyclic(Node node, Set<Node> path, Set<Node> visited) {
  if (path.contains(node)) return false;            // node already on the current path: cycle
  path.add(node);
  visited.add(node);
  for (int i = 0; i < node.supertypes.length; i++) {
    Node s = node.supertypes[i];
    for (int j = 0; j < i; j++)
      if (s == node.supertypes[j]) return false;    // duplicate supertype
    if (!isAcyclic(s, path, visited)) return false;
  }
  path.remove(node);
  return true;
}

Page 21: Systematic Software Testing Using Test Abstractions

Input Space

All possible object graphs with an IG root (obeying type declarations).
Natural input spaces are relatively easy to enumerate.
Sparse: # valid test inputs << # all object graphs.

Figure: many candidate object graphs over nodes 0 to 3; only a few are valid inheritance graphs.

Page 22: Systematic Software Testing Using Test Abstractions

Bounded-Exhaustive Generation

Finite (small) bounds for the input space:
Number of objects
Values of fields

Generate all valid inputs within the given bounds:
Eliminates systematic bias.
Finds all bugs detectable within the bounds.

Avoid equivalent (isomorphic) inputs (e.g., two graphs that differ only by a permutation of their Node objects):
Reduces the number of inputs.
Preserves the capability to find all bugs.

Page 23: Systematic Software Testing Using Test Abstractions

Example Bounds

Specify number of objects for each class

1 object for IG: { IG0 }

3 objects for Node: { N0, N1, N2 }

Specify a set of values for each field:
For size: { 0, 1, 2, 3 }

For supertypes[i]: { null, N0, N1, N2 }

For isClass: { false, true }

Previously written with a special library.
In UDITA: non-deterministic calls for initialization.

Page 24: Systematic Software Testing Using Test Abstractions

Bounds Encoded in UDITA

IG initialize(int N) {
  IG ig = new IG();
  ig.size = N;
  ObjectPool<Node> pool = new ObjectPool<Node>(N);
  ig.nodes = new Node[N];
  for (int i = 0; i < N; i++)
    ig.nodes[i] = pool.getNew();        // each node is a fresh, distinct pool object
  for (Node n : ig.nodes) {
    int num = getInt(0, N - 1);         // non-deterministic number of supertypes
    n.supertypes = new Node[num];
    for (int j = 0; j < num; j++)
      n.supertypes[j] = pool.getAny();  // any pool object
    n.isClass = getBoolean();           // non-deterministic class/interface flag
  }
  return ig;
}

Java with a few extensions for non-determinism

static void mainFilt(int N) {
  IG ig = initialize(N);
  assume(isDAG(ig));        // filter: discard executions where the graph is not a DAG
  assume(isValidJava(ig));  // filter: discard executions that violate ValidJava
  println(ig);
}

Page 25: Systematic Software Testing Using Test Abstractions


Example Input Space

1 IG object, 3 Node objects: total 14 fields ("nodes" and array length not shown)

Figure: the input space, one column of possible values per field:
size: { 0, 1, 2, 3 }
each supertypes[i]: { null, N0, N1, N2 }
each isClass: { false, true }

4 * (4 * 4 * 2 * 3)^3 ≈ 3.5 million > 2^19 inputs, but < 2^5 valid

Page 26: Systematic Software Testing Using Test Abstractions

Input Generation: Search

Naïve “solution”:
Enumerate the entire (finite) input space, run the filter on each input, and output the input if the filter returns true (sketched below).
Infeasible for sparse input spaces (#valid << #total).
Must reason about the behavior of filters.

Our previous work proposed a solution:
Dynamically monitors the execution of filters.
Prunes large parts of the input space for each execution.
Avoids isomorphic inputs.
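A minimal sketch of that naive baseline (not the pruning search), assuming a hypothetical allCandidates enumerator over every object graph within the bounds:

static void naiveGenerate(Iterable<IG> allCandidates, java.util.function.Predicate<IG> filter) {
  for (IG candidate : allCandidates) {
    if (filter.test(candidate))        // run the filter on each candidate
      System.out.println(candidate);   // emit only the inputs the filter accepts
  }
}
// e.g., naiveGenerate(allCandidates, ig -> isDAG(ig) && isValidJava(ig));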

Page 27: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 28: Systematic Software Testing Using Test Abstractions

Generating Test Abstractions

Tool requires:
Generators: encode how to generate inputs
Bounds

Tool provides:
All inputs within bounds (that satisfy the properties)

Figure: generators and bounds are executed to produce the test inputs.

Page 29: Systematic Software Testing Using Test Abstractions

Example Generator: DAG Property

void mainGen(int N) {
  IG ig = initialize(N);
  generateDAG(ig);         // generator for the DAG property
  generateValidJava(ig);   // generator for the ValidJava property
  println(ig);
}

void generateDAG(IG ig) {
  for (int i = 0; i < ig.nodes.length; i++) {
    int num = getInt(0, i);                     // node i gets at most i supertypes
    ig.nodes[i].supertypes = new Node[num];
    for (int j = 0, k = -1; j < num; j++) {
      k = getInt(k + 1, i - (num - j));         // strictly increasing indices below i: no cycles, no duplicates
      ig.nodes[i].supertypes[j] = ig.nodes[k];
    }
  }
}

Figure: an example DAG over nodes N0, N1, N2.

Page 30: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 31: Systematic Software Testing Using Test Abstractions

Filtering vs. Generating: DAG Property

boolean isDAG(IG ig) {
  Set<Node> visited = new HashSet<Node>();
  Set<Node> path = new HashSet<Node>();
  if (ig.nodes == null || ig.size != ig.nodes.length) return false;
  for (Node n : ig.nodes)
    if (!visited.contains(n))
      if (!isAcyclic(n, path, visited)) return false;
  return true;
}

boolean isAcyclic(Node node, Set<Node> path, Set<Node> visited) {
  if (path.contains(node)) return false;
  path.add(node);
  visited.add(node);
  for (int i = 0; i < node.supertypes.length; i++) {
    Node s = node.supertypes[i];
    for (int j = 0; j < i; j++)
      if (s == node.supertypes[j]) return false;
    if (!isAcyclic(s, path, visited)) return false;
  }
  path.remove(node);
  return true;
}

void generateDAG(IG ig) {
  for (int i = 0; i < ig.nodes.length; i++) {
    int num = getInt(0, i);
    ig.nodes[i].supertypes = new Node[num];
    for (int j = 0, k = -1; j < num; j++) {
      k = getInt(k + 1, i - (num - j));
      ig.nodes[i].supertypes[j] = ig.nodes[k];
    }
  }
}

Filtering arguably harder than generating for DAG

20+ vs. 10 LOC

However, the opposite holds for the ValidJava property.

Page 32: Systematic Software Testing Using Test Abstractions

UDITA Requirement: Allow Mixing

              filters   generators
DAG                         X
ValidJava        X

Figure: filters, generators, and bounds go into UDITA, which produces the test inputs.
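A minimal sketch of such mixing in a single UDITA driver, reusing the initialize bounds from Page 24, the generateDAG generator from Page 29, and the isValidJava filter via assume (the method name mainMixed is hypothetical):

static void mainMixed(int N) {
  IG ig = initialize(N);       // bounds via getInt/getBoolean/ObjectPool
  generateDAG(ig);             // generator handles the DAG property
  assume(isValidJava(ig));     // filter handles the ValidJava property
  println(ig);
}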

Page 33: Systematic Software Testing Using Test Abstractions

Other Requirements for UDITA

Language for test abstractions:
Ease of use: naturally encode properties
Expressiveness: encode a wide range of properties
Compositionality: from small to large test abstractions
Familiarity: build on a popular language

Algorithms and tools for efficient generation of concrete tests from test abstractions

Key challenge: search (filtering) and execution (generating) are different paradigms

Page 34: Systematic Software Testing Using Test Abstractions

UDITA Solution

Based on a popular language (Java) extended with non-deterministic choices.
The base language allows writing filters (for search/filtering) and generators (for execution/generating).
Non-deterministic choices for bounds and enumeration:
Non-determinism for primitive values: getInt/getBoolean
A new language abstraction for objects: ObjectPool

class ObjectPool<T> {
  public ObjectPool(int size, boolean includeNull) { ... }
  public T getAny() { ... }
  public T getNew() { ... }
}
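A hypothetical usage sketch of this API, for the Node type from the inheritance-graph example:

ObjectPool<Node> pool = new ObjectPool<Node>(3, true); // 3 Node objects, null included
Node fresh = pool.getNew();  // a pool object not returned by any earlier getNew()
Node any   = pool.getAny();  // any pool object (possibly null, since null is included)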

Page 35: Systematic Software Testing Using Test Abstractions

UDITA Implementation

Implemented in Java PathFinder (JPF):
An explicit-state model checker from NASA.
A JVM with backtracking capabilities.
Has a non-deterministic call: Verify.getInt(int lo, int hi).

Default/eager generation is too slow.
Efficient generation by delayed choice execution.

Publicly available JPF extension: http://babelfish.arc.nasa.gov/trac/jpf/wiki/projects/jpf-delayed

Documentation on UDITA web page

http://mir.cs.illinois.edu/udita
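A minimal sketch of JPF's non-deterministic choice through the Verify API (the package name is an assumption; recent JPF versions use gov.nasa.jpf.vm.Verify, older ones gov.nasa.jpf.jvm.Verify):

import gov.nasa.jpf.vm.Verify;

public class ChoiceDemo {
  public static void main(String[] args) {
    int x = Verify.getInt(0, 3);       // JPF backtracks here, exploring x = 0, 1, 2, 3
    boolean b = Verify.getBoolean();   // explores both false and true
    System.out.println(x + " " + b);   // executed once per combination under JPF
  }
}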

Page 36: Systematic Software Testing Using Test Abstractions

Delayed Choice Execution

Postpone choice of values until needed

Lightweight symbolic execution:
Avoids problems with full-blown symbolic execution.
Even combined symbolic and concrete execution has problems, especially for complex object graphs.

Eager:
int x = getInt(0, N);   // non-deterministic choice made immediately
…
y = x;
…
… = y; // non-copy use

Delayed:
int x = Susp(0, N);     // non-deterministic choice suspended
…
y = x;
…
force(y);               // value chosen only at the first non-copy use
… = y; // non-copy use
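A toy illustration of the delayed-choice idea (not UDITA's implementation): the suspended value remembers only its bounds, and a concrete value is chosen the first time it is actually needed; the chooser callback stands in for the model checker's backtracking choice point.

class Susp {
  private final int lo, hi;
  private final java.util.function.IntBinaryOperator chooser;
  private Integer value;                 // null until forced

  Susp(int lo, int hi, java.util.function.IntBinaryOperator chooser) {
    this.lo = lo; this.hi = hi; this.chooser = chooser;
  }

  int force() {                          // pick a concrete value only on first non-copy use
    if (value == null) value = chooser.applyAsInt(lo, hi);
    return value;
  }
}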

Page 37: Systematic Software Testing Using Test Abstractions

Object Pools

Eager implementation of getNew/getAny by reduction to (eager) getInt.
Simply making getInt delayed does not work, because getNew is stateful.

We developed a novel algorithm for delayed execution of object pools:
Gives results equivalent to the eager implementation.
Polynomial-time algorithm to check satisfiability of getNew/getAny constraints (disequality from a set).
Previous work on symbolic execution typically encodes into constraints whose satisfiability is NP-hard (disjunctions).
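A toy illustration of why such constraints are cheap to check (not UDITA's algorithm): a conjunction of disequalities "x differs from every object in excluded", over a finite pool, is satisfiable exactly when the pool contains some object outside the excluded set, which is a linear scan.

static <T> boolean satisfiable(java.util.Set<T> poolObjects, java.util.Set<T> excluded) {
  for (T candidate : poolObjects)
    if (!excluded.contains(candidate)) return true;  // candidate satisfies every disequality
  return false;
}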

Page 38: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 39: Systematic Software Testing Using Test Abstractions

Evaluation

UDITA:
Compared Delayed and Eager execution.
Compared with a previous Generating approach.
Compared with Pex (white-box).

Case studies on testing with test abstractions:
Some results with filtering test abstractions.
Some results with generating test abstractions.
Tested refactoring engines in Eclipse and NetBeans, Java compilers, JPF, and UDITA.

Page 40: Systematic Software Testing Using Test Abstractions

Eager vs. Delayed


Page 41: Systematic Software Testing Using Test Abstractions

Generating vs. UDITA: LOC

Page 42: Systematic Software Testing Using Test Abstractions

Generating vs. UDITA: Time

Page 43: Systematic Software Testing Using Test Abstractions

UDITA vs. Pex

Compared UDITA with Pex:
A state-of-the-art symbolic execution engine from MSR.
Uses a state-of-the-art constraint solver.

Comparison on a set of data structures:
White-box testing: the tool monitors the code under test.
Finding seeded bugs.

Result: object pools help Pex to find bugs.
Summer internship to include object size in Pex.

Page 44: Systematic Software Testing Using Test Abstractions

Some Case Studies with Filtering

Filtering test abstractions used at Microsoft:
Enabled finding numerous bugs in tested apps.

XML tools:
XPath compiler (10 code bugs, test-suite augmentation)
Serialization (3 code bugs, changing spec)

Web-service protocols:
WS-Policy (13 code bugs, 6 problems in informal spec)
WS-Routing (1 code bug, 20 problems in informal spec)

Others:
SSLStream
MSN Authentication

Page 45: Systematic Software Testing Using Test Abstractions

Some Case Studies with Generating

Generating test abstractions used to test Eclipse and NetBeans refactoring engines [FSE’07]

Eight refactorings: target field, method, or class.
Wrote about 50 generators.
Reported 47 bugs:
21 in Eclipse: 20 confirmed by developers
26 in NetBeans: 17 confirmed, 3 fixed, 5 duplicates, 1 won't fix
Found more bugs but did not report those that were duplicates or already fixed.

Parts of that work included in NetBeans

Page 46: Systematic Software Testing Using Test Abstractions

Typical Testing Scenario

Create a model of test inputs (e.g., Java ASTs).
Write filters/generators for valid inputs.
UDITA generates valid inputs.
Translate from the model to actual inputs (pretty-print).
Run on two code bases and compare outputs (a minimal sketch of this comparison step follows the figure below).

Figure: bounds and filters/generators go into UDITA; generated inputs are pretty-printed (here, an XML document of three books and the XPath query /library/book[@year<2003]/title), fed to two code bases ("one code" and "another code"), and the two outputs (<title>T1</title><title>T2</title>) are compared: equal means pass, different means fail.
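A minimal sketch of the comparison step, assuming the two implementations are wrapped as string-to-string functions and the generated model has already been pretty-printed to a concrete input:

static boolean differentialCheck(String input,
                                 java.util.function.Function<String, String> oneCode,
                                 java.util.function.Function<String, String> anotherCode) {
  String out1 = oneCode.apply(input);       // run the first implementation
  String out2 = anotherCode.apply(input);   // run the second implementation
  return out1.equals(out2);                 // pass iff the outputs agree
}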

Page 47: Systematic Software Testing Using Test Abstractions

Some More Bugs Found

Eclipse vs. NetBeans refactoring engines:
2 bugs in Eclipse and 2 bugs in NetBeans.

Sun javac compiler vs. Eclipse Java compiler:
2 reports (still unconfirmed) for Sun javac.

Java PathFinder vs. JVM:
6 bugs in JPF (plus 5 more in an older version).

UDITA Delayed vs. UDITA Eager:
Applying the tool to (parts of) itself.
1 bug in UDITA (patched since then).

Page 48: Systematic Software Testing Using Test Abstractions

Example Bug

Generated program:

import java.util.List;

class A implements B, D {
  public List m() {
    List l = null;
    A a = null;
    l.add(a.m());
    return l;
  }
}

interface D { public List m(); }

interface B extends C { public List m(); }

interface C { public List m(); }

Incorrect refactoring:

import java.util.List;

class A implements B, D {
  public List<List> m() {
    List<List<List>> l = null;
    A a = null;
    l.add(a.m());
    return l;
  }
}

interface D { public List<List> m(); }

interface B extends C { public List<List> m(); }

interface C { public List<List> m(); }

Page 49: Systematic Software Testing Using Test Abstractions

Outline

Introduction
Example
Filtering Test Abstractions
Generating Test Abstractions
UDITA: Combined Filtering and Generating
Evaluation
Conclusions

Page 50: Systematic Software Testing Using Test Abstractions

Summary

Testing is important for software reliability:
A necessary step before (and even after) deployment.

Structurally complex data:
Increasingly used in modern systems.
An important challenge for software testing.

Test abstractions:
Proposed model for complex test data.
Adopted in industry.
Used to find bugs in several real-world applications.

Page 51: Systematic Software Testing Using Test Abstractions

Benefits & Costs of Test Abstractions

Benefits:
Test abstractions can find bugs in real code (e.g., at Microsoft or for Eclipse/NetBeans/javac/JPF/UDITA).
Arguably better than manual testing.

Costs

1. Human time for writing test abstractions (filters or generators) [UDITA, ICSE’10]

2. Machine time for generating/executing/checking tests (human time waiting for the tool) [FASE’09]

3. Human time for inspecting failing tests [FASE’09]

Page 52: Systematic Software Testing Using Test Abstractions

Credits for Generating TA and UDITA

Collaborators from U. of Illinois and other schools:
Brett Daniel
Danny Dig
Kely Garcia
Vilas Jagannath
Yun Young Lee
Milos Gligoric (Belgrade->Illinois)
Tihomir Gvero (Belgrade->EPFL)
Sarfraz Khurshid (UT Austin)
Viktor Kuncak (EPFL)

Funding:
NSF CCF-0746856, CNS-0615372, CNS-0613665
Microsoft gift

Page 53: Systematic Software Testing Using Test Abstractions

Test Abstractions

Test abstractions:
Proposed model for structurally complex test data.
Adopted in industry.
Used to find bugs in several real-world applications (e.g., at Microsoft, Eclipse/NetBeans/javac/JPF/UDITA).

UDITA: latest approach for test abstractions:
Easier to write test abstractions.
Novel algorithm for faster generation.
Publicly available JPF extension:

http://mir.cs.illinois.edu/udita