Improving Software Dependability via Cooperative Testing and Analysis
Tao Xie, Department of Computer Science, North Carolina State University


Page 1: via Cooperative Testing and Analysis

Improving Software Dependability
via Cooperative Testing and Analysis

Tao Xie
Department of Computer Science
North Carolina State University

Page 2: via Cooperative Testing and Analysis

Software Dependability Matters

Loss of Money: Software faults cost the U.S. economy about $59.5 billion each year (0.6% of GDP) [NIST 02]

Loss of Life: Faulty medical devices caused 30,000 deaths and 600,000 injuries (1985-2005), with likely 8% due to software faults [FDA 06]


Page 3: via Cooperative Testing and Analysis

Improving Software Dependability

Titles of Major Conference Pubs (2005-Present)

Page 4: via Cooperative Testing and Analysis

Improving Software Dependability

Testing & Analysis

Analytics

Reliability

ICSE 12a, ICSE 09a, ICSE 08, ICSE 05, FSE 09, FSE 07, FSE 12b, ASE 11b, ASE 10, ASE 09a, ASE 09b, ASE 08a, ASE 07, ECOOP 09, WWW 13

ICSE 11, ICSE 10a, ICSE 10b, ICSE 09b, ICSE 07, FSE 10, FSE 12c, ISSTA 11, ISSTA 10, ISSTA 09, ASE 11a, ASE 08b, ASE 06, OOPSLA 11, ECOOP 06

Performance
ICSE 12b
SIGMETRICS 08

Security/Privacy
FSE 11, SIGMETRICS 08, WWW 07
FSE 12a

Major Conference Pubs (2005-Present)
10 ICSE, 7 FSE, 3 ISSTA, 9 ASE, 3 OOPSLA/ECOOP

Page 5: via Cooperative Testing and Analysis

Improving Software Dependability


Artifacts Under Analysis
• DB apps
• GUI apps
• Web/SOA apps
• Mobile apps
• Cloud apps
• Analytics systems
• AC/Firewall policies
• API docs
• Bug reports
• Requirements doc
• Execution traces
• …

Page 6: via Cooperative Testing and Analysis

Impact/Leadership: Software Testing

We produce; others use

We lead; others follow

Tutorials

Programming Contests

Call for Proposals

Keynotes

2010

Page 7: via Cooperative Testing and Analysis

Redundant Test Detection for Parasoft Jtest

Rostra identified 90% of the tests generated by Parasoft Jtest 4.5 as redundant. Parasoft fixed the issue in later versions after seeing our results.

ASE 2004

Page 8: via Cooperative Testing and Analysis

Fitnex Path-Exploration Strategy for Pex

Pex download counts in the initial 20 months of release:
• Academic: 17,366
• Industrial: 13,022
• Total: 30,388

“It has saved me two major bugs (not caught by normal unit tests) that would have taken at least a week to track down and fix normally plus a few smaller issues so I'm a big proponent of Pex.”

Pex detected various bugs (including a serious bug) in a core .NET component that had already been extensively tested over 5 years by 40 testers and is used by thousands of developers and millions of end users.

Released since 2008

Page 9: via Cooperative Testing and Analysis

Coding Duel Games for Pex for Fun

1,129,019 clicked 'Ask Pex!'

www.pexforfun.com

“It really got me *excited*. The part that got me most is about spreading interest in teaching CS: I do think that it’s REALLY great for teaching | learning!”

“I used to love the first person shooters and the satisfaction of blowing away a whole team of Noobies playing Rainbow Six, but this is far more fun.”

“I’m afraid I’ll have to constrain myself to spend just an hour or so a day on this really exciting stuff, as I’m really stuffed with work.”

Released since 2010

Page 10: via Cooperative Testing and Analysis

Access Control Policy Tool (ACPT)

ACPT beta release is being beta-tested with >130 users/organizations

Beta users: NSA, MITRE, DISA, NOAA, SAIC, DNI, Pacific Northwest National Lab, Fermilab, BAE Systems, Lockheed Martin, Raytheon, Boeing, SMI, VA government, Johns Hopkins U., …

“There are many valuable features in the ACPT and we hope to recommend it to our vendors to verify and validate the policies they author.”

Released since 2009

“ACPT provides all the adequate functionality for the verification of access control policies against static constraints.”

Page 11: via Cooperative Testing and Analysis

Impact/Leadership: Software Testing

We produce; others use

We lead; others follow

Tutorials

Programming Contests

Call for Proposals

Keynotes

Scalable Unit Test Generation
Testing + Human Factors

2010

Page 12: via Cooperative Testing and Analysis

Impact/Leadership: Software Analytics

We produce; others use

We lead; others follow

7 years of ICSE Tutorials

XIAO
StackMine

2007

2013

2014

2013

Page 13: via Cooperative Testing and Analysis

StackMine: Performance Debugging in the Large


“For 1000 traces, we believe the tool saves us 4-6 weeks of time to create new signatures, which is quite a significant productivity boost.”

- from Development Manager in Windows

Since December 2010, continuously used by a Microsoft team for performance analysis (Windows mini-hang, etc.)

StackMine

Page 14: via Cooperative Testing and Analysis

XIAO: Code Clone Detection for Security + Refactoring

2012

XIAO available in Visual Studio 2012
• Searching similar snippets for fixing bug once
• Finding refactoring opportunity

XIAO Clone Search integrated in workflow @Microsoft Security Response Center (MSRC)

“run [XIAO] for every MSRC case to find any instance of the vulnerable code in any shipping product. This system is the one that found several of the copies of CVE-2011-3402 that we are now addressing with MS12-034.” -MS Security Research & Defense blog

Page 15: via Cooperative Testing and Analysis

Impact/Leadership: Software Analytics

We produce; others use

We lead; others follow

7 years of ICSE Tutorials

XIAO
StackMine

Software Analytics (insightful and actionable info for software practitioners)

2007

2013

2014

2013

Page 16: via Cooperative Testing and Analysis


Cooperative Testing and Analysis


Page 17: via Cooperative Testing and Analysis

Global Trend: Replace Human or Get Human Out of the Loop

Google’s driverless car

Microsoft's instant voice translation tool

IBM Watson as Jeopardy! player

Page 18: via Cooperative Testing and Analysis

Unit Test Generation: Replace Human or Get Human Out of the Loop

Class Under Test

class Graph {
    …
    public void AddVertex(Vertex v) {
        vertices.Add(v);
    }
    public Edge AddEdge(Vertex v1, Vertex v2) {
        …
    }
}

Generated Unit Tests

void test1() {
    Graph ag = new Graph();
    Vertex v1 = new Vertex(0);
}

void test2() {
    Graph ag = new Graph();
    Vertex v1 = new Vertex(0);
    ag.AddEdge(v1, v1);
}

Manual Test Generation: Tedious, Missing Special/Corner Cases, …

Page 19: via Cooperative Testing and Analysis

State-of-the-Art/Practice Test Generation Tools

Running Symbolic PathFinder ...
…
====================================================== results
no errors detected
====================================================== statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4, end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884

Page 20: via Cooperative Testing and Analysis

Challenges Faced by Test Generation Tools

Ex: Dynamic Symbolic Execution (DSE) / Concolic Testing [Godefroid et al. 05][Sen et al. 05][Tillmann et al. 08]
• Instrument code to explore feasible paths
• Challenge: path explosion

• object-creation problems (OCP): 65%
• external-method call problems (EMCP): 27%

Total block coverage achieved is 50%, lowest coverage 16%.

OCP: when desirable receiver or argument objects are not generated
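To make "explore feasible paths" concrete, here is a minimal sketch. `Classify` is a hypothetical method under test, and the three inputs are picked by hand; a DSE tool would instead run one concrete input, record the branch conditions along the executed path, negate one of them, and ask a constraint solver for an input that takes the other side.

```csharp
using System;
using System.Collections.Generic;

static class PathDemo
{
    // Hypothetical method under test with three feasible paths.
    // `trace` records the branch conditions taken, i.e. the path condition.
    public static string Classify(int x, List<string> trace)
    {
        if (x > 10)
        {
            trace.Add("x > 10");
            if (x % 2 == 0)
            {
                trace.Add("x % 2 == 0");
                return "big-even";
            }
            trace.Add("x % 2 != 0");
            return "big-odd";
        }
        trace.Add("x <= 10");
        return "small";
    }

    static void Main()
    {
        // One hand-picked input per path (assumption: a real tool derives
        // these from the recorded path conditions, not from a fixed list).
        foreach (int x in new[] { 0, 11, 12 })
        {
            var trace = new List<string>();
            string result = Classify(x, trace);
            Console.WriteLine($"x={x}: [{string.Join(" && ", trace)}] -> {result}");
        }
    }
}
```

Each additional independent branch can double the number of paths, which is the path-explosion challenge named above.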

Page 21: via Cooperative Testing and Analysis

Example Object-Creation Problem [OOPSLA 11]

A graph example from the QuickGraph library

Includes two classes
• Graph
  ▪ AddVertex
  ▪ AddEdge: requires both vertices to be in graph
• DFSAlgorithm

00: class Graph { …
03:   public void AddVertex (Vertex v) {
04:     vertices.Add(v); // B1
      }
06:   public Edge AddEdge (Vertex v1, Vertex v2) {
07:     if (!vertices.Contains(v1))
08:       throw new VNotFoundException("");
09:     // B2
10:     if (!vertices.Contains(v2))
11:       throw new VNotFoundException("");
12:     // B3
14:     Edge e = new Edge(v1, v2);
15:     edges.Add(e);
      }
    }

    //DFS:DepthFirstSearch
18: class DFSAlgorithm { …
23:   public void Compute (Vertex s) { ...
24:     if (graph.GetEdges().Size() > 0) { // B4
25:       isComputed = true;
26:       foreach (Edge e in graph.GetEdges()) {
27:         ... // B5
28:       }
29:     }
      }
    }

Page 22: via Cooperative Testing and Analysis

Example Object-Creation Problem [OOPSLA 11]

Test target: Cover true branch (B4) of Line 24

Desired object state: graph should include at least one edge

Target sequence:

Graph ag = new Graph();
Vertex v1 = new Vertex(0);
Vertex v2 = new Vertex(1);
ag.AddVertex(v1);
ag.AddVertex(v2);
ag.AddEdge(v1, v2);
DFSAlgorithm algo = new DFSAlgorithm(ag);
algo.Compute(v1);
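The target sequence can be tried end to end with a self-contained sketch. The class bodies below are assumptions filled in around the slide fragments: `Vertex`, `Edge`, the `return e;` in `AddEdge`, and the public `IsComputed` property are added for illustration, and `List<T>.Count` stands in for the slide's `Size()`. This is not the QuickGraph source.

```csharp
using System;
using System.Collections.Generic;

class Vertex
{
    public readonly int Id;
    public Vertex(int id) { Id = id; }
}

class Edge
{
    public readonly Vertex Source, Target;
    public Edge(Vertex source, Vertex target) { Source = source; Target = target; }
}

class VNotFoundException : Exception
{
    public VNotFoundException(string message) : base(message) { }
}

class Graph
{
    private readonly List<Vertex> vertices = new List<Vertex>();
    private readonly List<Edge> edges = new List<Edge>();

    public void AddVertex(Vertex v) { vertices.Add(v); }            // B1

    public Edge AddEdge(Vertex v1, Vertex v2)
    {
        if (!vertices.Contains(v1)) throw new VNotFoundException("");
        if (!vertices.Contains(v2)) throw new VNotFoundException("");
        Edge e = new Edge(v1, v2);                                   // B3
        edges.Add(e);
        return e;
    }

    public List<Edge> GetEdges() { return edges; }
}

class DFSAlgorithm
{
    private readonly Graph graph;
    public bool IsComputed { get; private set; }   // exposed so the demo can observe B4

    public DFSAlgorithm(Graph graph) { this.graph = graph; }

    public void Compute(Vertex s)
    {
        if (graph.GetEdges().Count > 0)            // B4
        {
            IsComputed = true;
            foreach (Edge e in graph.GetEdges())
            {
                // B5: traversal body is elided on the slide
            }
        }
    }
}

static class Demo
{
    static void Main()
    {
        // A short sequence never reaches B4: AddEdge rejects a vertex
        // that was not added to the graph first.
        try
        {
            var v = new Vertex(0);
            new Graph().AddEdge(v, v);
        }
        catch (VNotFoundException)
        {
            Console.WriteLine("AddEdge rejected a vertex that is not in the graph");
        }

        // Target sequence from the slide.
        Graph ag = new Graph();
        Vertex v1 = new Vertex(0);
        Vertex v2 = new Vertex(1);
        ag.AddVertex(v1);
        ag.AddVertex(v2);
        ag.AddEdge(v1, v2);
        DFSAlgorithm algo = new DFSAlgorithm(ag);
        algo.Compute(v1);
        Console.WriteLine("B4 covered: " + algo.IsComputed);
    }
}
```

The point of the example is the ordering constraint: a tool has to discover that two `AddVertex` calls must precede `AddEdge` before `Compute` can take the true branch.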

Page 23: via Cooperative Testing and Analysis

Challenges Faced by Test Generation Tools

Ex: Dynamic Symbolic Execution (DSE) / Concolic Testing [Godefroid et al. 05][Sen et al. 05][Tillmann et al. 08]
• Instrument code to explore feasible paths
• Challenge: path explosion

• object-creation problems (OCP): 65%
• external-method call problems (EMCP): 27%

Total block coverage achieved is 50%, lowest coverage 16%.

EMCP: typically DSE instruments or explores only methods in the project under test. Third-party API external methods (network, I/O, ...):
• too many paths
• uninstrumentable

Page 24: via Cooperative Testing and Analysis

Example External-Method Call Problems (EMCP)
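The slide's own example did not survive extraction, so the following is an illustrative stand-in rather than the original. `ConfigLoader.Load` is a hypothetical method whose branch outcome is decided by the file system through `File.Exists`, which a tool exploring only the project's own code cannot steer.

```csharp
using System;
using System.IO;

static class ConfigLoader
{
    // Hypothetical method under test. Which branch runs depends on the
    // state of the file system, not only on the value of `path`.
    public static string Load(string path)
    {
        if (File.Exists(path))              // external-method call
            return File.ReadAllText(path);  // reached only if the file exists
        return "default";
    }

    static void Main()
    {
        // A freshly generated random file name is very unlikely to exist,
        // so this call takes the false branch.
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Console.WriteLine(Load(path));
    }
}
```

Varying `path` alone rarely covers the true branch; the environment has to be prepared or the external call replaced.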



Page 26: via Cooperative Testing and Analysis

What to Do Next?

2010 Dagstuhl Seminar 10111: Practical Software Testing: Tool Automation and Human Factors

Page 27: via Cooperative Testing and Analysis

Conventional Wisdom: Improve Automation Capability

Tackling object-creation problems
• Seeker [OOPSLA 11], MSeqGen [ESEC/FSE 09], Covana [ICSE 2011], OCAT [ISSTA 10], Evacon [ASE 08], Symclat [ASE 06]
• Still not good enough (at least for now)!
  ▪ Seeker (52%) > Pex/DSE (41%) > Randoop/random (26%)

Tackling external-method call problems
• DBApp Testing [ESEC/FSE 11], [ASE 11]
• CloudApp Testing [IEEE Soft 12]
• Deal with only common environment APIs

@NCSU ASE

Page 28: via Cooperative Testing and Analysis

Unconventional Wisdom: Bring Human in the Loop

Malaysia Airlines Flight 124 @2005

Lisanne Bainbridge, "Ironies of Automation", Automatica 1983.

“The increased interest in human factors among engineers reflects the irony that the more advanced a control system is, so the more crucial may be the contribution of the human operator.”


Page 30: via Cooperative Testing and Analysis

Human Can Help! Object-Creation Problems (OCP)

Tackle object-creation problems with Factory Methods
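A minimal sketch of what such a factory method might look like for the graph example. `GraphFactory.CreateGraphWithEdge` and the small `Graph`, `Vertex`, and `Edge` classes are hypothetical stand-ins, and the tool-specific step of registering the factory with the test generator is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

class Vertex
{
    public readonly int Id;
    public Vertex(int id) { Id = id; }
}

class Edge
{
    public readonly Vertex Source, Target;
    public Edge(Vertex source, Vertex target) { Source = source; Target = target; }
}

class Graph
{
    private readonly List<Vertex> vertices = new List<Vertex>();
    private readonly List<Edge> edges = new List<Edge>();

    public int EdgeCount { get { return edges.Count; } }

    public void AddVertex(Vertex v) { vertices.Add(v); }

    public void AddEdge(Vertex v1, Vertex v2)
    {
        if (!vertices.Contains(v1) || !vertices.Contains(v2))
            throw new ArgumentException("both vertices must be in the graph");
        edges.Add(new Edge(v1, v2));
    }
}

static class GraphFactory
{
    // Written by a human: encodes the method sequence the tool could not
    // find on its own. The tool then only has to choose primitive ids.
    public static Graph CreateGraphWithEdge(int id1, int id2)
    {
        var g = new Graph();
        var v1 = new Vertex(id1);
        var v2 = new Vertex(id2);
        g.AddVertex(v1);
        g.AddVertex(v2);
        g.AddEdge(v1, v2);
        return g;
    }

    static void Main()
    {
        Graph g = CreateGraphWithEdge(0, 1);
        Console.WriteLine("edges: " + g.EdgeCount);
    }
}
```

The design choice is to turn an object-creation problem into a primitive-input problem, which constraint-based tools handle well.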

Page 31: via Cooperative Testing and Analysis

Human Can Help! External-Method Call Problems (EMCP)

Tackle external-method call problems with Mock Methods or Method Instrumentation

Mocking System.IO.File.ReadAllText
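One way to picture a mock for `System.IO.File.ReadAllText`, as a sketch only: the code under test reads text through a delegate, and the test supplies a replacement. `SettingsParser` and its `verbose=true` format are hypothetical, and this delegate seam is a hand-written substitute for whatever interception mechanism the tool itself provides.

```csharp
using System;

// Hypothetical class under test. It reads text through a delegate
// instead of calling System.IO.File.ReadAllText directly.
class SettingsParser
{
    private readonly Func<string, string> readAllText;

    public SettingsParser(Func<string, string> readAllText)
    {
        this.readAllText = readAllText;
    }

    public bool IsVerbose(string path)
    {
        string text = readAllText(path);
        return text.Contains("verbose=true");
    }
}

static class Demo
{
    static void Main()
    {
        // Production wiring would pass System.IO.File.ReadAllText.
        // The mocks below return chosen content and never touch the disk.
        var verbose = new SettingsParser(_ => "verbose=true");
        var quiet = new SettingsParser(_ => "verbose=false");

        Console.WriteLine(verbose.IsVerbose("ignored.cfg")); // True
        Console.WriteLine(quiet.IsVerbose("ignored.cfg"));   // False
    }
}
```

With the file content under the test's control, both branches of `IsVerbose` callers become reachable without preparing files on disk.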

Page 32: via Cooperative Testing and Analysis

Automation in Software Testing

2010 Dagstuhl Seminar 10111: Practical Software Testing: Tool Automation and Human Factors

Page 33: via Cooperative Testing and Analysis

Automation in Software Testing

Dagstuhl Seminar 10111: Practical Software Testing: Tool Automation and Human Factors

Human Factors

Page 34: via Cooperative Testing and Analysis

Cooperative Software Testing and Analysis

Human-Assisted Computing
• Driver: tool
• Helper: human
• Ex. Covana [ICSE 2011]

Human-Centric Computing
• Driver: human
• Helper: tool
• Ex. Pex for Fun [ICSE 2013 SEE]

Interfaces are important. Contents are important too!

Page 35: via Cooperative Testing and Analysis

Example Problems Faced by Tools

Symptoms

(Likely) Causes
• external-method call problems (EMCP): all executed external-method calls
• object-creation problems (OCP): all non-primitive program inputs/fields

Page 36: via Cooperative Testing and Analysis

Technical Challenges

Causal analysis: tracing between symptoms and (likely) causes
• Reduce cost of human consumption
  ▪ reduction of #(likely) causes
  ▪ diagnosis of each cause

Solution construction: fixing suspected causes
• Reduce cost of human contribution
  ▪ measurement of solution goodness
  ▪ inner iteration of human-tool cooperation!

Page 37: via Cooperative Testing and Analysis

Black-Box Systematic Debugging Not Feasible

Symptoms

(Likely) Causes
• external-method call problems (EMCP)
• object-creation problems (OCP)

Given symptom s
foreach (c in LikelyCauses) {
    Fix(c);
    if (IsObserved(s)) RelevantCauses.add(c)
}

Page 38: via Cooperative Testing and Analysis

White-Box Causal Analysis: Covana

Goal: Precisely identify the problems (causes) that prevent a tool from covering a statement (symptom)

Insight: A partially covered conditional has a data dependency on a real problem

[ICSE 11]

From xUnit

Page 39: via Cooperative Testing and Analysis

EMCP with Data Dependency on Program Inputs [Inputs EMCP]

Data Dependencies

Consider only EMCPs whose arguments have data dependencies on program inputs
▪ Fixing such problem candidates facilitates test-generation tools

From xUnit

Page 40: via Cooperative Testing and Analysis

Symptom with Data Dependency on EMCP [EMCP Symptom]

Symptom Expression: return(File.Exists) == true

Element of EMCP Candidate: return(File.Exists)

Conditional in Line 1 has data dependency on File.Exists

Partially-covered conditionals have data dependencies on EMCP candidates
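The two data-dependency checks can be read as a two-stage filter over EMCP candidates. The sketch below only models that filter: the `EmcpCandidate` type and its hand-set flags are hypothetical, `Logger.Write` is an invented example of a call with no effect on any branch, and a real tool would compute both dependencies through symbolic execution and data dependence analysis rather than take them as inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one external-method call observed at run time.
class EmcpCandidate
{
    public string Method = "";
    public bool ArgsDependOnInputs;     // arguments depend on program inputs
    public bool BranchDependsOnReturn;  // a partially covered conditional depends on the result
}

static class CovanaFilterSketch
{
    // Keep only candidates that pass both dependency checks.
    public static List<string> Report(IEnumerable<EmcpCandidate> candidates)
    {
        return candidates
            .Where(c => c.ArgsDependOnInputs)
            .Where(c => c.BranchDependsOnReturn)
            .Select(c => c.Method)
            .ToList();
    }

    static void Main()
    {
        var candidates = new List<EmcpCandidate>
        {
            // Reported: input-dependent argument, and a branch depends on the result.
            new EmcpCandidate { Method = "File.Exists", ArgsDependOnInputs = true, BranchDependsOnReturn = true },
            // Filtered: no partially covered conditional depends on it.
            new EmcpCandidate { Method = "Logger.Write (hypothetical)", ArgsDependOnInputs = true, BranchDependsOnReturn = false },
            // Filtered: constant argument, so no dependency on program inputs.
            new EmcpCandidate { Method = "OpenSubKey", ArgsDependOnInputs = false, BranchDependsOnReturn = true },
        };

        foreach (string method in Report(candidates))
            Console.WriteLine(method); // File.Exists
    }
}
```

The `OpenSubKey` row shows how the first check can produce a false negative: a call with a constant argument is dropped even though its result affects coverage.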

Page 41: via Cooperative Testing and Analysis

Example EMCP being Filtered [EMCP !Symptom]

From xUnit

Page 42: via Cooperative Testing and Analysis

Tool Architecture of Covana

[Architecture diagram. Labels: Program; Generated Test Inputs; Forward Symbolic Execution; Runtime Events; Runtime Information; Coverage; Problem Candidate Identification; Problem Candidates; Data Dependence Analysis; Identified Problems. Step annotations: [Inputs EMCP], [EMCP Symptom]]

Page 43: via Cooperative Testing and Analysis

Evaluation – Subjects and Setup

Subjects:
• xUnit: unit testing framework for .NET
  ▪ 223 classes and interfaces with 11.4 KLOC
• QuickGraph: C# graph library
  ▪ 165 classes and interfaces with 8.3 KLOC

Evaluation setup:
• Apply Pex to generate tests for program under test
• Feed the program and generated tests to Covana
• Compare baseline solution and Covana

Page 44: via Cooperative Testing and Analysis

Evaluation – Research Questions

RQ1: How effective is Covana in identifying the two main types of problems, EMCPs and OCPs?

RQ2: How effective is Covana in pruning irrelevant problem candidates of EMCPs and OCPs?


Page 45: via Cooperative Testing and Analysis

Evaluation – RQ1: Problem Identification

Covana identifies
• 43 EMCPs with only 1 false positive and 2 false negatives
• 155 OCPs with 20 false positives and 30 false negatives

Page 46: via Cooperative Testing and Analysis

Evaluation – RQ2: Irrelevant-Problem-Candidate Pruning

Covana prunes
• 97% (1567 of 1610) of EMCP candidates, with 1 false positive and 2 false negatives
• 66% (296 of 451) of OCP candidates, with 20 false positives and 30 false negatives
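As a quick consistency check, using only the counts on the RQ1 and RQ2 slides, the candidates that survive pruning equal the numbers of identified problems reported for RQ1:

```latex
\begin{align*}
\text{EMCP: } & \frac{1567}{1610} \approx 97.3\%, & 1610 - 1567 &= 43 \text{ candidates remaining} \\
\text{OCP: }  & \frac{296}{451} \approx 65.6\% \approx 66\%, & 451 - 296 &= 155 \text{ candidates remaining}
\end{align*}
```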


Page 49: via Cooperative Testing and Analysis

Behind the Scenes of Pex for Fun

Secret Implementation

class Secret {
    public static int Puzzle(int x) {
        if (x <= 0) return 1;
        return x * Puzzle(x-1);
    }
}

Player Implementation

class Player {
    public static int Puzzle(int x) {
        return x;
    }
}

class Test {
    public static void Driver(int x) {
        if (Secret.Puzzle(x) != Player.Puzzle(x))
            throw new Exception("Mismatch");
    }
}

behavior: Secret Impl == Player Impl
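The duel can be run as a small program. In this sketch, `Duel.FindMismatch` is a hypothetical helper that tries every input in a small range; that exhaustive loop is a simple substitute for the tool's search and is not how Pex finds mismatching inputs.

```csharp
using System;

static class Secret
{
    public static int Puzzle(int x)
    {
        if (x <= 0) return 1;
        return x * Puzzle(x - 1);
    }
}

static class Player
{
    // The player's current guess at the secret behavior.
    public static int Puzzle(int x) { return x; }
}

static class Duel
{
    // Returns the first input in [lo, hi] on which the two implementations
    // disagree, or null if they agree on the whole range.
    public static int? FindMismatch(int lo, int hi)
    {
        for (int x = lo; x <= hi; x++)
            if (Secret.Puzzle(x) != Player.Puzzle(x))
                return x;
        return null;
    }

    static void Main()
    {
        int? x = FindMismatch(-3, 10);
        if (x.HasValue)
            Console.WriteLine($"Mismatch at x={x.Value}: secret={Secret.Puzzle(x.Value)}, player={Player.Puzzle(x.Value)}");
        else
            Console.WriteLine("No mismatch found in range");
    }
}
```

Each reported mismatch is a hint about the secret; the player revises `Player.Puzzle` and asks again until no mismatch remains.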

Page 50: via Cooperative Testing and Analysis

Coding Duel Competition @ICSE 2011

http://pexforfun.com/icse2011

Page 51: via Cooperative Testing and Analysis

Coding Duels for Automatic Grading @Grad Software Engineering Course

Especially valuable in Massive Open Online Courses (MOOC)

http://pexforfun.com/gradsofteng

Page 52: via Cooperative Testing and Analysis

Human-Human Cooperation: Pex for Fun (Crowdsourcing)

Internet

class Secret {
    public static int Puzzle(int x) {
        if (x <= 0) return 1;
        return x * Puzzle(x-1);
    }
}

Everyone can contribute
• Coding duels
• Duel solutions

Page 53: via Cooperative Testing and Analysis

Human-Human Cooperation: Puzzle Games (Crowdsourcing)

Internet

Puzzle Games Made from Difficult Constraints or Object-Creation Problems

Page 54: via Cooperative Testing and Analysis

Summary: Cooperative Testing and Analysis

Human-Assisted Computing
• Driver: tool
• Helper: human
• Ex. Covana [ICSE 2011]

Human-Centric Computing
• Driver: human
• Helper: tool
• Ex. Pex for Fun [ICSE 2013 SEE]

Page 55: via Cooperative Testing and Analysis

Future Directions

Cooperative testing and analysis
• Tool-Human, Human-Human, Tool-Tool

“Big data” in software analytics
• Text analytics: data from users, developers, forums, …
• Dependability of data-analytic systems

“Big system” in software testing and analysis
• Requirements/design specs, human assistance
• Emerging/critical systems, e.g.,
  ▪ Cloud/mobile, game apps, online services
  ▪ Cyber-physical systems (medical-device software)
• Dependability attributes: security, energy, …

Educational software engineering

Impact: Research, Industry, Gov/Society

Page 56: via Cooperative Testing and Analysis

Acknowledgments

Former/current students
• Mithun Acharya (PhD 2009), ABB Research USA
• Suresh Thummalapenta (PhD 2010), IBM Research India
• Kunal Taneja (PhD 2012), Accenture Labs USA
• Xiaoyin Wang (PhD@PKU, 2012), co-supervised, Postdoc@UC Berkeley
• Hao Zhong (PhD@PKU, 2009), co-supervised, Assist Prof@CAS China
• JeeHyun Hwang, Sihan Li, Rahul Pandita, Kiran Shakya, Yoonki Song, Xusheng Xiao, Wei Yang, Xiao Yu

Industrial/agency/academic collaborators, e.g.,
• Microsoft Research Pex group/Software Analytics group
• NIST Computer Security Division
• FDA Laboratory of Software Engineering

Support from NSF, ARO, NIST, NSA, Microsoft, IBM, ABB

Page 57: via Cooperative Testing and Analysis

Thank you!

Questions?

https://sites.google.com/site/asergrp/


Page 59: via Cooperative Testing and Analysis

Example False Negatives of EMCPs

• OpenSubKey() receives a constant string as its argument, so it has no data dependencies on program inputs
• False branch at Line 5 is not covered
• Use return value of OpenSubKey()
• Concrete argument for external-method calls: using a constant string to access the external environment affects achieved coverage

Page 60: via Cooperative Testing and Analysis

Example Identified EMCP

Branch statement at Line 1 has a data dependency on File.Exists at Line 1

False branch at Line 1 is not covered

File.Exists is reported

ParseCommandLine, Pex achieved 44/154 (28.57%),

Page 61: via Cooperative Testing and Analysis

Human-Human/Tool Cooperation: Performance Debugging in the Large

StackMine [Han et al. ICSE 12b]

[Workflow diagram. Stages: Internet; Trace collection; Trace Storage; Trace analysis; Bug filing; Bug Database; Bug update; Pattern Matching; Problematic Pattern Repository. Callouts: "Key to bug discovery", "Bottleneck of scalability". Analyst questions: "How many bugs are still unknown?", "Which trace should I investigate first?"]

Page 62: via Cooperative Testing and Analysis

Intuition: Browser Tab Creation Example

What happens behind a typical UI-delay?

[Timeline figure, axis: Time. UI thread states: CPU, Wait (UI Delay), Ready, CPU, Wait, CPU. Worker thread states: Ready, CPU (unexpected long execution). Also shown: Underlying Disk I/O. Callstack kinds: ReadyThread callstacks, Wait callstacks, CPU sampled callstacks.]

Wait callstack
ntdll.dll!_RtlUserThreadStart
Browser!Main
…
ntdll.dll!LdrLoadDll
…
nt!AccessFault
nt!PageFault
…

Wait callstack
ntdll!UserThreadStart
Browser!Main
…
Browser!OnBrowserCreatedAsyncCallback
…
BrowserUtil!ProxyMaster::GetOrCreateSlave
BrowserUtil!ProxyMaster::ConnectToObject
…
rpc!ProxySendReceive
…
wow64!RunCpuSimulation
wow64cpu!WaitForMultipleObjects32
wow64cpu!CpupSyscallStub
…

ReadyThread callstack
ntdll.dll!_RtlUserThreadStart
…
rpc!LrpcIoComplete
…
user32!PostMessage
…
win32k!SetWakeBit
nt!KeSetEvent
…

ReadyThread callstack
nt!KiRetireDpcList
nt!KiExecuteAllDpcs
…
nt!IopfCompleteRequest
…
nt!KeSetEvent
…

CPU sampled callstack (shown three times in the figure)
ntdll!UserThreadStart
…
ntdll!TppWorkerThread
…
ole!CoCreateInstance
…
ole!OutSerializer::UnmarshalAtIndex
ole!CoUnmarshalInterface
…

Page 63: via Cooperative Testing and Analysis

Pattern Discovery and Investigation Prioritization via Callstack Mining and Clustering

[Pipeline figure: Traces → Parsing & Filtering → Sequence Pattern Mining → Pattern Clustering → Ranking]

Traces

- ntdll!UserThreadStart
|- ……
| |- ……
| | |- user32!InternalCallWinProc
| | | |- stobject!WakeUp_DeviceMonitor
| | | | |- KernelBase!LoadLibrary
| | | | | |- ……
| | | | | | |- ntdll!OpenFile
| | | | | | | |- ……

- ntdll!UserThreadStart
|- ……
| |- ……
| | |- user32!InternalCallWinProc
| | | |- stobject!WakeUp_DeviceMonitor
| | | | |- kernel32!LoadLibrary
| | | | | |- ……
| | | | | | |- nt!PageRead
| | | | | | | |- ……

- ntdll!_RtlUserThreadStart
|- ……
| |- ……
| | |- user32!InternalCallWinProc
| | | |- stobject!WakeUp_DeviceMonitor
| | | | |- kernel32!LoadLibrary
| | | | | |- KernelBase!LoadLibrary
| | | | | | |- ……
| | | | | | | |- nt!Trap
| | | | | | | | |- nt!AccessFault
| | | | | | | | | |- ……

Callstacks
stobject!WakeUp_DeviceMonitor … KernelBase!LoadLibrary … ntdll!OpenFile

Callstack patterns (counts shown in figure: 709, 412, 259)
stobject!WakeUp_DeviceMonitor … KernelBase!LoadLibrary … nt!AccessFault
stobject!WakeUp_DeviceMonitor … KernelBase!LoadLibrary … nt!PageRead

Pattern clusters (counts shown in figure: 1392, 709, 259, 412)
Frames: ntdll!OpenFile, stobject!WakeUp_DeviceMonitor, KernelBase!LoadLibrary, nt!PageRead, nt!AccessFault

Ranked clusters

Cluster   Hits     Total wait time (ms)
1         94,279   927,824
2         51,107   561,416
3         35,536   3,051,307
…         …

[Han et al. ICSE 12b]
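To give a feel for "mine patterns, then rank", here is a toy sketch. It is not the StackMine algorithm: it simply counts adjacent frame pairs across callstacks and ranks them by hit count, then by accumulated wait time. The wait times are made-up sample values, the frames are treated as adjacent although the slide elides frames between them, and the ranking criterion is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CallstackRankingSketch
{
    // For every adjacent frame pair, count how many stacks contain it
    // and how much wait time those stacks account for.
    public static List<(string Pattern, int Hits, long WaitMs)> Rank(
        IEnumerable<(string[] Frames, long WaitMs)> stacks)
    {
        var stats = new Dictionary<string, (int Hits, long WaitMs)>();
        foreach (var (frames, waitMs) in stacks)
        {
            // Collect each pair once per stack so repeats do not inflate hits.
            var seen = new HashSet<string>();
            for (int i = 0; i + 1 < frames.Length; i++)
                seen.Add(frames[i] + " > " + frames[i + 1]);

            foreach (string pattern in seen)
            {
                stats.TryGetValue(pattern, out var s); // (0, 0) when absent
                stats[pattern] = (s.Hits + 1, s.WaitMs + waitMs);
            }
        }

        return stats
            .Select(kv => (Pattern: kv.Key, Hits: kv.Value.Hits, WaitMs: kv.Value.WaitMs))
            .OrderByDescending(t => t.Hits)
            .ThenByDescending(t => t.WaitMs)
            .ToList();
    }

    static void Main()
    {
        // Frame names come from the slide; wait times are hypothetical.
        var stacks = new List<(string[] Frames, long WaitMs)>
        {
            (new[] { "stobject!WakeUp_DeviceMonitor", "KernelBase!LoadLibrary", "ntdll!OpenFile" }, 30),
            (new[] { "stobject!WakeUp_DeviceMonitor", "KernelBase!LoadLibrary", "nt!PageRead" }, 50),
            (new[] { "stobject!WakeUp_DeviceMonitor", "KernelBase!LoadLibrary", "nt!AccessFault" }, 20),
        };

        foreach (var (pattern, hits, waitMs) in Rank(stacks))
            Console.WriteLine($"{hits} hits, {waitMs} ms: {pattern}");
    }
}
```

In this sample the shared pair `stobject!WakeUp_DeviceMonitor > KernelBase!LoadLibrary` ranks first with 3 hits and 100 ms, which mirrors how a common sub-sequence surfaces as a candidate pattern worth investigating first.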

Page 64: via Cooperative Testing and Analysis

StackMine: Industry Impact

“We believe that the MSRA tool is highly valuable and much more efficient for mass trace (100+ traces) analysis. For 1000 traces, we believe the tool saves us 4-6 weeks of time to create new signatures, which is quite a significant productivity boost.”

- from Development Manager in Windows

Highly effective new issue discovery on Windows mini-hang

Continuous impact on future Windows versions

Page 65: via Cooperative Testing and Analysis

StackMine: Challenges

Combination of expertise
• Generic machine learning tools without domain-knowledge guidance do not work

Highly complex analysis
• Numerous program runtime combinations for triggering performance bugs
• Multi-layer intertwined runtime components from application to kernel

Large-scale trace data
• TBs of trace files and growing
• Millions of events in single trace

Page 66: via Cooperative Testing and Analysis

Tool-Tool Cooperation

Static analysis + dynamic analysis
• Static Checker + Test Generation
• …

Dynamic analysis + static analysis
• Fix generation + fix validation
• …

Static analysis + static analysis
• …

Dynamic analysis + dynamic analysis [ASE 08]
• …

Page 67: via Cooperative Testing and Analysis

Bigger Picture

Machine is better at task set A
• Mechanical, tedious, repetitive tasks, …
• Ex. solving constraints along a long path

Human is better at task set B
• Intelligence, human intent, abstraction, domain knowledge, …
• Ex. local reasoning after a loop, recognizing naming semantics

= A U B

Page 68: via Cooperative Testing and Analysis

State-of-the-Art/Practice Testing Tools

Running Symbolic PathFinder ...
…
====================================================== results
no errors detected
====================================================== statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4, end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884

Tools Typically Don’t Communicate Challenges Faced by Them to Enable Cooperation between Tools and Users

Page 69: via Cooperative Testing and Analysis

Human-Centric Computing

Coding duels at http://www.pexforfun.com/
• Brain exercising/learning while having fun
• Fun: iterative, adaptive/personalized, w/ win criterion
• Abstraction/generalization, debugging, problem solving

Brain exercising

Page 70: via Cooperative Testing and Analysis

Human-Assisted Computing

Motivation
• Tools are often not powerful enough
• Human is good at some aspects that tools are not

Questions
• What difficulties does the tool face?
• How to communicate info to the user to get help?
• How does the user help the tool based on the info?

Iterations to form Feedback Loop

Page 71: via Cooperative Testing and Analysis

DSE Challenges - Preliminary Study

Reported EMCPs: 44, Reported OCPs: 18
vs.
Real EMCPs: 0, Real OCPs: 5

Page 72: via Cooperative Testing and Analysis

Overview of Covana

[Architecture diagram. Labels: Program; Generated Test Inputs; Forward Symbolic Execution; Runtime Events; Runtime Information; Coverage; Problem Candidate Identification; Problem Candidates; Data Dependence Analysis; Identified Problems]