Transferring Software Testing Tools to Practice. Tao Xie, University of Illinois at Urbana-Champaign. http://taoxie.cs.illinois.edu/ In collaboration with collaborators from Microsoft Research, Tencent, and Salesforce

Transferring Software Testing Tools to Practice (AST 2017 Keynote)


Page 1: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Transferring Software Testing Tools to Practice

Tao Xie

University of Illinois at Urbana-Champaign

http://taoxie.cs.illinois.edu/

In collaboration with collaborators from Microsoft Research, Tencent, and Salesforce

Page 2: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

(Automated) Test Generation

• Human
  – Expensive, incomplete, …

• Brute force
  – Pairwise, predefined data, etc.

• Tool automation!

Page 3: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Getting Real to Produce Practice Impact


Making real impact

Building real technologies

Solving real problems

Software testing tools are naturally tied with software development practice

Page 4: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Dynamic Symbolic Execution

Code to generate inputs for:

    void CoverMe(int[] a) {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

Exploration loop: Choose next path → Solve → Execute & Monitor

Constraints to solve | Data | Observed constraints
(none; initial input) | null | a==null
a!=null | {} | a!=null && !(a.Length>0)
a!=null && a.Length>0 | {0} | a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890 | {123…} | a!=null && a.Length>0 && a[0]==1234567890

Each constraint set to solve ends with the negated condition of a previously observed path.

[Figure: execution tree with branch nodes a==null, a.Length>0, a[0]==123…, each with T/F outcomes]

Done: There is no path left.

[DART: Godefroid et al. PLDI’05]
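The exploration loop on this slide can be sketched in Python for the CoverMe example. This is a toy illustration only: `cover_me` is a hand-instrumented port of the C# method, and `solve` is a hypothetical stand-in for a real constraint solver that only understands the three conditions of this example. It is not how Pex or DART are implemented.

```python
MAGIC = 1234567890

def cover_me(a, trace):
    """Python port of CoverMe that records every branch outcome in `trace`."""
    trace.append(("a == null", a is None))
    if a is None:
        return
    trace.append(("a.Length > 0", len(a) > 0))
    if len(a) > 0:
        trace.append(("a[0] == MAGIC", a[0] == MAGIC))
        if a[0] == MAGIC:
            raise RuntimeError("bug")

def solve(path):
    """Toy stand-in for a solver: build an input that satisfies `path`."""
    want = dict(path)
    if want.get("a == null", True):       # unconstrained: start with null
        return None
    if not want.get("a.Length > 0", False):
        return []
    return [MAGIC] if want.get("a[0] == MAGIC", False) else [0]

def explore():
    """Choose next path -> solve -> execute & monitor, until no path is left."""
    worklist, seen, inputs = [()], set(), []
    while worklist:
        target = worklist.pop(0)
        a = solve(target)
        trace = []
        try:
            cover_me(a, trace)
        except RuntimeError:
            pass                           # the "bug" path was reached
        if tuple(trace) in seen:
            continue
        seen.add(tuple(trace))
        inputs.append(a)
        # Negate each condition observed beyond the targeted prefix.
        for i in range(len(target), len(trace)):
            cond, taken = trace[i]
            worklist.append(tuple(trace[:i]) + ((cond, not taken),))
    return inputs

print(explore())
```

The generated inputs follow the Data column of the slide: null, an empty array, {0}, and the array holding the magic value.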

Page 5: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Microsoft Research Automated Test Generation Tool: Pex & Relatives

• Pex (released in May 2008)

http://research.microsoft.com/pex/

Page 6: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Microsoft Research Automated Test Generation Tool: Pex & Relatives

• Pex (released in May 2008)

• 30K downloads after 20 months

• Active user community: 1.4K forum posts during ~3 years

• Shipped with Visual Studio 2015 as IntelliTest

• Moles (released in Sept 2009)
  – Shipped with Visual Studio 2012 as Fakes

• “Provide Microsoft Fakes w/ all Visual Studio editions” got 1.5K community votes

http://research.microsoft.com/pex/

Page 7: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Explosion of Search Space

There are decision procedures for individual path conditions, but…

• Number of potential paths grows exponentially with number of branches

• Reachable code not known initially

• Without guidance, same loop might be unfolded forever

Fitnex search strategy [Xie et al. DSN 09]

http://taoxie.cs.illinois.edu/publications/dsn09-fitnex.pdf

Page 8: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

DSE Example

    public bool TestLoop(int x, int[] y) {
        if (x == 90) {
            for (int i = 0; i < y.Length; i++)
                if (y[i] == 15)
                    x++;
            if (x == 110)
                return true;
        }
        return false;
    }

TestLoop(0, {0})

Path condition: !(x == 90)

↓ New path condition: (x == 90)

↓ New test input: TestLoop(90, {0})

Page 9: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

DSE Example


TestLoop(90, {0})

Path condition: (x == 90) && !(y[0] == 15) && !(x == 110)

↓ New path condition: (x == 90) && (y[0] == 15)

↓ New test input: TestLoop(90, {15})

Page 10: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Challenge in DSE


TestLoop(90, {15})

Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110)

↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110)

↓ New test input: No solution!?

Page 11: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

A Closer Look


TestLoop(90, {15})

Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110)

↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)

Expand array size

Page 12: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

A Closer Look


TestLoop(90, {15})

We can have infinite paths!

Manual analysis shows that at least 20 loop iterations are needed to cover the target branch

Exploring all paths up to 20 loop iterations is infeasible: 2^20 paths
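The path count follows because every loop iteration either takes or skips the y[i] == 15 branch. A small Python check of that arithmetic is sketched below; `count_paths` is a hypothetical helper that enumerates branch-outcome sequences by brute force for small array lengths, and it is not part of Pex.

```python
from itertools import product

def count_paths(n):
    """Count distinct branch-outcome sequences of the loop for arrays of length n."""
    outcomes = {tuple(v == 15 for v in y) for y in product((0, 15), repeat=n)}
    return len(outcomes)

# Small cases confirm the 2^n pattern; n = 20 is computed directly.
for n in range(1, 5):
    print(n, count_paths(n), 2 ** n)
print(20, 2 ** 20)
```

Since x starts at 90 and grows by at most one per iteration, reaching x == 110 needs 110 - 90 = 20 iterations that all take the y[i] == 15 branch, which is one path among the 2^20.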

Page 13: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Fitnex: Fitness-Guided Exploration


Key observations: with respect to the coverage target

• not all paths are equally promising for branch-node flipping

• not all branch nodes are equally promising to flip

• Our solution:

– Prefer to flip branch nodes on the most promising paths

– Prefer to flip the most promising branch nodes on paths

– Fitness function to measure “promising” extents

TestLoop(90, {15, 0})

TestLoop(90, {15, 15})

[Xie et al. DSN 2009]

http://taoxie.cs.illinois.edu/publications/dsn09-fitnex.pdf

Page 14: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Fitness Function

• FF computes fitness value (distance between the current state and the goal state)

• Search tries to minimize fitness value

[Tracey et al. 98, Liu et al. 05, …]

Page 15: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Fitness Function for (x == 110)


Fitness function: |110 – x |

Page 16: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Compute Fitness Values for Paths


Fitness function: |110 – x|

(x, y) | Fitness value
(90, {0}) | 20
(90, {15}) | 19
(90, {15, 0}) | 19
(90, {15, 15}) | 18
(90, {15, 15, 0}) | 18
(90, {15, 15, 15}) | 17
(90, {15, 15, 15, 0}) | 17
(90, {15, 15, 15, 15}) | 16
(90, {15, 15, 15, 15, 0}) | 16
(90, {15, 15, 15, 15, 15}) | 15
…

Give preference to flipping paths with better fitness values.

We still need to address which branch node to flip on paths …
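The fitness values listed for TestLoop can be reproduced with a short Python sketch. The function names are hypothetical, and returning infinity when the target branch is not reached is an assumption of this sketch, not something the slides specify.

```python
def x_at_target(x, y):
    """Value of x when `x == 110` is evaluated, or None if that branch is not reached."""
    if x != 90:
        return None
    for value in y:
        if value == 15:
            x += 1
    return x

def fitness(x, y):
    """Distance |110 - x| to the target branch; lower is better."""
    reached = x_at_target(x, y)
    if reached is None:
        return float("inf")   # assumption: unreached target gets the worst value
    return abs(110 - reached)

for y in ([0], [15], [15, 0], [15, 15], [15, 15, 0]):
    print((90, y), fitness(90, y))
```

An input with twenty elements equal to 15 drives the fitness value to 0, which is the input that covers the target branch.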

Page 17: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Compute Fitness Gains for Branches


Fitness function: |110 – x|

(x, y) | Flipped branch | Fitness value
(90, {0}) | | 20
(90, {15}) | flip b4 | 19
(90, {15, 0}) | flip b2 | 19
(90, {15, 15}) | flip b4 | 18
(90, {15, 15, 0}) | flip b2 | 18
(90, {15, 15, 15}) | flip b4 | 17
(90, {15, 15, 15, 0}) | flip b2 | 17
(90, {15, 15, 15, 15}) | flip b4 | 16
(90, {15, 15, 15, 15, 0}) | flip b2 | 16
(90, {15, 15, 15, 15, 15}) | flip b4 | 15
…

Branch b1: i < y.Length
Branch b2: i >= y.Length
Branch b3: y[i] == 15
Branch b4: y[i] != 15

• Flipping branch b4 (b3) gives us an average fitness gain (loss) of 1 (–1)

• Flipping branch b2 (b1) gives us an average fitness gain (loss) of 0

Page 18: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Compute Fitness Gain for Branches cont.

• For a flipped node leading to Fnew, find out the old fitness value Fold before flipping

• Assign Fitness Gain (Fold – Fnew) for the branch of the flipped node

• Assign Fitness Gain (Fnew – Fold) for the other branch of the same branch node

• Compute the average fitness gain for each branch over time
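A minimal Python sketch of this bookkeeping is given below. The class and method names are hypothetical; the sketch only mirrors the rule stated above (gain Fold – Fnew for the branch of the flipped node, Fnew – Fold for its sibling, averaged over time).

```python
from collections import defaultdict

class FitnessGains:
    """Running average of fitness gains per branch."""

    def __init__(self):
        self._total = defaultdict(float)
        self._count = defaultdict(int)

    def _add(self, branch, gain):
        self._total[branch] += gain
        self._count[branch] += 1

    def record_flip(self, flipped_branch, other_branch, f_old, f_new):
        # The branch of the flipped node gets (Fold - Fnew) ...
        self._add(flipped_branch, f_old - f_new)
        # ... and the other branch of the same node gets (Fnew - Fold).
        self._add(other_branch, f_new - f_old)

    def average(self, branch):
        if self._count[branch] == 0:
            return 0.0
        return self._total[branch] / self._count[branch]

gains = FitnessGains()
gains.record_flip("b4", "b3", f_old=20, f_new=19)  # (90,{0}) -> (90,{15})
gains.record_flip("b2", "b1", f_old=19, f_new=19)  # (90,{15}) -> (90,{15,0})
gains.record_flip("b4", "b3", f_old=19, f_new=18)  # (90,{15,0}) -> (90,{15,15})
print(gains.average("b4"), gains.average("b3"), gains.average("b2"))
```

With the TestLoop flips from the previous slide, b4 averages a gain of 1, b3 a loss of 1, and b2 and b1 stay at 0.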

Page 19: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Search Frontier

• Each branch node that is a candidate for flipping is prioritized based on its composite fitness value:

  – (Fitness value of node – Fitness gain of its branch)

• Select first the candidate with the best composite fitness value

To avoid local optima or biases, the fitness-guided strategy is integrated with Pex’s fairness search strategies
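The prioritization rule can be sketched in Python as follows. This is an illustration under two assumptions: "best" is taken to mean the lowest composite value (since the search minimizes fitness), and the integration with Pex's fairness strategies is left out. The function names and the candidate tuples are hypothetical.

```python
def composite(node_fitness, branch_gain):
    """Composite fitness value: fitness value of node minus fitness gain of its branch."""
    return node_fitness - branch_gain

def choose_next(candidates, average_gain):
    """Pick the candidate (node_id, node_fitness, branch) with the lowest composite value."""
    return min(
        candidates,
        key=lambda c: composite(c[1], average_gain.get(c[2], 0.0)),
    )

average_gain = {"b2": 0.0, "b4": 1.0}
candidates = [
    ("n1", 19, "b2"),  # composite 19
    ("n2", 19, "b4"),  # composite 18
    ("n3", 20, "b4"),  # composite 19
]
print(choose_next(candidates, average_gain))
```

Here the node on a good path whose branch has historically produced gains (n2) is flipped first.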

Page 20: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Microsoft Research Automated Test Generation Tool: Pex & Relatives


What went on behind the scenes to build a user base?

See more @ ASE 2014 Experience Report: http://taoxie.cs.illinois.edu/publications/ase14-pexexperiences.pdf

Page 21: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 1. Evolving Vision

    void TestAdd(ArrayList a, object o) {
        Assume.IsTrue(a != null);
        int i = a.Count;
        a.Add(o);
        Assert.IsTrue(a[i] == o);
    }

Parameterized Unit Tests Supported by Pex

Moles/Fakes

IntelliTest

Pex4Fun/Code Hunt

• Surrounding (Moles/Fakes)

• Retargeting (Pex4Fun/Code Hunt)

• Simplifying (IntelliTest)

Page 22: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 2. Landing the First Customer

• Developer/manager: “Who took a dependency on your tool?”

• Pex team: “Do you want to be the first?”

• Developer/manager: “I love your tool but no.”

Tool Adoption by (Mass) Target Users

Tool Shipping with Visual Studio

Macro Perspective

Micro Perspective

Page 23: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 2. Landing the First Customer (cont.)

• Tackle real-world challenges
  – Demo Pex on real-world cases (e.g., ResourceReader) beyond textbook examples
  – Demo Moles addressing important scenarios well (e.g., unit testing SharePoint code)

• Address technical/non-technical barriers for tech adoption in industry
  – Offer tool license not prohibiting commercial use

• Incremental shipping
  – Ship experimental reduced versions and gather feedback
  – Find early adopters

• Provide quantitative info (reflecting tool’s importance or benefit extent)
  – Not all downloads are equal! (e.g., those from Fortune 500)

Page 24: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 3. Human Factors – Generated Data Consumed by Human

• Developer: “Code digger generates a lot of “\0” strings as input. I can’t find a way to create such a string via my own C# code. Could any one show me a C# snippet? I meant zero terminated string.”

• Pex team: “In C#, a \0 in a string does not mean zero-termination. It’s just yet another character in the string (a very simple character where all bits are zero), and you can create as Pex shows the value: “\0”.”

• Developer: “Your tool generated “\0””

• Pex team: “What did you expect?”

• Developer: “Marc.”

Page 25: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 3. Human Factors – Generated Name Consumed by Human

• Developer: “Your tool generated a test called Capitalize01. I don’t like it.”

• Pex team: “What did you expect?”

• Developer: “Capitalize_Should_Fail_When_Value_Is_Null.”

Page 26: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 3. Human Factors – Generated Results Consumed by Human

Object Creation messages suppressed (e.g., Covana by Xiao et al. [ICSE’11])

Exception Tree View

Exploration Tree View

Exploration Results View

Page 27: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 4. Misconceptions

• Someone advertises: “Simply one mouse click and then everything would work just perfectly”
  – Often need environment isolation w/ Moles/Fakes

• “One mouse click, a test generation tool would detect all or most kinds of faults in the code under test”
  – Developer: “Your tool only finds null references.”
  – Pex team: “Did you write any assertions?”
  – Developer: “Assertion???”

• “I do not need test generation; I already practice unit testing (and/or TDD). Test generation does not fit into the TDD process”

Page 28: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 5. Embracing Feedback

Gathered feedback from target tool users:

• Directly, e.g., via
  – MSDN Pex forum, tech support, outreach to MS engineers and .NET user groups, outreach to external early adopters

• Indirectly, e.g., via
  – interactions with the Visual Studio team (a tool vendor to its huge user base)

• Lack of basic test isolation in practice => Moles
  – Our suggestion of refactoring code for testability faced strong resistance in practice

• Observation at Agile 2008 conference
  – Large focus on mock objects and tool support for mock objects

Page 29: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Gathering Feedback

Early drops on VS Code Gallery of the Pex extension and the Code Digger extension for Visual Studio 2013

Visual Studio MVP Community

Internal dogfooding by teams within Microsoft

Uservoice feedback (> 20 ideas)

Item | Votes
Add support for NUnit and xUnit.net | 304
Add Support for VB.NET | 336
Make it available with Visual Studio Professional | 319
Enable IntelliTest for 64 bit projects | 72

Stack Overflow: active forum with questions tagged with "Pex" or "IntelliTest"

Facebook: https://www.facebook.com/PexMoles/

Twitter: https://twitter.com/pexandmoles

Page 30: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Pex to IntelliTest - Adds and Cuts

• Additions
  – Externalizing Test Framework support

• Cuts
  – Visual Basic Support
  – CommandLine Support
  – FixIt
  – Code Contract integration
  – Pex Explorer
  – Reporting

Page 31: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Pex to IntelliTest - Shipped!

NUnit Extension – 37,378 installs

xUnit.net Extension – 17,438 installs

(as of May 19, 2017)

From http://bbcode.codeplex.com/

Page 32: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Collaboration with Academia

• Win-win collaboration model
  – Win (Industry Lab): longer-term research innovation, manpower, research impacts, …
  – Win (University): powerful infrastructure, relevant/important problems in practice, both research and industry impacts, …

• Hosting academic visitors
  – Faculty visits, e.g., Fitnex [Xie et al. DSN’09], Pex4Fun [Tillmann et al. ICSE’13 SEE]
  – Student internships, e.g., FloPSy [Lakhotia et al. ICTSS’10], DyGen [Thummalapenta et al. TAP’10]

http://research.microsoft.com/pex/community.aspx#publications

Page 33: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Engaging Broader Academic Communities

• Academic research inspiring internal technology development
  – Reggae [Li et al. ASE’09], Rex [Veanes et al. ICST’10]
  – MSeqGen [Thummalapenta et al. FSE’09], DyGen [Thummalapenta et al. TAP’10]
  – …

• Academic research exploring longer-term research frontiers
  – DySy [Csallner et al. ICSE’08]
  – Seeker [Thummalapenta et al. OOPSLA’11]
  – Covana [Xiao et al. ICSE’11]
  – SEViz [Honfi et al. ICST’15]
  – Pex + Code Contracts [Christakis et al. ICSE’16]
  – …

http://research.microsoft.com/pex/community.aspx#publications

Page 34: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Going from Pex to Coding Duels

Secret Implementation

    class Secret {
        public static int Puzzle(int x) {
            if (x <= 0) return 1;
            return x * Puzzle(x-1);
        }
    }

Player Implementation

    class Player {
        public static int Puzzle(int x) {
            return x;
        }
    }

    class Test {
        public static void Driver(int x) {
            if (Secret.Puzzle(x) != Player.Puzzle(x))
                throw new Exception("Mismatch");
        }
    }

Behavior: Secret Impl == Player Impl
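The duel driver can be sketched in Python. In Code Hunt the mismatching inputs come from Pex's symbolic execution; here a brute-force scan over a small, hypothetical candidate range stands in for it, so this is only an illustration of the comparison idea.

```python
def secret_puzzle(x):
    """Port of Secret.Puzzle from the slide (factorial for positive x)."""
    if x <= 0:
        return 1
    return x * secret_puzzle(x - 1)

def player_puzzle(x):
    """Port of the player's current attempt."""
    return x

def find_mismatches(secret, player, candidates):
    """Return (input, secret result, player result) for each input where behaviors differ."""
    return [(x, secret(x), player(x)) for x in candidates if secret(x) != player(x)]

for x, expected, actual in find_mismatches(secret_puzzle, player_puzzle, range(-2, 5)):
    print(f"x={x}: secret={expected}, player={actual}")
```

The player's attempt agrees with the secret only for x = 1 and x = 2 in this range; every other input is a hint that the two implementations do not yet behave the same.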

Page 35: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

About Code Hunt

Blogs and sites (April 29, 2015; May 15, 2014):

• www.codehunt.com
• research.microsoft.com/codehunt
• research.microsoft.com/codehuntcommunity
• Data site on GitHub

Powerful and versatile platform for coding as a game

Built on the symbolic execution of Pex

Addressing different audiences – students, developers, researchers

Data available in the cloud

Unique in working from unit tests not specifications

Open sourced data available for analysis

Over 350,000 players as of August 2016 (since mid 2014)

www.codehunt.com

Page 36: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

It’s a game!

1. iterative gameplay
2. adaptive
3. personalized
4. no cheating
5. clear winning criterion

Score is based on

• how many puzzles solved,

• how well solved, and

• when solved

code

test cases

Page 37: Transferring Software Testing Tools to Practice (AST 2017 Keynote)
Page 38: Transferring Software Testing Tools to Practice (AST 2017 Keynote)
Page 39: Transferring Software Testing Tools to Practice (AST 2017 Keynote)
Page 40: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Lesson 6: Following the Data

• Java is provided by a source-to-source translator
  – We watched which features players used and what errors they made to concentrate translation efforts for maximum effect

• The bank of over 400 puzzles records difficulty levels
  – These are updated by crowdsourcing users’ attempts

• The vast number of attempts at solving puzzles gives reliable data as to where programmers have difficulty – see open sourced data

For ImCupSept: 257 users x 24 puzzles x approx. 10 tries = about 13,000 programs

ICSE 2015 JSEET: http://taoxie.cs.illinois.edu/publications/icse15jseet-codehunt.pdf

ICSE 2013 SEE: http://taoxie.cs.illinois.edu/publications/icse13see-pex4fun.pdf

Page 41: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Test Generation for Mobile Apps

When Monkey and WeChat Meet …

FSE 2016 Industry Track: http://taoxie.cs.illinois.edu/publications/fse16industry-wechat.pdf

ICSE 2017 SEIP Track: http://taoxie.cs.illinois.edu/publications/icse17seip-wechat.pdf

Page 42: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Motivation

Choudhary et al. [ASE’15]: Do we have good-enough tools to test Android apps?
• Evaluated six research tools and Monkey on 68 open-source apps
• Monkey tool outperformed all six research tools

Their study can be further extended
• No industrial-strength Android app was studied
• No demonstration on whether and how techniques can further improve Monkey under industrial settings

Page 43: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Challenges on Testing Industrial Mobile Apps

Challenges for code coverage measurement:
• Requiring app’s source code
• Industrial-strength Android apps can cause the 64K reference limit exception during instrumentation

Challenges for applicability:
• Scalability on testing apps with large codebases
• OS compatibility of testing tool

Page 44: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

WeChat Overview

WeChat = WhatsApp + Facebook + Instagram + PayPal + Uber…
• 846 million monthly active users
• Daily numbers: dozens of billions of messages sent, hundreds of millions of photos uploaded, hundreds of millions of payment transactions executed

WeChat backend:
• 2K+ microservices running on 40K+ servers
• 10M queries per second during Chinese New Year’s Eve

Large codebase on WeChat Android

Page 45: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Monkey: Experimental setup

Experiment setup
• Set Monkey to fire random events every 500 milliseconds
• Run Monkey on WeChat 5 times independently
• Run Monkey for 18 hours each time (2 hours without login)

Evaluation metrics
• Line coverage
• Activity coverage

Page 46: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Monkey: Coverage Result Findings

Finding 1: Monkey has low line coverage (19.5%) and low activity coverage (10.3%).

Finding 2: Monkey distributes exploration time unevenly across activities.

[Figure: coverage percentage (0 to 35) vs. exploration time (1 to 18 hours) for Monkey's line coverage and activity coverage; the manual login point is marked]

Page 47: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Monkey: Exploration time challenges

Widget obliviousness: It is difficult to generate events on small GUI elements

State obliviousness: Monkey explores the same two activities repeatedly without contributing to new code coverage

[Figure: SelectContactUI and ContactLabelUI screens, with transitions labeled "To another activity" and "Back to other activities"]

Page 48: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

New Approach

Design goals
• Have direct access to UI elements on activity under test
• Allocate more exploration time towards new GUI states

Techniques
• Use UIAutomator framework to gain UI layout tree
• Abstract GUI states to guide firing of state-changing events
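One way to picture the state-abstraction idea is the Python sketch below. Everything in it is an assumption made for illustration: the abstraction (activity name plus the set of clickable widget ids), the dictionary shape of a widget, and the `Explorer` class are hypothetical and are not taken from the WeChat tool or from the UIAutomator API.

```python
def abstract_state(activity, widgets):
    """Abstract a GUI state as the activity name plus its clickable widget ids."""
    clickable = frozenset(w["id"] for w in widgets if w.get("clickable"))
    return (activity, clickable)

class Explorer:
    """Prefer events that have not been fired yet in the current abstract state."""

    def __init__(self):
        self.fired = {}  # abstract state -> widget ids already fired there

    def next_event(self, activity, widgets):
        state = abstract_state(activity, widgets)
        done = self.fired.setdefault(state, set())
        for widget_id in sorted(state[1]):
            if widget_id not in done:
                done.add(widget_id)
                return widget_id
        return None  # state exhausted; a caller could navigate elsewhere

explorer = Explorer()
layout = [
    {"id": "add_label", "clickable": True},
    {"id": "contact_row", "clickable": True},
    {"id": "title_text", "clickable": False},
]
print(explorer.next_event("SelectContactUI", layout))
print(explorer.next_event("SelectContactUI", layout))
print(explorer.next_event("SelectContactUI", layout))
```

Because the explorer remembers what it already fired per abstract state, it stops bouncing between the same two activities once their events are exhausted, which is the behavior the design goals ask for.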

Page 49: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

New Approach: Coverage result

The new approach covers 11.1 percentage points more lines and 18.4 percentage points more activities than Monkey does!

Page 50: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Categorization of Not-covered Activities

Dead activity examples: unreleased features, activities for older devices

Insufficient account state examples: requiring financial information, account history, enabled features

Page 51: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Example Not-covered Activities

Default disabled features of WeChat

Activity for searching saved favorite history

Activity for showing details of search results

Page 52: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Substring Hole Analysis

• Substring hole: a set of not-covered activities whose names share a substring

• “Wallet”-related activities were identified as largely not covered

• Manual testing of wallet-related activities reduced the “wallet” substring hole from 85.9% (61 / 71) to about 22.5% (16 / 71)
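The hole ratio can be computed as sketched below in Python. The helper `substring_hole` and the activity names are hypothetical; only the 61 / 71 and 16 / 71 counts come from the slide.

```python
def substring_hole(activities, covered, substring):
    """Return (not covered, total) among activities whose name contains `substring`."""
    matching = [a for a in activities if substring.lower() in a.lower()]
    missed = [a for a in matching if a not in covered]
    return len(missed), len(matching)

# Hypothetical activity names, for illustration only.
activities = ["WalletBalanceUI", "WalletPayUI", "WalletBindCardUI", "ChatUI"]
covered = {"WalletPayUI", "ChatUI"}
missed, total = substring_hole(activities, covered, "wallet")
print(f"wallet hole: {missed}/{total} = {100 * missed / total:.1f}%")

# The percentages reported on the slide follow from its counts.
print(round(100 * 61 / 71, 1), round(100 * 16 / 71, 1))
```

Grouping not-covered activities by a shared name fragment points testers at a whole feature area (here, wallet) rather than at individual activities.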


Page 53: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Learning for Test Prioritization: An Industrial Case Study

Benjamin Busjaeger, Salesforce

Tao Xie, University of Illinois, Urbana-Champaign

FSE 2016 Industry Track

http://taoxie.cs.illinois.edu/publications/fse16industry-learning.pdf

Page 54: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Large Scale Continuous Integration

Main repository ...

2K+ Modules

600K+ Tests

2K+ Engineers

500+ Changes/Day

350K+ Source files

Page 55: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Motivation

 | Before Commit | After Commit
Objective | Reject faulty changes | Detect test failures as quickly as possible
Current | Run hand-picked tests | Parallelize (8000 concurrent VMs) & batch (1-1000 changes per run)
Problem | Too complex for human | Feedback time constrained by capacity & batching complicates fault assignment
Desired | Select top-k tests likely to fail | Prioritize all tests by likelihood of failure

Page 56: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Insight: Need Multiple Existing Techniques

● Heterogeneous languages: Java, PL/SQL, JavaScript, Apex, etc.

● Non-code artifacts: metadata and configuration

● New/recently-failed tests more likely to fail

Techniques to combine:
• Test code coverage of change
• Textual similarity between test and change
• Test age and recent failures

Page 57: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Insight: Need Scalable Techniques

● Complex integration tests: concurrent execution

● Large data volume: 500+ changes, 50M+ test runs, 20TB+ results per day

→ Our approach: Periodically collect coarse-grained measurements
• Test code coverage of change
• Textual similarity between test and change
• Test age and recent failures

Page 58: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

New Approach: Learning for Test Prioritization

Change → features (test code coverage of change, textual similarity between test and change, test age and recent failures) → model for learning from past test results → Test Ranking

→ Implementation currently in pilot use at Salesforce
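The pipeline from features to a test ranking can be illustrated with a small Python sketch. The slides do not say which model is used, so the plain logistic regression below is an assumption chosen for brevity, and all feature values, test names, and function names are hypothetical.

```python
import math

def predict(weights, features):
    """Estimated failure likelihood for one test; the last weight is the bias."""
    z = weights[-1] + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights on past test results with plain SGD."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for features, failed in zip(rows, labels):
            error = predict(weights, features) - failed
            for i, f in enumerate(features):
                weights[i] -= lr * error * f
            weights[-1] -= lr * error
    return weights

def prioritize(weights, tests):
    """Order test names by decreasing predicted likelihood of failure."""
    return sorted(tests, key=lambda name: predict(weights, tests[name]), reverse=True)

# Features per (change, test) pair: coverage of change, text similarity, recent failure.
history = [(1.0, 0.8, 1.0), (0.9, 0.6, 0.0), (0.0, 0.1, 0.0),
           (0.1, 0.2, 0.0), (0.0, 0.7, 1.0), (0.2, 0.0, 0.0)]
failed = [1, 1, 0, 0, 1, 0]
weights = train(history, failed)

new_change = {"t_cold": (0.0, 0.0, 0.0),
              "t_hot": (0.9, 0.9, 1.0),
              "t_warm": (0.5, 0.3, 0.0)}
print(prioritize(weights, new_change))
```

Before commit, the top of this ranking would be the selected subset; after commit, the full ranking gives the execution order.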

Page 59: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Empirical Study: Setup

● Test results of 45K tests for ~3 month period

● In this period, 711 changes with ≥1 failure
  • 440 for training
  • 271 for testing

● Collected once for each test:
  • Test code coverage
  • Test text content

● Collected continuously:
  • Test age and recent failures

Page 60: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Results: Test selection (before commit)

● New approach achieves highest average recall at all cutoff points
  • 50% of failures detected with top 0.2% of tests
  • 75% of failures detected with top 3% of tests
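Recall at a cutoff point can be computed as in the Python sketch below. The function name, the rounding of the cutoff with a ceiling, and the sample ranking are assumptions for illustration; the study's exact evaluation procedure is not given on the slide.

```python
import math

def recall_at(ranking, failing, fraction):
    """Share of failing tests found within the top `fraction` of the ranking."""
    k = math.ceil(fraction * len(ranking))
    selected = set(ranking[:k])
    return len(selected & failing) / len(failing)

ranking = ["t3", "t1", "t4", "t5"]   # hypothetical ranked tests
failing = {"t1", "t5"}               # hypothetical failures for one change
for fraction in (0.25, 0.5, 1.0):
    print(fraction, recall_at(ranking, failing, fraction))
```

Read this way, "50% of failures detected with top 0.2%" means the recall reaches 0.5 when only the top 0.2% of ranked tests are selected.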

Page 61: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Results: Test prioritization (after commit)

● New approach achieves highest APFD with least variance
  • Median: 85%
  • Average: 99%
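APFD (average percentage of faults detected) is commonly defined as 1 – (TF1 + … + TFm) / (n·m) + 1 / (2n), where n is the number of tests, m the number of faults, and TFi the position of the first test that reveals fault i. The Python sketch below applies that standard formula under the simplifying assumption that every failing test counts as its own fault; the sample ranking is hypothetical.

```python
def apfd(ranking, failing):
    """APFD of a ranking, treating each failing test as a separate fault."""
    n, m = len(ranking), len(failing)
    positions = [ranking.index(test) + 1 for test in failing]  # 1-based ranks
    return 1.0 - sum(positions) / (n * m) + 1.0 / (2 * n)

failing = {"t1", "t5"}
print(apfd(["t3", "t1", "t4", "t2", "t5"], failing))  # failures at ranks 2 and 5
print(apfd(["t1", "t5", "t3", "t4", "t2"], failing))  # failures ranked first
```

Moving failing tests toward the front of the ranking raises APFD, which is why it serves as the after-commit prioritization metric.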

Page 62: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Summary: Learning for Test Prioritization

● Main insights gained in conducting test prioritization in industry

● Novel learning-based approach to test prioritization

● Implementation currently in pilot use at Salesforce

● Empirical evaluation using a large Salesforce dataset

Page 63: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Summary

• Pex practice impact by surrounding, retargeting, simplifying

• Lessons in transferring tools to practice
  1. Evolving vision
  2. Landing your first customer
  3. Human factors
  4. Misconceptions
  5. Embracing feedback

• Collaboration/engagement with academia

• Educational impact and lesson learned
  6. Following the data

• Important to conduct research on testing industrial apps (e.g., WeChat)
  – Beyond open-source ones

http://research.microsoft.com/pex/

http://www.codehunt.com/

Page 64: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Getting Real to Produce Practice Impact


Making real impact

Building real technologies

Solving real problems

Software testing/analysis tools are naturally tied with software development practice

Page 65: Transferring Software Testing Tools to Practice (AST 2017 Keynote)

Thank you! Questions?


This material is based upon work supported by the Maryland Procurement Office under Contract No. H98230-14-C-0141. This work is also supported in part by the National Science Foundation under grants no. CCF-1409423, CNS-1434582, CNS-1513939, and CNS-1564274.
