Software Analytics: Data Analytics for Software Engineering

Preview:

Citation preview

Software Analytics:

Data Analytics for Software EngineeringAchievements and Opportunities

Tao XieDepartment of Computer Science

University of Illinois at Urbana-Champaign, USA

taoxie@illinois.edu

In Collaboration with Microsoft Research; Most Slides Adapted from Slides Co-Presented with

Dongmei Zhang (Microsoft Research, http://research.microsoft.com/en-us/groups/sa/)

New Era…Software itself is changing...

Software Services

How people use software is changing…

Individual Isolated

Not much data/content

generation

How people use software is changing…

How people use software is changing…

Individual Isolated

Not much data/content

generation

How people use software is changing…

Individual

Social

Isolated

Not much data/content

generation

Collaborative

Huge amount of data/artifacts

generated anywhere anytime

How software is built & operated is changing…

How software is built & operated is changing…

Data pervasive

Long product cycle

Experience & gut-feeling

In-lab testing

Informed decision making

Centralized development

Code centric

Debugging in the large

Distributed development

Continuous release

… …

How software is built & operated is changing…

Data pervasive

Long product cycle

Experience & gut-feeling

In-lab testing

Informed decision making

Centralized development

Code centric

Debugging in the large

Distributed development

Continuous release

… …

Software Analytics

Software analytics is to enable software

practitioners to perform data exploration and

analysis in order to obtain insightful and

actionable information for data-driven tasks

around software and services.

Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, and Tao Xie. Software

Analytics as a Learning Case in Practice: Approaches and Experiences. In MALETS 2011

http://research.microsoft.com/en-us/groups/sa/malets11-analytics.pdf

Software Analytics

Software analytics is to enable software

practitioners to perform data exploration and

analysis in order to obtain insightful and

actionable information for data-driven tasks

around software and services.

http://research.microsoft.com/en-us/groups/sa/

http://research.microsoft.com/en-us/news/features/softwareanalytics-052013.aspx

Research topics – the trinity view

Software

Users

Software

Development

Process

Software

System

• Covering different areas of

software domain

• Throughout entire development

cycle

• Enabling practitioners to obtain

insights

Data sources

Runtime traces

Program logs

System events

Perf counters

Usage log

User surveys

Online forum posts

Blog & Twitter

Source code

Bug history

Check-in history

Test cases

Eye tracking

MRI/EMG

Target audience – software practitioners

Target audience – software practitioners

Developer

Tester

Target audience – software practitioners

Developer

Tester

Program Manager

Usability engineer

Designer

Support engineer

Management personnel

Operation engineer

Output – insightful information

• Conveys meaningful and useful understanding or

knowledge towards completing the target task

• Not easily attainable via directly investigating raw data

without aid of analytics technologies

• Examples

– It is easy to count the number of re-opened bugs, but how to

find out the primary reasons for these re-opened bugs?

– When the availability of an online service drops below a

threshold, how to localize the problem?

Output – actionable information

• Enables software practitioners to come up with concrete

solutions towards completing the target task

• Examples

– Why bugs were re-opened?

• A list of bug groups each with the same reason of re-

opening

– Why availability of online services dropped?

• A list of problematic areas with associated confidence values

Research topics & technology pillars

Software

Users

Software

Development

Process

Software

System Vertical

Horizontal

Information Visualization

Data Analysis Algorithms

Large-scale Computing

Connection to practice

• Software Analytics is naturally tied with software

development practice

• Getting real

Connection to practice

• Software Analytics is naturally tied with software

development practice

• Getting real

Real

Data

Real

Problems

Real

Users

Real

Tools

Outline

• Software Engineering for Big Data

• Overview of Software Analytics

• Selected Projects

– StackMine: Performance debugging in the large

– XIAO: Scalable code clone analysis

– SAS: Incident management of online services

– Code Hunt/Pex4Fun: Teaching & learning via interactive gaming

StackMinePerformance debugging in the large

via mining millions of stack traces

http://research.microsoft.com/en-us/groups/sa/stackmine_icse2012.pdf

Performance issues in the real world• One of top user complaints

• Impacting large number of users every day

• High impact on usability and productivity

High Disk I/O High CPU consumption

Performance issues in the real world• One of top user complaints

• Impacting large number of users every day

• High impact on usability and productivity

High Disk I/O High CPU consumption

As modern software systems tend to get more and more complex, given limited time and resource before software release, development-site testing and debugging become more and more insufficient to ensure satisfactory software performance.

Performance debugging in the large

Performance debugging in the large

Trace StorageTrace collection

Network

Performance debugging in the large

Trace StorageTrace collection

Network

Trace analysis

Performance debugging in the large

Trace StorageTrace collection

Bug DatabaseNetwork

Trace analysis

Bug filing

Performance debugging in the large

Trace StorageTrace collection

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

Bug filing

Performance debugging in the large

Pattern Matching

Trace StorageTrace collection

Bug update

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

Bug filing

Performance debugging in the large

Pattern Matching

Trace StorageTrace collection

Bug update

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

Bug filing

Key to issue

discovery

Performance debugging in the large

Pattern Matching

Trace StorageTrace collection

Bug update

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

Bug filing

Key to issue

discovery

Bottleneck of

scalability

Performance debugging in the large

Pattern Matching

Trace StorageTrace collection

Bug update

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

How many issues are

still unknown?

Bug filing

Key to issue

discovery

Bottleneck of

scalability

Performance debugging in the large

Pattern Matching

Trace StorageTrace collection

Bug update

Problematic Pattern

Repository Bug DatabaseNetwork

Trace analysis

How many issues are

still unknown?

Which trace file should I

investigate first?

Bug filing

Key to issue

discovery

Bottleneck of

scalability

Problem definition

Given OS traces collected from tens of thousands

(potentially millions) of users,

help domain experts identify impactful program execution

patterns

(that cause the most impactful underlying performance problems)

with limited time and resource.

Challenges

Combination of expertise• Generic machine learning tools without domain

knowledge guidance do not work well

Highly complex analysis• Numerous program runtime combinations triggering

performance problems

• Multi-layer runtime components from application to

kernel being intertwined

Large-scale trace data• TBs of trace files and increasing

• Millions of events in single trace streamInternet

Technical highlights

• Machine learning for system domain

– Formulate the discovery of problematic execution patterns as

callstack mining & clustering

– Systematic mechanism to incorporate domain knowledge

• Interactive performance analysis system

– Parallel mining infrastructure based on HPC + MPI

– Visualization aided interactive exploration

Impact“We believe that the MSRA tool is highly valuable and much more

efficient for mass trace (100+ traces) analysis. For 1000 traces, we

believe the tool saves us 4-6 weeks of time to create new signatures,

which is quite a significant productivity boost.”

Highly effective new issue discovery on Windows

mini-hang

Continuous impact on future Windows

versions

XIAOScalable code clone analysis

2012

http://research.microsoft.com/jump/175199

XIAO: Code Clone Analysis

• Motivation

– Copy-and-paste is a common developer behavior

– A real tool widely adopted internally and externally

• XIAO enables code clone analysis in the following way

– High tunability

– High scalability

– High compatibility

– High explorability

High tunability – what you tune is what you get

• Intuitive similarity metric: effective control of the

degree of syntactical differences between two code

snippets

for (i = 0; i < n; i ++) {

a ++;

b ++;

c = foo(a, b);

d = bar(a, b, c);

e = a + c; }

for (i = 0; i < n; i ++) {

c = foo(a, b);

a ++;

b ++;

d = bar(a, b, c);

e = a + d;

e ++; }

High scalability• Four-step analysis process

• Easily parallelizable based on source code partition

Pre-processing

Pruning Fine Matching

Coarse Matching

High compatibility

• Compiler independent

• Light-weight built-in parsers for C/C++ and C#

• Open architecture for plug-in parsers to support

different languages

• Easy adoption by product teams

– Different build environment

– Almost zero cost for trial

High explorability

1. Clone navigation based on source tree hierarchy

2. Pivoting of folder level statistics

3. Folder level statistics

4. Clone function list in selected folder

5. Clone function filters

6. Sorting by bug or refactoring potential

7. Tagging

1 2 3 4 5 6

7

1. Block correspondence

2. Block types

3. Block navigation

4. Copying

5. Bug filing

6. Tagging

1

2

3

4

1

6

5

Scenarios & SolutionsQuality gates at milestones• Architecture refactoring

• Code clone clean up

• Bug fixing

Post-release maintenance• Security bug investigation

• Bug investigation for sustained engineering

Development and testing• Checking for similar issues before check-in

• Reference info for code review

• Supporting tool for bug triage

Online code clone search

Offline code clone analysis

Benefiting developer community

Available in Visual Studio 2012 RC

Searching similar snippets

for fixing bug onceFinding refactoring

opportunity

More secure Microsoft products

Code Clone Search service integrated into

workflow of Microsoft Security Response Center

Over 590 million lines of code indexed across

multiple products

Real security issues proactively identified and addressed

Example – MS Security Bulletin MS12-034Combined Security Update for Microsoft Office, Windows, .NET Framework, and Silverlight, published: Tuesday, May 08, 2012

3 publicly disclosed vulnerabilities and seven privately reported involved. Specifically, one is exploited by the Duqu malware to execute arbitrary code when a user opened a malicious Office document

Insufficient bounds check within the font parsing subsystem of win32k.sys

Cloned copy in gdiplus.dll, ogl.dll (office), Silver Light, Windows Journal viewer

Microsoft Technet Blog about this bulletin

However, we wanted to be sure to address the vulnerable code wherever it appeared across the Microsoft code base. To that end, we have been working with Microsoft Research to develop a “Cloned Code Detection” system that we can run for every MSRC case to find any instance of the vulnerable code in any shipping product. This system is the one that found several of the copies of CVE-2011-3402 that we are now addressing with MS12-034.

SASIncident management of online services

http://research.microsoft.com/apps/pubs/?id=202451

Motivation

Incident Management (IcM) is a critical task to

assure service quality

• Online services are increasingly popular & important

• High service quality is the key

Incident Management: Workflow

Detect a

service

issue

Alert On-

Call

Engineers

(OCEs)

Investigate

the problem

Restore

the

service

Fix root cause

via

postmortem

analysis

Incident Management: Characteristics

Shrink-Wrapped

Software Debugging

Root Cause and Fix

Debugger

Controlled

Environment

Online Service

Incident

Management

Workaround

No Debugger

Live Data

Incident Management: Challenges

Large volume and noisy data

Highly complex problem space

No knowledge of entire system

Knowledge not well organized

SAS: Incident management of online services

SAS, developed and deployed to effectively reduce MTTR

(Mean Time To Restore) via automatically analyzing

monitoring data

5

5

Design Principle of SAS

Automating Analysis

Handling Heterogeneity

Accumulating Knowledge

Supporting human-in-the-loop (HITL)

Techniques Overview

• System metrics

– Identifying Incident Beacons

• Transaction logs

– Mining Suspicious Execution Patterns

• Historical incidents

– Mining Historical Workaround Solutions

Industry Impact of SAS

Deployment

• SAS deployed to

worldwide datacenters for

Service X (serving

hundreds of millions of

users) since June 2011

• OCEs now heavily depend

on SAS

Usage

• SAS helped successfully

diagnose ~76% of the

service incidents assisted

with SAS

Code Hunt/Pex4FunTeaching and learning via interactive gaming

http://research.microsoft.com/pubs/189240/icse13see-pex4fun.pdf

http://pex4fun.com/https://www.codehunt.com/

Coding Duels

1,841,880 clicked 'Ask Pex!'

http://pex4fun.com/

Code Hunt: Resigned As Gamehttps://www.codehunt.com/

Behind the Scene

Secret Implementation

class Secret {

public static int Puzzle(int

x) {

if (x <= 0) return 1;

return x * Puzzle(x-1);

}

}

Player Implementation

class Player {

public static int Puzzle(int x) {

return x;

}

}

class Test {

public static void Driver(int x) {

if (Secret.Puzzle(x) != Player.Puzzle(x))

throw new Exception(“Mismatch”);

}

}

behaviorSecret Impl == Player Impl

62

Coding Duels: Fun and Engaging

Iterative gameplay

Adaptive

Personalized

No cheating

Clear winning criterion

Teaching and Learning

Coding Duels for Course Assignments@Grad Software Engineering Course

http://pexforfun.com/gradsofteng

Observed Benefits

• Automatic Grading

• Real-time Feedback (for Both Students and Teachers)

• Fun Learning Experiences

Coding Duel Competition

@ICSE 2011

http://pexforfun.com/icse2011

Example User Feedback

“It really got me *excited*. The part that got me most

is about spreading interest in teaching CS: I do think

that it’s REALLY great for teaching | learning!”

“I used to love the first person shooters and

the satisfaction of blowing away a whole team

of Noobies playing Rainbow Six, but this is far

more fun.”

“I’m afraid I’ll have to constrain myself to spend just an

hour or so a day on this really exciting stuff, as I’m

really stuffed with work.”

Released since 2010

X

Usage ScenariosStudents: proceed through a sequence on puzzles to learn and practice.

Educators: exercise different parts of a curriculum, and track students’ progress

Recruiters: use contests to inspire communities and results to guide hiring

Researchers: mine extensive data in Azure to evaluate how people code and learn

http://taoxie.cs.illinois.edu/publications/icse15jseet-codehunt.pdfICSE 2015 JSEET:

http://taoxie.cs.illinois.edu/publications/icse13see-pex4fun.pdfICSE 2013 SEE:

Conclusion

Software

Users

Software

Development

Process

Software

System Vertical

Horizontal

Information Visualization

Data Analysis Algorithms

Large-scale Computing

Q & A

http://taoxie.cs.illinois.edu/

Contact: taoxie@illinois.edu

Conclusion

Software

Users

Software

Development

Process

Software

System Vertical

Horizontal

Information Visualization

Data Analysis Algorithms

Large-scale Computing

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Code Hunt Programming Game

Leaderboard and Dashboard

Publically visible, updated during the contest

Visible only to the organizer

It’s a game!

iterative gameplay

adaptive

personalized

no cheating

clear winning criterion

code

test cases

Recommended