Transcript
Page 1: Program Slicing on Java byte-code  for Locating Functional Concerns

Department of Computer Science, Graduate School of Information Science & Technology, Osaka University

Program Slicing on Java byte-code for Locating Functional Concerns

Takashi Ishio†, Ryusuke Niitani†, Gail Murphy‡, Katsuro Inoue†

† Osaka University, Japan
‡ University of British Columbia, Canada

Page 2: Program Slicing on Java byte-code  for Locating Functional Concerns

Concern Location

A functional concern is code that helps fulfill a functional requirement. A software maintenance task usually focuses on a functional concern.

Concern location comprises “Search and Explore.”
  Search: find “interesting” methods using grep or other feature location tools.
  Explore: examine the interaction among the methods using a call graph, class hierarchy tree or cross-references.

Page 3: Program Slicing on Java byte-code  for Locating Functional Concerns

Example: Autosave function in jEdit

jEdit periodically saves the contents of the text area. A user can specify the frequency.

We can easily find the Autosave class, the Buffer.autosave() method and the BufferIORequest.autosave() method.

How do these classes and methods interact?

Page 4: Program Slicing on Java byte-code  for Locating Functional Concerns

Exploring Interaction among methods

Important information: control flow and data flow.
  Which method triggers the autosave function?
  Which class holds the necessary data (e.g. the filename)?
  How does a method save the contents to a text file?

We have to read the following classes:

Autosave, Buffer, BufferIORequest, PerspectiveManager, VFSManager, FileVFS …

Page 5: Program Slicing on Java byte-code  for Locating Functional Concerns

Automated Concern Location

We are trying to extract a concern graph from code fragments specified by a developer.

Our approach is based on program slicing.

Our tool is based on Soot, a Java bytecode analysis framework.

Workflow: code fragments related to a functionality → Program Slicing with Heuristics → a program slice → Slice-to-Concern-Graph Translation → a concern graph

Page 6: Program Slicing on Java byte-code  for Locating Functional Concerns

Autosave concern graph

Input = Autosave.*(), Buffer.autosave(), BufferIORequest.autosave()

Page 7: Program Slicing on Java byte-code  for Locating Functional Concerns

Program Slicing

Slicing extracts statements related to criteria statements specified by a user.

1. A program P is converted to a program dependence graph (PDG).
   vertices: statements in P
   edges: control/data dependence relations

2. A user specifies “slicing criteria” statements in P. The statements are translated into “criteria vertices” in the PDG.

3. A program slice, a set of statements that affect or depend on criteria, is extracted by graph traversal from criteria vertices.

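1 i = 3;
2 if (a > 0) {
3   print i;
4 }

[Figure: PDG of the fragment. Statement 3 (use of i) is data dependent on statement 1 (definition of i) and control dependent on statement 2. Label: <3, i>]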
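The three steps can be sketched as a worklist traversal over a dependence graph. The following is a minimal illustration, not the authors' implementation: the `Pdg` class and its method names are hypothetical, vertices are simply the statement numbers of the four-line fragment, and only the backward direction (statements that affect the criteria) is shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical PDG: a vertex is a statement number, and an edge records that
// one statement is control- or data-dependent on another.
final class Pdg {
    private final Map<Integer, List<Integer>> dependsOn = new HashMap<>();

    void addDependence(int dependent, int source) {
        dependsOn.computeIfAbsent(dependent, k -> new ArrayList<>()).add(source);
    }

    // Backward slice: every statement reachable from the criteria vertices by
    // following dependence edges towards the statements they depend on.
    Set<Integer> backwardSlice(Set<Integer> criteria) {
        Set<Integer> slice = new TreeSet<>(criteria);
        Deque<Integer> worklist = new ArrayDeque<>(criteria);
        while (!worklist.isEmpty()) {
            Integer v = worklist.pop();
            for (Integer source : dependsOn.getOrDefault(v, Collections.emptyList())) {
                if (slice.add(source)) {
                    worklist.push(source);
                }
            }
        }
        return slice;
    }

    // PDG of the fragment: statement 3 uses i defined at statement 1 (data
    // dependence) and is guarded by the condition at statement 2 (control
    // dependence).
    static Pdg example() {
        Pdg pdg = new Pdg();
        pdg.addDependence(3, 1);
        pdg.addDependence(3, 2);
        return pdg;
    }
}
```

Calling `Pdg.example().backwardSlice(Collections.singleton(3))` yields statements 1, 2 and 3, while slicing from statement 1 yields only statement 1.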

Page 8: Program Slicing on Java byte-code  for Locating Functional Concerns

Slice including unrelated concerns

Slicing usually extracts many statements.
  A functional unit is connected to other units by control flow and data flow.
  A slice contains 28% of the program on average for C programs.†

† Binkley, D., Gold, N. and Harman, M.: An Empirical Study of Static Program Slice Size. ACM TOSEM Vol.16, No.2, Article 8, April 2007.

[Figure: slicing from Autosave. Autosave, UndoManager and CompleteWord are connected through the autosave_dirty flag; edge labels: activate, set/reset, reset, set.]

Page 9: Program Slicing on Java byte-code  for Locating Functional Concerns

Slicing with Barriers

A barrier is a vertex or an edge that terminates graph traversal.†

† Krinke, J.: Slicing, Chopping, and Path Conditions with Barriers. Software Quality Journal, Vol.12, No.4, pp.339-360, December 2004.

[Figure: slicing from Autosave with a barrier. Autosave, UndoManager and CompleteWord are connected through the autosave_dirty flag; edge labels: activate, set/reset, reset, set.]

A barrier blocks graph traversal.
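A traversal with vertex barriers might look like the sketch below. It is an illustration under assumptions: `BarrierSlicer` and `exampleGraph` are hypothetical names, the graph is a coarse unit-level stand-in for the figure rather than a real PDG, and the choice to leave barrier vertices out of the result is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BarrierSlicer {
    // Traverses the graph from the criteria, but never enters a barrier vertex.
    static Set<String> slice(Map<String, List<String>> neighbours,
                             Set<String> criteria,
                             Set<String> barriers) {
        Set<String> slice = new LinkedHashSet<>(criteria);
        Deque<String> worklist = new ArrayDeque<>(criteria);
        while (!worklist.isEmpty()) {
            String v = worklist.pop();
            for (String next : neighbours.getOrDefault(v, Collections.emptyList())) {
                // A barrier terminates the traversal: the vertex is neither
                // added to the slice nor expanded further.
                if (!barriers.contains(next) && slice.add(next)) {
                    worklist.push(next);
                }
            }
        }
        return slice;
    }

    // Hypothetical unit-level graph: Autosave is connected to the
    // autosave_dirty flag, which is in turn connected to two other units.
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("Autosave", Arrays.asList("autosave_dirty"));
        g.put("autosave_dirty", Arrays.asList("UndoManager", "CompleteWord"));
        return g;
    }
}
```

Without barriers the traversal from Autosave reaches all four vertices. With UndoManager and CompleteWord marked as barriers, it stops at the flag.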

Page 10: Program Slicing on Java byte-code  for Locating Functional Concerns

Similarity-based Barrier

The key idea is the following: if two methods contribute to the same functionality, they use similar methods, fields and classes.

Name Set NS(m) = the set of types, classes, methods and fields referred to in m. A long name is “tokenized”.

e.g. “java.io.File” → “java”, “io”, “File”, “java.io.File”
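The tokenization step can be sketched as follows. The exact rules are not given on the slide, so this is an assumption: the hypothetical `NameTokenizer` keeps the full name, adds every dot-separated segment, and adds the lower-cased words of any camel-case segment (so “getSize” also yields “get” and “size”).

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class NameTokenizer {
    // Returns the full name plus the tokens derived from it.
    static Set<String> tokenize(String name) {
        Set<String> tokens = new LinkedHashSet<>();
        tokens.add(name);
        for (String segment : name.split("\\.")) {
            if (segment.isEmpty()) {
                continue;
            }
            tokens.add(segment);
            // Split at lower-case-to-upper-case boundaries, e.g. "getSize".
            String[] words = segment.split("(?<=[a-z0-9])(?=[A-Z])");
            if (words.length > 1) {
                for (String word : words) {
                    tokens.add(word.toLowerCase(Locale.ROOT));
                }
            }
        }
        return tokens;
    }
}
```

For “java.io.File” this produces the four tokens shown in the example above.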

Page 11: Program Slicing on Java byte-code  for Locating Functional Concerns

Example of Similarity

package org.gjt.sp.util;

class IntegerArray {
    private int[] array;
    private int len;

    public void add(int num) {
        if (len >= array.length) {
            int[] arrayN = new int[len * 2];
            System.arraycopy(array, 0, arrayN, 0, len);
            array = arrayN;
        }
        array[len++] = num;
    }

    public final int getSize() {
        return len;
    }

    public final void setSize(int len) {
        this.len = len;
    }
}

NS(IntegerArray.add) = { org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, void, add, int, len, int[], java.lang.System, java, lang, system, arraycopy }

NS(IntegerArray.getSize) = { org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, getSize, get, size, int, len }

NS(IntegerArray.setSize) = { org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, len, setSize, set, size, void, int }

sim = 0.801

sim = 0.639
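The slides do not state how sim is computed. As an assumption, the sketch below uses the cosine similarity of two sets, |A ∩ B| / sqrt(|A| · |B|). Applied to NS(IntegerArray.getSize) and NS(IntegerArray.setSize) as listed above, it gives roughly 0.801; that agreement does not confirm it is the authors' formula. The class and constant names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class NameSetSimilarity {
    // Assumed measure: cosine similarity of two sets,
    // |A ∩ B| / sqrt(|A| * |B|). Not confirmed by the slides.
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size() / Math.sqrt((double) a.size() * b.size());
    }

    // Name sets copied from the slide.
    static final Set<String> GET_SIZE = new HashSet<>(Arrays.asList(
            "org.gjt.sp.util.IntegerArray", "org", "gjt", "sp", "util",
            "integer", "array", "getSize", "get", "size", "int", "len"));

    static final Set<String> SET_SIZE = new HashSet<>(Arrays.asList(
            "org.gjt.sp.util.IntegerArray", "org", "gjt", "sp", "util",
            "integer", "array", "len", "setSize", "set", "size", "void", "int"));
}
```

The two sets share 10 names and have 12 and 13 elements, so the assumed measure evaluates to 10 / sqrt(156).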

Page 12: Program Slicing on Java byte-code  for Locating Functional Concerns

Identifying Barriers

Program slicing is blocked at method m if m is not related to the slicing criteria:

Similarity(m, C) ≤ threshold

A method m is related to the slicing criteria if the slicing criteria include a method n such that m is similar to n.

C = the set of methods that contain slicing criteria vertices.
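The rule can be sketched as below. This is an illustration under assumptions: Similarity(m, C) is interpreted as the best similarity between m and any method n in C, the similarity measure is passed in as a parameter because the slides do not define it, and `BarrierFinder` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.ToDoubleBiFunction;

final class BarrierFinder {
    // Similarity(m, C), interpreted as the best match between NS(m) and the
    // name set of any criteria method.
    static double similarityToCriteria(Set<String> nameSetOfM,
                                       Collection<Set<String>> criteriaNameSets,
                                       ToDoubleBiFunction<Set<String>, Set<String>> sim) {
        double best = 0.0;
        for (Set<String> ns : criteriaNameSets) {
            best = Math.max(best, sim.applyAsDouble(nameSetOfM, ns));
        }
        return best;
    }

    // Returns the methods at which slicing is blocked:
    // Similarity(m, C) <= threshold.
    static Set<String> findBarriers(Map<String, Set<String>> nameSets,
                                    Set<String> criteriaMethods,
                                    double threshold,
                                    ToDoubleBiFunction<Set<String>, Set<String>> sim) {
        List<Set<String>> criteria = new ArrayList<>();
        for (String c : criteriaMethods) {
            Set<String> ns = nameSets.get(c);
            if (ns != null) {
                criteria.add(ns);
            }
        }
        Set<String> barriers = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : nameSets.entrySet()) {
            if (criteriaMethods.contains(e.getKey())) {
                continue; // criteria methods are never barriers
            }
            if (similarityToCriteria(e.getValue(), criteria, sim) <= threshold) {
                barriers.add(e.getKey());
            }
        }
        return barriers;
    }
}
```

Keeping the similarity measure as a parameter makes it possible to try different measures and thresholds without touching the barrier rule itself.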

Page 13: Program Slicing on Java byte-code  for Locating Functional Concerns

Slicing algorithm

Slicing with summary edges and barriers
  defined by Horwitz, extended by Krinke

PDG based on Jimple code
  Jimple is an intermediate representation for bytecode.
  3-address code
  Simple control flow: “if” and “goto”
  Independent of JVM stack operations

Workflow: code fragments related to a functionality → Calculate similarity for each method → Identify barriers → Slicing with Barriers → a program slice

Page 14: Program Slicing on Java byte-code  for Locating Functional Concerns

Visualizing a slice as a concern graph

Concern Graph
  A vertex is a class, a method or a field.
  An edge represents a relation between two vertices: call, create, check, read, write, superclass, …

We applied rule-based translation.†

Slice → Concern Graph
  (v1 in m1) --call or parameter--> (v2 in m2)   ⇒   m1 --call--> m2
  (v1 in m1): READ obj.field   ⇒   m1 --read--> field

† Kameda, D. and Takimoto, M.: Building Concern Graph Based on Program Slicing. IPSJ Transactions on Programming, Vol.46, No.11 (Pro 26), pp.45-56. In Japanese.
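The two rules shown on this slide could be encoded along the following lines. This is only a sketch: `SliceTranslator`, its callbacks and the textual edge format are hypothetical, and the remaining rules of the cited translation are not reproduced.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SliceTranslator {
    // Hypothetical classification of edges in the slice.
    enum SliceEdgeKind { CALL, PARAMETER, DATA, CONTROL }

    // Concern graph edges, kept as a set so that duplicates collapse.
    private final Set<String> concernEdges = new LinkedHashSet<>();

    // Rule 1: a call or parameter edge from a vertex in m1 to a vertex in m2
    // becomes a single "call" edge from m1 to m2.
    void onSliceEdge(String m1, String m2, SliceEdgeKind kind) {
        if (kind == SliceEdgeKind.CALL || kind == SliceEdgeKind.PARAMETER) {
            concernEdges.add(m1 + " --call--> " + m2);
        }
    }

    // Rule 2: a vertex in m1 that reads obj.field becomes a "read" edge
    // from m1 to the field.
    void onFieldRead(String m1, String field) {
        concernEdges.add(m1 + " --read--> " + field);
    }

    Set<String> edges() {
        return concernEdges;
    }
}
```

Because many slice edges between the same pair of methods map to one concern graph edge, the translated graph is much smaller than the slice it came from.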

Page 15: Program Slicing on Java byte-code  for Locating Functional Concerns

A graphical output with Graphviz

We omit intra-class edges in the graphical format. Details are provided in a textual format,

e.g. “Autosave.setInterval(interval) calls new Timer(interval, Autosave).”
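Emitting Graphviz input while dropping intra-class edges can be sketched as follows. The `DotWriter` and `Edge` types are hypothetical, and the sketch assumes that members are named "Class.member" so that the owning class is the text before the last dot.

```java
import java.util.List;

final class DotWriter {
    // Hypothetical concern graph edge between two members named "Class.member".
    static final class Edge {
        final String from;
        final String to;
        final String label;

        Edge(String from, String to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    // The owning class is assumed to be everything before the last dot.
    static String ownerClass(String member) {
        int dot = member.lastIndexOf('.');
        return dot < 0 ? member : member.substring(0, dot);
    }

    // Writes a DOT digraph, skipping edges whose endpoints are in one class.
    static String toDot(List<Edge> edges) {
        StringBuilder sb = new StringBuilder("digraph concern {\n");
        for (Edge e : edges) {
            if (ownerClass(e.from).equals(ownerClass(e.to))) {
                continue; // intra-class edge: omitted from the graphical output
            }
            sb.append("  \"").append(e.from).append("\" -> \"").append(e.to)
              .append("\" [label=\"").append(e.label).append("\"];\n");
        }
        return sb.append("}\n").toString();
    }
}
```

The resulting text can be passed to the Graphviz `dot` command to render the graph.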

Page 16: Program Slicing on Java byte-code  for Locating Functional Concerns

The effectiveness of barriers

Barriers reduced concern graph size: 1000 methods → 20 methods

Printable on A3 or A4-sized paper

Comparing extracted graphs with hand-made concern graphs (not finished yet).

Our previous experiment is reported in: Niitani, R., Ishio, T. and Inoue, K.: Proposal and Implementation of a Method for Extracting Functional Concerns Using Program Slicing. PPL 2007. In Japanese.

[Figure: concern graph size on 6 maintenance tasks on jEdit and our Slicer]

Page 17: Program Slicing on Java byte-code  for Locating Functional Concerns

Information extracted from Java program

To construct a dependence graph
  Control dependence relation
  Data dependence relation
  Call graph (with dynamic binding information)

To identify barriers
  A set of types, methods and fields referred to in each method m

To slice the dependence graph
  Mapping source code to vertices

Page 18: Program Slicing on Java byte-code  for Locating Functional Concerns

Slicing Tool Overview

[Figure: tool overview, built on the Soot Framework (http://www.sable.mcgill.ca/soot/)]
  Java class files
  Jimple translator → Jimple 3-address code
  SPARK points-to set analysis → call graph, points-to set
  Control-flow and data-flow analysis
  Annotated Jimple
  PDG constructor → PDG
  Slicer (input: slicing criteria) → concern graph

Page 19: Program Slicing on Java byte-code  for Locating Functional Concerns

Our effort to implement the system

Program size
  PDG construction: 2731 LOC (without comments)
  Slicing: 9296 LOC (without comments), covering slicing algorithms, heuristic functions and concern graph translation

We implemented the PDG construction phase in two weeks:
  one week to understand how Soot works,
  the other week to write the code.

Soot enabled us to focus on the essential part of the research idea.

Page 20: Program Slicing on Java byte-code  for Locating Functional Concerns

Advantage of Soot

A rich analysis toolkit
  Soot provides control-flow and data-flow information for each method.

Jimple is simpler than source code and bytecode.
  Complex Java statements are simplified during compilation.

[Figure: Soot data model. A Method has a Body; a Body contains 1..n Units; a Stmt (Jimple code) is a Unit; a Unit refers to 1..n Values; an Expr is a Value; Local is also shown. Control flow is obtained from ExceptionalUnitGraph and data flow from SmartLocalDefs.]

Page 21: Program Slicing on Java byte-code  for Locating Functional Concerns

Limitation of Soot

Soot is not a framework specialized for program analysis.

Soot keeps all data in memory in order to compile Jimple code to bytecode after optimization.

Soot requires 2-4 GB of RAM to analyze jEdit and the JDK.

Soot supports only the simple workflow: whole program analysis (call-graph construction) followed by local program analysis.

We cannot implement a statistics tool (whole-program analysis) that uses the result of method-local analysis.

Page 22: Program Slicing on Java byte-code  for Locating Functional Concerns

Summary

Concern location based on program slicing
  We introduced heuristics in order to extract a functional concern of interest to a developer.
  The input is the same as for traditional program slicing.
  Most of the graphs can be printed on A3-sized paper.

The Soot framework reduced the implementation effort.
  Soot is a good framework, but we hope for a framework specialized for program analysis: easy to learn, extensible and scalable.

