TOKYO INSTITUTE OF TECHNOLOGY DEPARTMENT OF COMPUTER SCIENCE Shinpei Hayashi Yasuyuki Tsuda Motoshi Saeki Department of Computer Science Tokyo Institute of Technology Japan Detecting Occurrences of Refactoring with Heuristic Search 2008. 12. 5 APSEC 2008

Detecting Occurrences of Refactoring with Heuristic Search


DESCRIPTION

Presented at APSEC 2008: http://dx.doi.org/10.1109/APSEC.2008.9


Page 1: Detecting Occurrences of Refactoring with Heuristic Search

Detecting Occurrences of Refactoring with Heuristic Search

Shinpei Hayashi, Yasuyuki Tsuda, Motoshi Saeki

Department of Computer Science, Tokyo Institute of Technology, Japan

APSEC 2008, 2008. 12. 5

Page 2: Detecting Occurrences of Refactoring with Heuristic Search

Abstract

Detecting Occurrences of Refactoring
− Detecting where and what kinds of refactoring were performed between two versions of a program
− From version archives such as CVS/Subversion

with Heuristic Search
− We use a graph search technique for detecting impure refactorings

Results
− (Semi-)automated tools have been developed
− We have a simple case study

Page 3: Detecting Occurrences of Refactoring with Heuristic Search

Background

It is important to reduce the cost of understanding the changes to a program that is continually modified

[Figure: Program S uses Program L (a library). A FOSS project (FOSS: Free/Open Source Software) modifies L from ver. 123 to ver. 124. Speech bubble: "I have to follow the changes of L... What do the changes mean?"]

Page 4: Detecting Occurrences of Refactoring with Heuristic Search

Aim

Categorization of modifications is useful
− Detecting refactorings between two versions of a program

[Figure: Program S uses Program L (a library). A FOSS project (FOSS: Free/Open Source Software) modifies L from ver. 123 to ver. 124. Speech bubble: "The changes are: Extract Method + Move Method + α!"]

Page 5: Detecting Occurrences of Refactoring with Heuristic Search

Detecting Refactorings

Related work
− with software metrics [Demeyer 2000]
− by checking pre/post-conditions of source code properties [Weißgerber 2006]

Key issue
− detecting impure refactorings [Görg 2005]
  • refactoring + refactoring
  • refactoring + other modifications

Page 6: Detecting Occurrences of Refactoring with Heuristic Search

Motivating Scenario

Example from Fowler’s book [Fowler 1999]

[Figure: three states. n0: Customer with statement(), and Rental. Extract Method leads to n1: Customer with statement() and getCharge(Rental), and Rental. Move Method leads to nm: Customer with statement(), and Rental with getCharge().]

Page 7: Detecting Occurrences of Refactoring with Heuristic Search

Scenario: Loss of States

[Figure: n0: Customer with statement(), and Rental. Extract Method leads to n1: Customer with statement() and getCharge(Rental), and Rental. Move Method leads to n2: Customer with statement(), and Rental with getCharge(). Only n0 and n2 are committed to the version archive, as Rold and Rnew; the intermediate state n1 is lost.]

Page 8: Detecting Occurrences of Refactoring with Heuristic Search

Scenario: Diffs between revs

Hard to detect refactorings via mixed differences
− Considering intermediate states is required

Mixed differences between Rold and Rnew in the version archive:
1. Some code fragments in Customer#statement are removed
2. A method invocation to Rental#getCharge is added to statement
3. Rental#getCharge is added

[Figure: Rold: Customer with statement(), and Rental. Rnew: Customer with statement(), and Rental with getCharge().]

Page 9: Detecting Occurrences of Refactoring with Heuristic Search

Our Approach

Generating intermediate states by actually applying refactorings to the program

Finding an appropriate path from Rold (= n0) to Rnew (= nm) by a graph search technique

States (nodes): versions of the program
Transitions (edges): refactoring operations

[Figure: a graph from the initial state n0 (Rold) to the final state nm (Rnew).]

Page 10: Detecting Occurrences of Refactoring with Heuristic Search

Procedure

1. Find likely refactorings
2. Evaluate the distance to nm
3. Apply the best one and generate a new state

[Figure: search graph from the initial state n0 (Rold) toward the final state nm (Rnew); the candidate edges leaving n0 carry the values 4, 6 and 8.]

Page 11: Detecting Occurrences of Refactoring with Heuristic Search

Procedure

1. Find likely refactorings
2. Evaluate the distance to nm
3. Apply the best one and generate a new state

Terminate if the new state is almost equal to Rnew

[Figure: the search graph from the initial state n0 (Rold) toward the final state nm (Rnew) after a further step; candidate edges carry the values 6, 8, 23 and 3.]
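The three steps and the termination rule can be sketched as a best-first loop. This is a minimal sketch, not the authors' implementation: the names RefactoringSearch, Operation, findLikely, tolerance and maxApplications are all hypothetical, the state type is left generic, and candidates are ranked with the evaluation function f(n, o) = g(n) + h(n) / α(o) from the Evaluation slide of this talk.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.BiFunction;
import java.util.function.ToIntBiFunction;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the search loop; all names are illustrative assumptions.
final class RefactoringSearch {

    /** A candidate refactoring: a label, its matching likelihood, and its effect on a state. */
    static final class Operation<S> {
        final String name;
        final double likelihood; // assumed to lie in (0, 1]
        final UnaryOperator<S> effect;

        Operation(String name, double likelihood, UnaryOperator<S> effect) {
            this.name = name;
            this.likelihood = likelihood;
            this.effect = effect;
        }
    }

    /** An operation that could be applied to a state already reached via path. */
    private static final class Candidate<S> {
        final S state;
        final List<String> path;
        final Operation<S> op;
        final double f;

        Candidate(S state, List<String> path, Operation<S> op, double f) {
            this.state = state;
            this.path = path;
            this.op = op;
            this.f = f;
        }
    }

    /**
     * Returns the names of the refactorings leading from start to a state within
     * tolerance of goal, or null if none is found within maxApplications.
     */
    static <S> List<String> search(S start, S goal,
            BiFunction<S, S, List<Operation<S>>> findLikely,
            ToIntBiFunction<S, S> distance,
            int tolerance, int maxApplications) {
        PriorityQueue<Candidate<S>> open =
                new PriorityQueue<>(Comparator.comparingDouble((Candidate<S> c) -> c.f));
        S state = start;
        List<String> path = new ArrayList<>();
        for (int applied = 0; ; applied++) {
            int h = distance.applyAsInt(state, goal);
            if (h <= tolerance) {
                return path; // the current state (almost) equals the goal
            }
            if (applied == maxApplications) {
                return null; // give up after too many applications
            }
            // Steps 1 and 2: find likely refactorings and evaluate each of them.
            for (Operation<S> op : findLikely.apply(state, goal)) {
                open.add(new Candidate<>(state, path, op, path.size() + h / op.likelihood));
            }
            Candidate<S> best = open.poll();
            if (best == null) {
                return null; // no candidate left to try
            }
            // Step 3: apply the best candidate and generate a new state.
            state = best.op.effect.apply(best.state);
            path = new ArrayList<>(best.path);
            path.add(best.op.name);
        }
    }
}
```

Keeping every unexplored candidate in one queue lets the sketch fall back to an earlier state when a branch stops improving. The slides do not say whether the actual tool backtracks in this way.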

Page 12: Detecting Occurrences of Refactoring with Heuristic Search

Efficient Search

1. How to find candidates of refactorings?
− They should be similar to the changes for nm
2. How to evaluate them?
− The best one should generate a new state closer to nm

We use structural differences between two states

[Figure: search graph from the initial state n0 (Rold) toward the final state nm (Rnew); candidate edges carry the values 6, 8, 23 and 3.]

Page 13: Detecting Occurrences of Refactoring with Heuristic Search

Structural Differences

Calculated by comparing two ASTs
− 4 types: add, remove, change, move

Before:

public class Customer {
    int customerID;
    String name;
}

After:

public class Customer {
    int id;
    String name;
    int[] phoneNum;
}

Differences:

change(Customer, the name customerID, the name id)
add(FieldDeclaration int[] phoneNum, Customer)
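To make the difference notation concrete, the sketch below derives the two differences of the Customer example from plain lists of field declarations. It is only an illustration under strong assumptions: FieldDiff and Field are hypothetical names, fields are compared by position instead of by a real AST comparison, and the move type is never produced. The tool described in this talk computes differences with XMLdiff, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a positional comparison of field lists, not a real AST differ.
final class FieldDiff {

    /** A field declaration reduced to its type and name. */
    static final class Field {
        final String type;
        final String name;

        Field(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    /** Lists add/remove/change differences between two versions of one class. */
    static List<String> diff(String owner, List<Field> before, List<Field> after) {
        List<String> result = new ArrayList<>();
        int common = Math.min(before.size(), after.size());
        for (int i = 0; i < common; i++) {
            Field b = before.get(i);
            Field a = after.get(i);
            if (!b.type.equals(a.type)) {
                // Different types at the same position: treat as remove plus add.
                result.add("remove(FieldDeclaration " + b.type + " " + b.name + ", " + owner + ")");
                result.add("add(FieldDeclaration " + a.type + " " + a.name + ", " + owner + ")");
            } else if (!b.name.equals(a.name)) {
                // Same type, different name: treat as a renamed field.
                result.add("change(" + owner + ", the name " + b.name + ", the name " + a.name + ")");
            }
        }
        for (Field b : before.subList(common, before.size())) {
            result.add("remove(FieldDeclaration " + b.type + " " + b.name + ", " + owner + ")");
        }
        for (Field a : after.subList(common, after.size())) {
            result.add("add(FieldDeclaration " + a.type + " " + a.name + ", " + owner + ")");
        }
        return result;
    }
}
```

Calling diff("Customer", ...) with the fields (int customerID, String name) and (int id, String name, int[] phoneNum) yields the change and add entries listed on this slide.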

Page 14: Detecting Occurrences of Refactoring with Heuristic Search

Find New Refactorings

By matching between the differences (D) and the modifications representing a refactoring operation (R)
− if a subset of D matches a subset of R, the refactoring operation is expanded
− Matching likelihood: (# of matched modifications) / (# of modifications in R)

The differences (D):
remove(Block, Customer#statement)
add(MethodInvocation, Rental#statement)
change(Rental, the name “n1”, the name “n2”)

Extract Method (R):
remove(Block, ClassA#method1)
add(MethodDeclaration, ClassA)

Matching likelihood: 0.5 (1/2)
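The likelihood computation can be sketched as follows. The slides do not spell out when a difference matches a modification of R, so this sketch assumes that two modifications match when their kind and AST node type are equal, with the target (Customer#statement versus ClassA#method1) treated as a placeholder to be bound. MatchingLikelihood and Modification are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and the matching rule are assumptions, not REUSAR's code.
final class MatchingLikelihood {

    /** One structural modification, reduced to its kind and AST node type. */
    static final class Modification {
        final String kind;     // "add", "remove", "change" or "move"
        final String nodeType; // e.g. "Block", "MethodDeclaration"

        Modification(String kind, String nodeType) {
            this.kind = kind;
            this.nodeType = nodeType;
        }

        boolean matches(Modification other) {
            return kind.equals(other.kind) && nodeType.equals(other.nodeType);
        }
    }

    /** (# of modifications of R matched by some difference in D) / (# of modifications in R). */
    static double likelihood(List<Modification> diffs, List<Modification> refactoring) {
        if (refactoring.isEmpty()) {
            return 0.0;
        }
        List<Modification> unused = new ArrayList<>(diffs);
        int matched = 0;
        for (Modification expected : refactoring) {
            for (int i = 0; i < unused.size(); i++) {
                if (expected.matches(unused.get(i))) {
                    unused.remove(i); // each difference may be used only once
                    matched++;
                    break;
                }
            }
        }
        return (double) matched / refactoring.size();
    }

    public static void main(String[] args) {
        List<Modification> d = Arrays.asList(
                new Modification("remove", "Block"),
                new Modification("add", "MethodInvocation"),
                new Modification("change", "Name"));
        List<Modification> extractMethod = Arrays.asList(
                new Modification("remove", "Block"),
                new Modification("add", "MethodDeclaration"));
        // Only remove(Block, ...) is matched, so 1 of 2 modifications of R.
        System.out.println(likelihood(d, extractMethod));
    }
}
```

For the slide's example this prints 0.5: the removed Block matches, while the added MethodInvocation does not count as the MethodDeclaration that Extract Method expects.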

Page 15: Detecting Occurrences of Refactoring with Heuristic Search

Evaluation

Using f(n, o) = g(n) + h(n) / α(o)
− g(n): # of refactorings applied to obtain n
− h(n): the size of the differences between n and nm
− α(o): likelihood of o

Example: g(n2) = 2, h(n2) = 3, α(o6) = 1/2

f(n2, o6) = 2 + 3 / (1/2) = 8

[Figure: search graph from the initial state n0 (Pold) to the goal state nm (Pnew), with intermediate states n1 and n2 and operations o1 to o6.]
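The evaluation function is small enough to write down directly. The sketch below is an illustration only: the class name Evaluation and the range check on the likelihood are assumptions, on the grounds that a matching likelihood is a ratio between 0 and 1 and that a likelihood of 0 would make the value undefined.

```java
// Hypothetical sketch of the evaluation function; the class and method names are assumptions.
final class Evaluation {

    /**
     * f(n, o) = g(n) + h(n) / alpha(o). Smaller values are better.
     *
     * @param g     number of refactorings applied to obtain state n
     * @param h     size of the differences between n and the goal state
     * @param alpha matching likelihood of operation o, assumed to lie in (0, 1]
     */
    static double f(int g, int h, double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("likelihood must be in (0, 1]: " + alpha);
        }
        return g + h / alpha;
    }

    public static void main(String[] args) {
        System.out.println(f(2, 3, 0.5)); // the slide's example: 2 + 3 / (1/2)
        System.out.println(f(2, 3, 1.0)); // a fully matched candidate from the same state
    }
}
```

The first call reproduces the slide's value of 8 (printed as 8.0), and the second prints 5.0. Dividing by the likelihood penalizes poorly matched candidates: halving α doubles the contribution of the remaining differences.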

Page 16: Detecting Occurrences of Refactoring with Heuristic Search

Supporting Tools

Expanding and evaluating candidates of refactorings with REUSAR (implemented)
− Input: Two versions of Java source code
− Output: Candidates of refactorings with priority
− Calculating differences with XMLdiff (an existing tool)

Applying refactorings by Eclipse
− Checking pre/post-conditions
− Modifying source code

Page 17: Detecting Occurrences of Refactoring with Heuristic Search

Case Study

Applying our technique to an existing version archive, REUSAR (the supporting tool in this study)

[Figure: R1796: DistanceCalculator with calculateDistance(List<Diff>). R1799: DistanceCalculator with calcDistance(). Refactorings between them: 1. Rename Method, 2. Remove Parameter.]

Page 18: Detecting Occurrences of Refactoring with Heuristic Search

Case Study: Result

[Figure: search graph from n0 (R1796) via n1 to n2 (R1799). Edge labels: Rename Method, Remove Parameter, Extract Method, Remove Parameter, Extract Method; values shown: 4, 4, 8, 7, 4.]

Our technique is effective
− It can correctly detect impure refactorings which were actually performed
− Prioritizing the candidates reduces # of applications of refactoring operations

Page 19: Detecting Occurrences of Refactoring with Heuristic Search

Conclusion

Summary
− Detecting refactorings by a graph search, based on heuristics with structural differences between two programs
− Our technique can detect impure refactorings

Future Work
− Tool integration
− Evaluation: larger-scale case study