Reverse Engineering Object-Oriented Distributed Systems Dan C. Cosma LOOSE Research Group “Politehnica” University of Timisoara Romania

DESCRIPTION

Paper describing my PhD thesis, ICSM 2010 PhD dissertations track.

Page 1: Reverse Engineering Object-Oriented Distributed Systems

Reverse Engineering Object-Oriented Distributed Systems

Dan C. Cosma

LOOSE Research Group, “Politehnica” University of Timisoara, Romania

Page 2: Reverse Engineering Object-Oriented Distributed Systems

Overview

Understanding object-oriented distributed software applications by reverse engineering the source code, focusing on the distribution-related aspects of the system, using a structural, technology-aware analysis approach

Page 3: Reverse Engineering Object-Oriented Distributed Systems

Distributed Software

The distributed aspect is crucial for understanding
- systems are specifically built for distributed problems
- technology dependence: communication infrastructure

Making the distributed aspect central makes the analysis easier
- without ignoring the local functionality concerns

Page 4: Reverse Engineering Object-Oriented Distributed Systems

Methodology for understanding object-oriented distributed systems

meta-model

reverse engineering techniques

metrics

visualization

tool

A reverse engineering process

Page 5: Reverse Engineering Object-Oriented Distributed Systems

System Representation

Page 6: Reverse Engineering Object-Oriented Distributed Systems

Model

Augments an OO meta-model (Memoria): makes the distributed aspect a main concept

distributable feature -- feature directly involved in the distributed functionality, either by providing remote services, or by directly using such services

frontier classes -- act at the frontier between the system and the communication infrastructure (“communication mediator”)

Page 7: Reverse Engineering Object-Oriented Distributed Systems

Model Overview

[Figure: model overview diagram. Labels: System, Mediator, Frontier, Frontier Class, Core Class, Acquaintance Class, Class, Distributable Feature Core, Distributable Feature, Local Feature]

Page 8: Reverse Engineering Object-Oriented Distributed Systems

The Approach

Page 9: Reverse Engineering Object-Oriented Distributed Systems

Approach

0: Initial graph of classes
- vertex: a class
- edge: method call / attribute access / inheritance relation

1: Build the dependency graph of distributable features (DGDF)

2: Separate distinct cores of distributable features

3: Capture coarse-grained architecture of distributable features

4: Assess impact of distributable features

5: Support for restructuring

[Figure: one diagram per step. Labels: Mediator, remote call, frontier class, utility class, core of distr. feat., core class, extracted new feature]

Page 10: Reverse Engineering Object-Oriented Distributed Systems

The case studies

Java / RMI

Page 11: Reverse Engineering Object-Oriented Distributed Systems

I. Core analysis

Page 12: Reverse Engineering Object-Oriented Distributed Systems

Goals

Find the core entities involved in the distributed functionality

Get an overview of the distributed architecture

Page 13: Reverse Engineering Object-Oriented Distributed Systems

Build a Core Graph

Identify the Frontier
- technology-dependent rules

Start with the Frontier Classes: the best starting points describing involvement in distribution

Incrementally add new vertices and edges until a configurable depth of search is reached
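The expansion step above can be read as a depth-limited breadth-first search that starts from the frontier classes. The sketch below is only an illustration under that assumption. The function name `build_core_graph`, the adjacency-map representation, and the class names in the example are hypothetical and do not come from the tool.

```python
from collections import deque


def build_core_graph(class_graph, frontier_classes, max_depth):
    """Collect the classes reachable from the frontier within max_depth hops.

    class_graph: dict mapping a class name to the classes it is related to
    (method call, attribute access, or inheritance). Assumed representation.
    Returns (vertices, edges) of the resulting core graph.
    """
    depth = {cls: 0 for cls in frontier_classes}
    edges = set()
    queue = deque(frontier_classes)
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # configured search depth reached: do not expand further
        for neighbor in class_graph.get(current, ()):
            edges.add((current, neighbor))
            if neighbor not in depth:
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    return set(depth), edges


# Hypothetical example: one frontier class, search depth 2.
example_graph = {
    "RemoteServiceImpl": ["OrderManager"],
    "OrderManager": ["Order", "Logger"],
    "Order": ["Money"],
}
vertices, edges = build_core_graph(example_graph, ["RemoteServiceImpl"], 2)
```

With a depth of 2, `Money` is left out because it is three hops away from the frontier class.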

Page 14: Reverse Engineering Object-Oriented Distributed Systems

Identify the Distributable Feature Cores

Detect and remove edges that connect loosely coupled sets of classes - technology-aware and cohesion-based heuristics

The resulting connected components: candidate DF cores

Identify the remote communication channels

The engineer reviews the result
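A minimal sketch of the edge-removal idea, assuming each edge carries a numeric weight (for example, the number of dependencies between two classes) and that a simple threshold stands in for the technology-aware and cohesion-based heuristics, which the slides do not detail. The names `candidate_df_cores` and `min_weight` are hypothetical.

```python
def candidate_df_cores(vertices, weighted_edges, min_weight):
    """Drop weak edges, then return the connected components that remain.

    weighted_edges: dict mapping an (a, b) pair of classes to a weight.
    The threshold test is a stand-in for the real heuristics (assumption).
    """
    kept = {v: set() for v in vertices}
    for (a, b), weight in weighted_edges.items():
        if weight >= min_weight:  # keep only strongly coupled pairs
            kept[a].add(b)
            kept[b].add(a)

    seen, components = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(kept[v] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical example: the weak B-C edge separates two candidate cores.
cores = candidate_df_cores(
    {"A", "B", "C", "D"},
    {("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 4},
    min_weight=2,
)
```

The components are candidates only; as the slide says, the engineer reviews the result.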

Page 15: Reverse Engineering Object-Oriented Distributed Systems

Classes in DF cores: ~10%

Page 16: Reverse Engineering Object-Oriented Distributed Systems

Architectural preview: the Distributed Architecture Perspective

Page 17: Reverse Engineering Object-Oriented Distributed Systems

II. Impact of distribution

Page 18: Reverse Engineering Object-Oriented Distributed Systems

Goals

Focus on the rest of the classes in the system (the majority)

Evaluate their involvement in providing the Distributable Features

Identify the classes that follow the main patterns of involvement

Make system-level and class-level characterizations

Page 19: Reverse Engineering Object-Oriented Distributed Systems

Class involvement

Set of coupling-based metrics

- Total Bidirectional Coupling (TBC): the collaboration of a class with the entire system

- Acquaintance with a Distributable Feature (ADF): involvement in providing a particular DF

- Total Acquaintance with Distributable Features (TADF): involvement in providing all DFs = involvement in distribution

- Average Total Acquaintance with Distributable Features (Average TADF): system-wide “distributed awareness”
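The slides name the metrics but do not give their formulas. The sketch below assumes that coupling is counted as the number of dependencies between two classes, in either direction, that ADF is the coupling of a class with the classes of one DF core, that TADF sums ADF over all (disjoint) cores, and that Average TADF is the mean over a given set of classes. All function names and the example data are hypothetical.

```python
def tbc(coupling, cls):
    """Total bidirectional coupling of cls with every other class."""
    return sum(n for (a, b), n in coupling.items() if a != b and cls in (a, b))


def adf(coupling, cls, core):
    """Coupling of cls, in both directions, with the classes of one DF core."""
    return sum(
        n
        for (a, b), n in coupling.items()
        if a != b and ((a == cls and b in core) or (b == cls and a in core))
    )


def tadf(coupling, cls, cores):
    """Sum of ADF over all DF cores (cores assumed disjoint)."""
    return sum(adf(coupling, cls, core) for core in cores.values())


def average_tadf(coupling, classes, cores):
    """Mean TADF over the given classes."""
    return sum(tadf(coupling, c, cores) for c in classes) / len(classes)


# Hypothetical data: (source, target) -> number of dependencies.
coupling = {("A", "X"): 3, ("X", "A"): 1, ("A", "L"): 4, ("B", "Y"): 2}
cores = {"df1": {"X"}, "df2": {"Y"}}

a_tbc = tbc(coupling, "A")                         # 3 + 1 + 4
a_tadf = tadf(coupling, "A", cores)                # 3 + 1
avg = average_tadf(coupling, ["A", "B"], cores)    # (4 + 2) / 2
```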

Page 20: Reverse Engineering Object-Oriented Distributed Systems

Visualization

The Feature Affiliation Perspective

[Figure: representation of a class in the Feature Affiliation Perspective. Labels: Total Collaboration Dispersion, Total Collaboration Intensity, Dispersion of Feature Acquaintance, Intensity of Feature Acquaintance]

- intensity: number of collaborations
- dispersion: number of collaborators
- gray: total collaboration
- color: distribution-related collaboration

Page 21: Reverse Engineering Object-Oriented Distributed Systems

Visualization Example

Part of the visualization for EHCACHE

Page 22: Reverse Engineering Object-Oriented Distributed Systems

Patterns of Involvement

How does a class participate in providing the DFs?

- The main patterns of involvement were detected (Patterns of acquaintance)

- Define and use a set of detection strategies [Marinescu04] to detect the classes following a certain pattern

- Put the visualization to use: see the interesting classes

Page 23: Reverse Engineering Object-Oriented Distributed Systems

Pattern I. Significant Feature Acquaintance (Big Color Box)

Class has significant involvement with the distributed functionality

Significant Acquaintance of Distributable Features =
- TADF ≥ HIGH (total coupling with distributable features is high)
AND
- TADF/TBC ≥ AVERAGE (class is mostly coupled with distributable features)

Page 24: Reverse Engineering Object-Oriented Distributed Systems

Pattern II. Local Feature Contributor (Big Gray)

Class has significant involvement with local (non-distributed) functionality

Local Feature Contributor =
- TBC ≥ HIGH (class is strongly coupled with the other classes in the system)
AND
- TADF/TBC ≤ LOW (class has (almost) no relation with the distributable features)

Page 25: Reverse Engineering Object-Oriented Distributed Systems

Pattern III. Connector Class (Color-Spotted Gray)

Class connects a local feature with a distributed one

Connector Class =
- TADF ≥ AVERAGE (class has significant coupling with the distributable features)
AND
- LOW < TADF/TBC ≤ AVERAGE (class has significant coupling with other classes in the system)
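The three detection strategies can be sketched as threshold checks on TADF, TBC and the TADF/TBC ratio. The numeric thresholds below are placeholders chosen for illustration only; the slides give just the symbolic levels LOW, AVERAGE and HIGH. The names `Thresholds` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Placeholder values (assumptions), not taken from the slides.
    ratio_low: float = 0.1       # LOW for TADF/TBC
    ratio_average: float = 0.5   # AVERAGE for TADF/TBC
    tadf_average: float = 5      # AVERAGE for TADF
    tadf_high: float = 10        # HIGH for TADF
    tbc_high: float = 20         # HIGH for TBC


def classify(tadf, tbc, t=Thresholds()):
    """Return the patterns of involvement that a class matches."""
    ratio = tadf / tbc if tbc else 0.0
    patterns = []
    # Pattern I: TADF >= HIGH and TADF/TBC >= AVERAGE
    if tadf >= t.tadf_high and ratio >= t.ratio_average:
        patterns.append("Significant Feature Acquaintance")
    # Pattern II: TBC >= HIGH and TADF/TBC <= LOW
    if tbc >= t.tbc_high and ratio <= t.ratio_low:
        patterns.append("Local Feature Contributor")
    # Pattern III: TADF >= AVERAGE and LOW < TADF/TBC <= AVERAGE
    if tadf >= t.tadf_average and t.ratio_low < ratio <= t.ratio_average:
        patterns.append("Connector Class")
    return patterns


# A class with TADF = 15 and TADF/TBC = 0.2, hence TBC = 75.
connector = classify(tadf=15, tbc=75)
```

Under these placeholder thresholds, a class with TADF = 15 and TADF/TBC = 0.2 matches only the Connector Class strategy.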

Page 26: Reverse Engineering Object-Oriented Distributed Systems

EHCACHE

Page 27: Reverse Engineering Object-Oriented Distributed Systems

FWS

Page 28: Reverse Engineering Object-Oriented Distributed Systems

System-level characterization

FWS: 2 DF cores, Average TADF = 3, a lot of gray: significant local functionality [80 classes belong to a local tool; the system was initially non-distributed]

EHCACHE: 5 DF cores, Average TADF = 9, more color: more distributed functionality [documentation: system redesigned specifically as distributed]

Page 29: Reverse Engineering Object-Oriented Distributed Systems

Class-level characterization

Local Feature Contributor / Big Gray

• FWS
- 80 classes: the local tool for visually editing workflow specifications
- 6 classes belonging to other local features

• EHCACHE
- Fewer than 5 classes
- Cache: highest TBC; heavily used, but local
- ConfigurationHelper: manages configuration files

Page 30: Reverse Engineering Object-Oriented Distributed Systems

Class-level characterization

Significant Feature Acquaintance / Big Color Spot

• FWS
- 5 classes, related to the Workflow Engine
- Small number => the functionality is well localized in the system

• EHCACHE
- 12 classes, related to the Cache Peer Manager
- TADF/TBC close to 1 => classes are dedicated to the distributable feature (e.g. Mutex, ConcurrencyUtil, Sync)

Page 31: Reverse Engineering Object-Oriented Distributed Systems

Class-level characterization

Connector Class / Color-Spotted Gray

• FWS
- 5 classes
- Most interesting case: ProcessDefinition
  - TADF = 15, TADF/TBC = 0.2
  - Models/stores the internal representation of workflows in execution
  - Links the classes that run the workflow (detected as Significant Feature Acquaintances) with an XML parser that reads the workflow specifications

• EHCACHE
- 6 classes
- Most interesting case: Element
  - Represents the data item cached by the system
  - The only class that has a noticeable relation with Cache Replicator
  - Links the Cache Replicator with the non-distributed feature of the system that actually stores (caches) data

Page 32: Reverse Engineering Object-Oriented Distributed Systems

III. Support for Restructuring

Page 33: Reverse Engineering Object-Oriented Distributed Systems

Goal

Apply concepts and measurements similar to those used in the analysis to help the engineer explore / play with tentative restructuring scenarios

Page 34: Reverse Engineering Object-Oriented Distributed Systems

Approach

Visualize (a part of) the graph of classes

Select a set of initial classes

See what happens if they are to be extracted (removed) as a separate unit:
- evaluate the redesign layout: which classes should go with those selected, which should remain in the initial system
- evaluate the cost

Apply such scenarios at will

Page 35: Reverse Engineering Object-Oriented Distributed Systems

Helpers

Metrics-based visualization to help select initial classes
- In-group Adequacy (IGA) metric

Compute the forecasted layout
- Acquaintance with Class Group (ACG)
- Configurable threshold value

Computing the extraction cost
- Extraction Cost (EC)

[Figure: panels a), b), c); axis labels: dispersion, intensity]
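The slides do not define ACG or EC, so the sketch below is an illustrative guess at the workflow only, not the actual metrics. It assumes that ACG is the fraction of a class's coupling that goes to the selected group, that classes at or above the configurable threshold move with the group, and that the extraction cost is the total coupling crossing the cut. The function names and the example data are hypothetical.

```python
def forecast_layout(coupling, selected, threshold):
    """Grow the selected group with classes mostly coupled to it (assumed ACG)."""
    group = set(selected)
    classes = {c for pair in coupling for c in pair}
    changed = True
    while changed:  # repeat until no further class joins the group
        changed = False
        for cls in sorted(classes - group):
            total = sum(n for (a, b), n in coupling.items() if cls in (a, b))
            to_group = sum(
                n
                for (a, b), n in coupling.items()
                if (a == cls and b in group) or (b == cls and a in group)
            )
            if total and to_group / total >= threshold:
                group.add(cls)
                changed = True
    return group


def extraction_cost(coupling, group):
    """Total coupling between the extracted group and the rest (assumed EC)."""
    return sum(n for (a, b), n in coupling.items() if (a in group) != (b in group))


# Hypothetical scenario: extract class S and see which classes follow it.
coupling = {("S", "H"): 4, ("H", "U"): 1, ("U", "R"): 5, ("S", "R"): 1}
group = forecast_layout(coupling, {"S"}, threshold=0.6)
cost = extraction_cost(coupling, group)
```

In this example `H` follows `S` because most of its coupling goes to the group, while `U` and `R` stay in the initial system. Rerunning with a different selection or threshold corresponds to trying another scenario.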

Page 36: Reverse Engineering Object-Oriented Distributed Systems

Example

Applying successive scenarios can also help improve the system understanding

Page 37: Reverse Engineering Object-Oriented Distributed Systems

IV. The Tool

Page 38: Reverse Engineering Object-Oriented Distributed Systems

niSiDe: “non-invasive Structural insight on Distributed environments”

Follows all the steps in the methodology, and provides complete support for analysis

Generates all visualizations and support diagrams

Built for extensibility

Integrated in the iPlasma environment

Page 39: Reverse Engineering Object-Oriented Distributed Systems

Conclusions

Page 40: Reverse Engineering Object-Oriented Distributed Systems

Contributions

• A methodology for understanding object-oriented distributed systems

• A model for object-oriented distributed systems

• The Distributable Features View (visualization)

• Basic restructuring support as a natural extension to the understanding techniques

• Comprehensive tool support