74
An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

Embed Size (px)

Citation preview

Page 1: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

An Environment for Comprehending the Behavior of Software Systems

Maher Salah

Department of Computer ScienceDrexel University

July 25, 2005

Page 2: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 2

Overview

• Problem: Software maintenance• Modern software systems

– Large, distributed, multi-language and developed using pre-built components

– Difficult and expensive, especially if the source code is not available Effective SC tools can greatly simplify and reduce maintenance effort

• Solution: Software Comprehension Environment• Profiling and analysis of distributed systems• Analysis of program features• Common data repository• Software views

• Evaluation: Case study• Mozilla: Web browser• Jext: Programmers’ Text Editor• TechReport: Publication database

Page 3: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 3

Outline

Motivation

• Software Comprehension

• Architecture of Software Comprehension Environment

• Software views / Case Study

• Related Work

• Conclusions

Page 4: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 4

Software maintenance: Relative Cost

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

1970s Early 1980s Late 1980s Early 1990s

New Development Software maintenance

[Maod-1990]

Page 5: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 5

Software maintenance: Cost Distribution

20%

25%

55%

Corrective maintenance

Adaptive maintenance

Perfective maintenance

Approximately 80% of the maintenance effort is spent on non-corrective tasks

[Swanson-1990]

Page 6: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 6

Software maintenance: Cost Reduction

Rigorous design and thorough testing– Corrective maintenance

Limiting non-corrective changes– Enhancements– Adapting to hardware/software infrastructure

changes– Not an option: System evolution depends on

these changes to meet its user’s needs.

Page 7: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 7

Software maintenance: Cost Reduction

Reducing amount of effort spent on maintenance

½ the maintenance effort is spent on understanding the system’s logic and behavior. [Pigoski-1994]

Effective software comprehension tools

Page 8: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 8

Outline

• MotivationSoftware Comprehension

• Architecture of Software Comprehension Environment

• Software views / Case Study

• Related Work

• Conclusions

Page 9: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 9

Software Comprehension

Program Entities

ProgramRelationships

Program

Source codeExtractor

Profiler

Source code

Compiled binaries Filt

eri

ng

& q

ue

ryto

ols

AnalyzersSoftware

views

Program Gather dataStore data into

repositoryFilter and analyze data Software views

Page 10: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 10

Outline

Software ComprehensionArchitecture of Software

Comprehension Environment

• Software Views / Case Study

• Related Work

• Conclusions

Page 11: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 11

Architecture of the Software Comprehension Environment

Languages models&

Program schema

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker(Features)

Feature-implementation view Feature interaction view Feature-similarity view Module-interaction view Class-interaction view Remote-interaction view Class protocol view

Modeling(Analyzers)

Visualization(Presentation/Navigation)

Software Programs Static analyzers

Dynamic analyzers(Profilers)

Sourcecode

ExecutionTraces

Data GatheringSubsystem

Page 12: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 12

Data Gathering Subsystem

Languages models&

Program schema

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker(Features)

Feature-implementation view Feature interaction view Feature-similarity view Module-interaction view Class-interaction view Remote-interaction view Class protocol view

Modeling(Analyzers)

Visualization(Presentation/Navigation)

Software Programs Static analyzers

Dynamic analyzers(Profilers)

Sourcecode

ExecutionTraces

Data GatheringSubsystem

Page 13: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 13

Data Gathering Subsystem

• Program Facts/Data– Static: Entities and Relations– Dynamic: Entities, Relations and Events

• Distributed Profiler– Local Profiler– Logical Time Server

• Remote Interactions– Network interceptorEndpoint

Page 14: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 14

Data Gathering

• Endpoint entity– Local and Remote host– Local and Remote port numbers– Time-stamp

• Connects Relation

Local Host Remote Host Local Port Remote Port Time

Local Host Remote Host Local Port Remote Port Time+Delta

Delta < Threshold

Endpoint-A

Endpoint-B

Page 15: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 15

Example: Sequence Diagram

UserClassjava.netSokect-

OutputStream

write()

Java.net.Sokect-

InputStream

read()

Java Program

Lo

gic

al T

ime

C/C++ Win32 Program

UserModule(Winsock)

Ws2_32.DLL

recv()

send()

EP(B)

EPTime(B)

EP(A)EPTime(A)

Page 16: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 16

TechReport: Remote-interaction view

Search-by-author feature

Page 17: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 17

Distributed Profiler

Distributed Profiling

Program-B

JDBC CompliantDatabase

(SQL Server)

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker

Analyzers

Visualization(Presentation)

Profiling Win32 Program(Using Wdbg)

Profiling Java Program(Using JvProf)

LogicalTimeServer

Runtime events(XML Format)

Runtime events(XML Format)

Import(dbimport)

Logical-TimeDistributed Marked-Trace

Logical-TimeDistributed Marked-Trace

Trace Marker

Program-A

Data Collection

Import(dbimport)

Page 18: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 18

Data Repository

Languages models&

Program schema

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker(Features)

Feature-implementation view Feature interaction view Feature-similarity view Module-interaction view Class-interaction view Remote-interaction view Class protocol view

Modeling(Analyzers)

Visualization(Presentation/Navigation)

Software Programs Static analyzers

Dynamic analyzers(Profilers)

Sourcecode

ExecutionTraces

Data GatheringSubsystem

Page 19: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 19

Data Repository

• Logical Models– Language definitions– Program data

• Implementation– Relational database

• Query/Manipulation Language– Manipulation: SQL– Query/retrieval: SMQL

Page 20: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 20

Language Definitions

• Model(General) = Graph(Eg , Rg)Eg: Set of entity types

Rg: Set of relation types

• Java, example:• Model(Java) = Graph(Ejava , Rjava)

Ejava Eg : Entity types supported by Java

Rjava Rg : Relation types supported by Java

Such that the source and destination entity Ejava

• C++, definingModel(C++) = Graph(Ec++, Rc++)

Ec++ = Ejava {template, struct, typedef, function}

- {interface, package}

Page 21: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 21

SMQL: Software Modeling Query Language

• Set-based:– Typed Set (SMQL Objects)

• Entity Set• Relation Set• Event Set

– Composite set– Generic Set

• Operations– Union, intersection, Difference

• Functions– SMQL Filters implement in Java– Closure, Composition, …, etc.

Page 22: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 22

/** MOZILLA — Dynamic analysis (C/C++) All interactions from/to MIME.DLL module of Mozilla**/

RelationSet JPEGRelations{include (project) = {"mozilla"} ;EntitySet AllEntities

{type = {"method", "function"} ;}

EntitySet JPEGEntities{type = {"method", "function"} ;include (container) = {"imgjpeg.dll"} ;}

include (invoke) = {AllEntities -> JPEGEntities, JPEGEntities -> AllEntities } ;

}

/** JETTY - Static analysis (Java) Inheritance forest rooted at HttpListener or http.HttpHandler**/

EntitySet InheritanceRoots{type = {"interface", "class"} ;include (name) = { "org.mortbay.http.HttpListener",

"org.mortbay.http.HttpHandler" } ;include (project) = { "jetty-static" } ;}

InheritanceForest = closure(null, InheritanceRoots,{ "implement", "subclass"} ) ;

/** KWIC — Dynamic analysis (Java)**/

EventSet KwicEvents{type = { "method-entry",

"method-exit","thread-start","thread-end"} ;

include (project) = {"kwic-rt"} ;}

// Convert EventSet to RelationSet (call-graph) KwicCallgraph = Callgraph(KwicEvents) ;

// Attach the SequenceDiagram Visualizer to the clone event setKwicEvents["sf.visualizer"] = "serg.sc.sd.ui.SceEventSequVisualizer" ;

KwicEvents["description"] = "Kwic Event Sequence Diagram";

/** Mixing Relationships**/

MixedRelations = InheritanceForest + JPEGRelations ;

SMQL: ExampleMozilla – Dynamic (C/C++)Jetty – Static (Java)KWIC – Dynamic (Java)Union of relations from Mozilla and Jetty

Page 23: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 23

SMQL Filters

• SMQL filters = Analyzers– Java implementation of IFilter interface

• Implemented filters– Closure (Source, Target, RelationTypes)– Composition (E1, E2, …)– Callgraph (E1, E2, …)– AnalyzeEvents (E1, E2, …)– FeatureSimilarity (E1, E2, …)– MethodSequences (E1, E2, …)– Clone (EntitySet || RelationSet || EventSet)– Output_Dot ( RelationSet )– Output_xml( EntitySet || RelationSet || EventSet)

Page 24: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 24

Analysis and Visualization

Languages models&

Program schema

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker(Features)

Feature-implementation view Feature interaction view Feature-similarity view Module-interaction view Class-interaction view Remote-interaction view Class protocol view

Modeling(Analyzers)

Visualization(Presentation/Navigation)

Software Programs Static analyzers

Dynamic analyzers(Profilers)

Sourcecode

ExecutionTraces

Data GatheringSubsystem

Page 25: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 25

Analysis and Visualization

SMQLAnalysis(IFilter)

Visualization(ISoftView)

Vis

ua

l Sty

le

Na

vig

atio

na

lS

tyle

Software ViewModelRevisedModel

ProgramData

Setting the “drill-in” property for a node/edge of a relationSet

Performed manually – By applying a different Analyzer or modifying the SMQL code

Page 26: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 26

Visualization process flow

Type?Default Visualizer

Has User-defined Visual-

style?

User-defined Visualizer

Graph Visualization

Containment Tree

Drill-in on Node and Edges

VisualizeSMQL Object

Realtionshipset

EntitySet

Clustering

No

Yes Custom Visualization

Drill-in

Page 27: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 27

Illustration Example

/** MOZILLA — Dynamic analysis (C/C++) All interactions from/to MIME.DLL module of Mozilla**/

RelationSet JPEGRelations{include (project) = {"mozilla"} ;EntitySet AllEntities

{type = {"method", "function"} ;}

EntitySet JPEGEntities{type = {"method", "function"} ;include (container) = {"imgjpeg.dll"} ;}

include (invoke) = {AllEntities -> JPEGEntities, JPEGEntities -> AllEntities } ;

}

/** JETTY - Static analysis (Java) Inheritance forest rooted at HttpListener or http.HttpHandler**/

EntitySet InheritanceRoots{type = {"interface", "class"} ;include (name) = { "org.mortbay.http.HttpListener",

"org.mortbay.http.HttpHandler" } ;include (project) = { "jetty-static" } ;}

InheritanceForest = closure(null, InheritanceRoots,{ "implement", "subclass"} ) ;

/** KWIC — Dynamic analysis (Java)**/

EventSet KwicEvents{type = { "method-entry",

"method-exit","thread-start","thread-end"} ;

include (project) = {"kwic-rt"} ;}

// Convert EventSet to RelationSet (call-graph) KwicCallgraph = Callgraph(KwicEvents) ;

// Attach the SequenceDiagram Visualizer to the clone event setKwicEvents["sf.visualizer"] = "serg.sc.sd.ui.SceEventSequVisualizer" ;

KwicEvents["description"] = "Kwic Event Sequence Diagram";

/** Mixing Relationships**/

MixedRelations = InheritanceForest + JPEGRelations ;

Page 28: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 28

SceTool

SMQL Editor

Visualized as Graph

Visualized asSequenceDiagram

Visualized as Graphs

Views:• Inheritance forest• Call-graph• Class-interaction• Module-interaction• Event sequence diagram

Page 29: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 29

RelationSet: InheritanceForestJe

tty

– Ja

va s

tatic

ana

lysi

s

Page 30: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 30

RelationSet: JPEGRelationsM

ozill

a –

C/C

++

dyn

amic

ana

lysi

s

Page 31: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 31

RelationSet: MixedRelationsR

elat

ions

fro

m M

ozill

a &

Jet

ty

Page 32: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 32

RelationSet: KwicCallgraph (Call-graph)K

WIC

– J

ava

dyna

mic

ana

lysi

s

Page 33: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 33

EventSet : KwicEvents (Sequence Diagram)K

WIC

– J

ava

dyna

mic

ana

lysi

s

Page 34: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 34

Summary of Architecture

• Support– Analysis of distributed– Analysis of program features– Static and dynamic analysis– Multiple languages

• Extensible architecture– Analysis - IFilter– Visualization - ISoftView

Page 35: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 35

Software Views

Languages models&

Program schema

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker(Features)

Feature-implementation view Feature interaction view Feature-similarity view Module-interaction view Class-interaction view Remote-interaction view Class protocol view

Modeling(Analyzers)

Visualization(Presentation/Navigation)

Software Programs Static analyzers

Dynamic analyzers(Profilers)

Sourcecode

ExecutionTraces

Data GatheringSubsystem

Page 36: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 36

Using Marked-Traces

Feature-interaction view

Runtime Events

Object-interaction view

Class-interaction view

Call-graphs

Feature-implementation view

Module-interaction view

Feature-similarity view

Method invocation sequence Class protocol view

Using Distributed Profiler

Remote-interaction view

Thread-interaction view

Taxonomy of Software Views

Requires information aboutruntime objects

Page 37: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 37

Software Views

• Feature views– Impact of changing one feature on other features– Feature implementation and dependencies

• Remote-interaction– Highlights interactions between components of

distributed systems

• Thread-interaction– Highlights potential concurrency issues

• Class usage protocol– Documenting usage scenarios

Page 38: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 38

Terminology

• Classification of objects Let F = { set of features }

– Local(Fk): objects created and used by Fk

– Import(Fk):objects used by Fk and created by Fj

– Export (Fk): objects created by Fk and used by Fj

• Relationships:– depends(Fk, Fj) = I(Fk) E(Fj)

– shares(Fk, Fj) = I(Fk) I(Fj)

Page 39: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 39

Feature-Interaction

• Feature Marked-Trace

• Construction Feature-interaction diagram– Identity Import (F) and Export(F) sets

– Compute depends(Fk, Fj) = I(Fk) E(Fj)

– Compute shares(Fk, Fj) = I(Fk) I(Fj)

– Draw a depends- edge if depends(Fk, Fj)

– Draw a shares- edge if shares(Fk, Fj)

Page 40: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 40

Feature-Interaction: Example

FeatureF1

FeatureF2

ObjectO4

ObjectO5

ObjectO1

ObjectO2

ObjectO2

FeatureF3

ObjectO4

ObjectO6

ObjectO1

Uses

Uses

uses

ObjectO3

ClassC1

ClassC2

ObjectO7

ClassC3

FeatureF4

ObjectO4

uses

Interactions within a given Feature are ignored

OK

Oj

The object is created by the corresponding feature

The object is used by the corresponding feature, but created by other feature

Page 41: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 41

Feature-Interaction: Example

• Import/Export sets:

• Depends/Shares:

• Feature-interaction view Import Export

Feature F1 {O1, O2} Feature F2 {O2} {O4} Feature F3 {O1, O4} Feature F4 {O4}

Depends (F2, F1) = E(F1) I(F2) = {O2} Depends (F3, F1) = E(F1) I(F3) = {O1} Depends (F3, F2) = E(F2) I(F3) = {O4} Depends (F4, F2) = E(F2) I(F4) = {O4} Shares (F3, F4) = I(F3) I(F4) = {O4}

Feature-2

Feature-3

Feature-1

Feature-4 C2: O4

C1: O2

C2: O4C2: O4

C1: O1

Page 42: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 42

Jext: Feature-interaction

Page 43: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 43

Feature-similarity view

• The similarity matrix is computed from the call-graphs– Fk = Gk (Vk, Ek)

• Using relations (edges)

|2E1E|

|2E1E|)2F,1F(Similarity

TechReport: Client-side feature similarity ad

d-p

ub

lica

tio

n

sear

ch-b

y-a

uth

or

sear

ch-b

y-t

itle

up

da

te-p

ub

lica

tio

n

add-publication 100 0 2 32

search-by-author 100 47 7

search-by-title 100 3

update-publication 100

Page 44: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 44

Class protocol view

• Terminology– Usage scenario: A description of how a class is used

by other classes– Method invocation sequence: The sequence of

invocations against an instance of a class (object).• Problem

– Determine the class-usage scenarios from its invocation sequences

• Approach– Compute the canonical set and canonical groups of

method sequences.– Approximate each canonical group as a regular

expression (Usage scenario)

Page 45: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 45

Class protocol view

• File class

• Symbols– R = Read– W = Write– O = Open– C = Close

• Method Sequences– ORC, – ORRC, – ORRRC, – ORRRRC, – OWC,– OWWWC, – ORWWWWWC,– OWWWWWWWWW

WC,– ORRRRRRRRRRWC

+Read()+Write()+Open()+Close()

File

Page 46: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 46

Class protocol view

• Canonical set– ORRC, OWWWRC

• Canonical groups– P1 = {ORC, ORRC, ORRRC, ORRRRC,

ORRRRRRRRRRWC }– P2 = {OWWWC, WWWWWWWWWWC,

ORWWWWWC, OWC}.

• Regular expressions– P1 = O (R)1,2,3,4,10 (W)0,1 C O (R)+ (W)? C– P2 = O (R)0,1 (W)1,3,5,10 C O (R)? (W)+ C

Page 47: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 47

P = {p1, p2, …, pn}

Method sequences

Kmin

Kmax

Canonical Set

Canonical Groups

Similarity function

Computing Canonical Set/Groups

Compute the similarity graph G(P) usingthe similarity function

Compute the canonical set that satisfies:

Compute the canonical groups

)IntraEdge(Weightmin

)CutEdge(Weightmax

Extra Edge

Cut EdgeIntra Edge

CanonicalSet

Compute Canonical groups

Formulate the BCS problem as an integerprogramming problem

Use semi-definite programming (SDP) tofind an approximate sooution

Canonical Set

Page 48: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 48

Similarity Measure

• Levenshtein edit-distance function– Distance d(si, sj) = Least number of operations

required to transform string si into string sj.

– Operations: deletion, insertion, substitution

• Adaptation of edit-distance– Associating a variable cost with each operation

• Frequency of a particular method invocation– Cost = exp (-frequency/length)

• Location of the particular method invocation– Cost = 1 if s[k] <> s[k-1]

– Cost = tiny if s[k] = s[k-1]

Page 49: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 49

Similarity Measure

• Examples:– InsertCost(“ORRC”, “R”, k) > InsertCost(“ORRRRRC”, “R”, k)

• ORRC has fewer Rs than ORRRRRC– InsertCost(“ORRWWC”, “R”, 5) > InsertCost(“ORRWWC”, “R”, 3)

• Similarity function

)s,s(d1

1)s,s(Similarity

jiji

Page 50: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 50

Computing Canonical Groups

• Let– P = {p1, p2, .., pn} - Method sequences– P` = {q1, q2, .., pm} - Canonical set

• For each pi, compute its similarity to all elements of P`

• Place pi in the group gk, where – similarity(pi, qk) > similarity(pi, ql), ql P`, k ≠ l

• In case of a tie in max-similarity, place pi in all corresponding groups

Page 51: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 51

Observations/Limitations

• MS can get long and complex• MS may contain subsequences that may not be

traceable to source code• Overlapping groups At this stage, our approach cannot produce a

simplified set class usage scenarios.• Analyzing closely related classes can produce

simpler MS than if each class is analyzed individually. – Example: Socket, SocketInputStreams, and

SocketOutputSteam

Page 52: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 52

Socket class example

A <init> B setImpl C getInputStream D getOutputStream E getLocalAddress F getLocalPort G getPort H getInetAddress J close I postAccept K setSoTimeout L setSoLinger M setTcpNoDelay N getSoTimeout O connect P isClosed

AEHDEFGHNC(EFGH)2(NEFGH) (N(EFGH)2N(EFGH)2N(EFGH)2(NEFGH))2

((EFGHN)(EFGH)2(NEFGH))2

···· (W)3(RW)2R(WR(W)2RWR)4

R java.net.SocketInputStream.read() W java.net.SocketOutputStream.write()

Socket, SocketInputStreams, and SocketOutputSteam

Page 53: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 53

Evaluation

Case study

Page 54: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 54

Case study

• Systems

System Platform Size (KLOC) Mozilla: Web browser (v1.0.1) Win32/C,C++ 4,410 Jext: Text Editor (v3.2) Java 98 TechReport: Publication database Java/EJB 2.3

Page 55: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 55

Tool Architecture for the Analysis of Mozilla

Build & Profiling of Mozilla

Mozilla Source codeMozilla Executable

(mozilla.exe)

Compile with debug informationand instrument the code withfunction-entry hooks (/Gh)

Data collection module(PEnterHandler.dll)

Method-Entry & Exit Events

Build Phase Profiling Phase

JDBC CompliantDatabase

(SQL Server)

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Import(dbimport)

Trace Marker(TraceMarker.dll) On Exit Save data

Marked-traces,Threads,

and Symbols files

Feature graph Feature-implementation view Feature module-interaction view Feature-similarity view Module-interaction view Class-interaction view

SMQL Filters

Visualization(Presentation)

Function-Entry Hooks(penter.lib)

Page 56: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 56

Tool Architecture for the Analysis of Jext

Jext Text Editor

JDBC CompliantDatabase

(SQL Server)

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker Feature-implementation view Feature-interaction view Feature-similarity view Class-interaction Thread-interaction view Class protocol view

SMQL Filters

Visualization(Presentation)

Profiling Jext(Using JvProf)

Runtime events(XML Format)

Import(dbimport)

Page 57: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 57

Tool Architecture for the Analysis of TechReport

JDBC CompliantDatabase

(SQL Server)

SQL(JDBC)

SMQL

Analysis/VisualizationSubsystemRepository Subsystem

Trace Marker

Feature-implementation view Feature-interaction view Feature-similarity view Remote-interaction view Class-interaction view Thread-interaction view

SMQL Filters

Visualization(Presentation)

Profiling EJB Server(Using JvProf)

Profiling TechReport(Using JvProf)

LogicalTimeServer

Runtime events(XML Format)

Runtime events(XML Format)

Import(dbimport)

Import(dbimport)

Logical-TimeDistributed Marked-Trace

Logical-TimeDistributed Marked-Trace

EJB Server

TechReport Client

Page 58: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 58

Case study

• Views

Mozilla Jext TechReport Object-interaction √ √ Class-interaction √ √ √ Module-interaction √ Feature-implementation √ √ √ Feature-interaction √ √ Feature-similarity √ √ √ Remote-Interaction √ Thread-Interaction √ Class-usage protocol √

Page 59: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 59

Example of a maintenance scenario

• Change request– Modify send-page feature of the Mozilla web-browser

• Developer task– Study the change request– Locate source code portions related to the send-page:

• Option-1– Study the source code and documentation

• Option-2– Instrument the application’s code

– Exercise the feature using a profiler

– Analyze execution trace(s)

Page 60: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 60

Type feature name “send-page”

Exercise send-page feature

Mark start of send-page trace

Mark end of send-page trace

Run/profile Mozilla

Load and analyze runtime data

Mozilla: Send-page feature

Page 61: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 61

Mozilla: Send-page feature

Module-interaction (Clustered)

Page 62: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 62

Simplifying views

• The user can exercise enough features to identify common modules/classes

• Common modules are the application’s infrastructure and are used by most features

• Eliminating common modules/classes will result in simpler and manageable views per feature

Page 63: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 63

Send-page feature

Common modules are filtered out

Clustered

Page 64: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 64

Mozilla: “Msgbsutil” cluster

Module-interaction

Page 65: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 65

Mozilla: MIME.DLL class-interaction

Page 66: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 66

Mozilla: Feature implementation view

Page 67: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 67

Mozilla: Similarity Matrix

pri

nt-

pag

e

op

en-u

rl

bo

ok

ma

rk-

ad

d

op

en-l

ink

sta

rtu

p

bo

ok

ma

rk-

op

en

sh

utd

ow

n

sen

d-p

age

save-page 69 57 48 57 57 61 44 61

print-page 100 56 48 59 51 65 44 52

open-url 100 42 77 50 76 43 56

bookmark-add 100 43 40 48 42 36

open-link 100 45 84 46 49

startup 100 50 39 57

bookmark-open 100 47 53

shutdown 100 44

send-page 100

With all modules/classes

pri

nt-

pag

e

op

en-u

rl

bo

ok

mar

k-a

dd

op

en-l

ink

star

tup

bo

ok

mar

k-o

pen

shu

tdo

wn

sen

d-p

age

save-page 67 45 12 53 50 57 17 22

print-page 100 35 15 45 48 49 16 16

open-url 100 10 63 37 77 17 29

bookmark-add 100 10 10 11 9 3

open-link 100 36 81 16 20

startup 100 39 13 20

bookmark-open 100 20 21

shutdown 100 17

send-page 100

Without common modules/classes

Page 68: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 68

Outline

• Motivation

• Software Comprehension

• Architecture of Software Comprehension Environment

• Software Views / Case StudyRelated Work

• Conclusions

Page 69: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 69

Related work

• Analysis of Multi-language applications– Web applications [Hassan02]– Integrated conceptual model [Kullback98]

• Analysis of distributed systems– BEE++ system [Bruegge93]– X-Ray system [Mendoca99]

• Analysis of program features – Concept analysis [Eisenbarth01]– Abstract system dependency graph [Chen00]

Page 70: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 70

Outline

Software ComprehensionArchitecture of Software Comprehension

EnvironmentSoftware views / Case StudyRelated WorkConclusions

Page 71: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 71

Conclusions

• Contributions– Development of a software comprehension

environment that supports• Feature-driven profiling and analysis• Distributed profiling• Multiple languages (C, C++, Java)• SMQL language for retrieving and analyzing data

– Software views• Feature-interaction, implementation and similarity• Remote-interaction• Thread-interaction• Class protocol view• Module-interaction• Class-interaction

Page 72: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 72

Conclusions

• Opportunities for future work– Views

• Visualization• New views

– Data collection• Scripting languages• Profiling of popular frameworks: COM+ and .NET

– Integration with common IDEs

Page 73: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

July 25, 2005

Maher Salah 73

Publications

• Class-usage protocol– Salah M., Denton T., Mancoridis S., Shokoufandeh A., Vokolos F.,

Scenariographer: A Tool for Reverse Engineering Class Usage Scenarios from Method Invocation Sequences, IEEE ICSM, Budapest, Hungary, 2005.

• Analysis of large systems: Mozilla– Salah M., Mancoridis S., Antoniol G. and Di Penta M. , Employing Use-cases and

Dynamic Analysis to Comprehend Mozilla, IEEE ICSM, Budapest, Hungary, 2005.• Feature analysis – Feature-interaction view

– Salah M., Mancoridis S. A Hierarchy of Dynamic Software Views: from object-interactions to feature-interactions, IEEE ICSM, Chicago, Illinois, 2004.

• Architecture and analysis of distributed systems– Salah M., Mancoridis S., Toward an Environment for Comprehending Distributed

Component-Based Systems, IEEE WCRE, Victoria, British Columbia, Canada, November 2003.

• Profiling and dynamic analysis– Souder, T., Mancoridis, S., Salah, M., Form: A Framework for Creating Views of

Program Executions, IEEE ICSM, Florence, Italy, 2001.

Page 74: An Environment for Comprehending the Behavior of Software Systems Maher Salah Department of Computer Science Drexel University July 25, 2005

Questions

http://serg.cs.drexel.edu/~msalah/sce/