
LIFT: A Legacy InFormation retrieval Tool

DESCRIPTION

Nowadays software systems are essential to the environment of most organizations, and their maintenance is a key point in supporting business dynamics. Thus, reverse engineering legacy systems for knowledge reuse has become a major concern in the software industry. This article, based on a survey of reverse engineering tools, discusses a set of functional and non-functional requirements for an effective reverse engineering tool, and observes that current tools only partly support these requirements. In addition, we define new requirements, based on our group’s experience and industry feedback, and present the architecture and implementation of LIFT: a Legacy InFormation retrieval Tool, developed based on these demands. Furthermore, we discuss the compliance of LIFT with the defined requirements. Finally, we applied LIFT in a reverse engineering project of a 210 KLOC NATURAL/ADABAS system of a financial institution and analyzed its effectiveness and scalability, comparing data with previous similar projects performed by the same institution.

Page 1: LIFT: A Legacy InFormation retrieval Tool

LIFT: A Legacy InFormation retrieval Tool

Student: Kellyton dos Santos Brito

Advisor: Silvio Romero de Lemos Meira

Informatics Center - Federal University of Pernambuco

C.E.S.A.R. - Recife Center for Advanced Studies and Systems

{ksb, srlm}@cin.ufpe.br

Page 2

Outline

Introduction

Key Developments in the field of Reengineering

Reverse Engineering Tools

LIFT Tool

Requirements

Architecture and Implementation

Usage Scenario

Case Study

Conclusions

Page 3

Introduction

Page 4

Software is a key point of business

Business dynamics need software dynamics

Objectives

Low Costs

High Productivity

High Quality

Boehm’s analysis (Boehm 1999)

working-faster savings: 8%

working-smarter savings: 17%

work-avoidance savings: 47%

Page 5

Software Reuse

Initial ideas from McIlroy (1968)

Software reuse is the process of creating software systems from existing software rather than building them from scratch (Krueger 1992)

Reusable Assets

Products, Processes, Knowledge …

Reuse Aspects

Processes, methods, environments, tools and non-technical aspects

Based on these aspects…

Page 6

RiSE Project

Understand fundamental steps to introduce reuse in companies

Technical and non-technical aspects

Page 7

One (of many) points is...

Legacy Systems

Well tested, stable, few bugs and defects

A lot of embedded knowledge

Problems

Obsolete technologies, languages, tools and processes

Unhelpful documentation

Degradation due to maintenance operations

Few specialized people

Directions

Reverse engineer applications

Knowledge Reuse

Knowledge reuse from legacy systems

Page 8

Proposal

This work defines the requirements for, designs and implements a tool for reverse engineering, aiming to help system engineers retrieve knowledge from legacy systems, as well as to increase their productivity in reverse engineering and system understanding tasks

Page 9

Context

Page 10

Key Developments in the field of Software Reengineering

Page 11

The Goal

To Understand

Concepts

Evolution

Approaches

Strong and Weak points

New trends

Page 12

Reengineering Approaches (Garcia 2005)

Page 13

Unresolved Issues

Recover the entire system: interface, design and database

Trace entire requirements from interface to database

Deal with large systems

Discover HOW programs work, not WHAT programs do

New Trends

Aspect Orientation

Data Mining

Lack of effective tools to support Reverse Engineering

Page 14

Reverse Engineering Tools: The State-of-the-art and Practice

Page 15

Motivation

Automate tasks

Some tools exist

Are they useful in practice?

Not in 2000: Müller et al., Reverse Engineering: A Roadmap, Proceedings of ICSE, Future of Software Engineering Track

Not now: Canfora & Penta, New Frontiers of Reverse Engineering, Proceedings of Future of Software Engineering (FOSE’07)

“Despite the maturity of reverse engineering research, and the fact that many pieces of reverse engineering work seem to timely solve crucial problems and to answer relevant industry needs, its adoption in industry is still limited” (Canfora 2007)

Page 16

Reverse Engineering Tools

[Figure: timeline of reverse engineering tools, 1988 to 2007]

Santanu Paul: SCRUPLE Tool

Müller et al.: Rigi Project, Software Understanding

Favre: GSEE, a Generic Software Exploration Environment

Storey et al.: SHriMP Views Exploration Tool

Singer et al.: TKSEE Software Exploration Tool

Zayour et al.: DynaSee Reverse Engineering Tool

Schäfer et al.: SEXTANT Software Exploration Tool

Lanza: CodeCrawler

Page 17

Reverse Engineering Tools

Almost all of them show a call graph

Each one implements its own set of requirements

Some with exploration capabilities

Some with visualization capabilities

Some with cognitive capabilities

All of them are highly user dependent: lack of automatic or semi-automatic code analysis

Lack of recovery and traceability of the entire system, from interface to database

Problems dealing with big systems

Page 18

Problems dealing with big systems

Rigi Call Graph (928 nodes, 4203 dependencies)

Page 19

LIFT: A Legacy InFormation Retrieval Tool

Page 20

LIFT Requirements

Based on Surveys, RiSE and Industry Experiences.

LIFT Functional Requirements:

(FR1) Visualization of entities and relations

(FR2) Abstraction mechanisms

(FR3) High user interactivity

(FR4) Search capabilities

(FR5) User activities trace capabilities

(FR6) Metrics visualization support

(FR7) Recovery of the entire system (interface, design and database)

(FR8) Trace of requirements from interface to database access

(FR9) Possibility of semi-automatic suggestions

Legend: Existing Requirements / New Requirements

Page 21

LIFT Non Functional Requirements

(NFR1) Cross Artifacts support

(NFR2) Extensibility

(NFR3) Integration with other tools

(NFR4) Scalability

(NFR5) Maintainability and Reusability

Legend: Existing Requirements / New Requirements

Page 22

LIFT Architecture

[Figure: LIFT architecture diagram. Labels: Legacy Code; Persistence Layer; Understanding Environment; Parser component (Parser, Pre Processing); Analyzer component (Paths Calculation, Cluster Analysis, Patterns Detection); Visualizer component (Normal Visualization, Cluster Visualization, Paths Visualization, Patterns Visualization)]

Page 23

Implementation: Parser Component

Parser Module

Parses NATURAL/ADABAS source code

First version developed by Pitang team

Uses C# technology

Integrated as a component

Pre-Processing Module

Works with parser output

Stores useful information in the database (ANSI SQL)

Performs the system slice

Deduces the database layer
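
The pre-processing step described above can be illustrated with a minimal sketch. It assumes a hypothetical two-table schema (`module`, `call_edge`), invented module names, and SQLite through Python's `sqlite3` module purely for convenience; the slides state only that the data is stored through SQL. The rule used here to deduce the database layer (a module belongs to it if it references a database file) is also an assumption, not LIFT's documented behavior.

```python
import sqlite3

def store_call_facts(conn, modules, calls):
    """Persist modules and caller/callee pairs in a hypothetical schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS module ("
        "name VARCHAR(64) PRIMARY KEY, kind VARCHAR(16) NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_edge ("
        "caller VARCHAR(64) NOT NULL, callee VARCHAR(64) NOT NULL, "
        "PRIMARY KEY (caller, callee))"
    )
    conn.executemany("INSERT INTO module (name, kind) VALUES (?, ?)", modules)
    conn.executemany("INSERT INTO call_edge (caller, callee) VALUES (?, ?)", calls)
    conn.commit()

def database_layer(conn):
    """Assumed rule: a module is in the database layer if it references a 'file' entity."""
    rows = conn.execute(
        "SELECT DISTINCT c.caller FROM call_edge c "
        "JOIN module m ON m.name = c.callee "
        "WHERE m.kind = 'file' ORDER BY c.caller"
    )
    return [row[0] for row in rows]

# Invented example data standing in for parser output.
conn = sqlite3.connect(":memory:")
store_call_facts(
    conn,
    modules=[("MENU01", "screen"), ("PGM010", "program"),
             ("PGM020", "program"), ("CUSTOMER", "file")],
    calls=[("MENU01", "PGM010"), ("PGM010", "PGM020"), ("PGM020", "CUSTOMER")],
)
db_layer_modules = database_layer(conn)
```

Keeping the facts in a database rather than in memory is in line with the scalability approach the deck lists among its contributions.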

Page 24

Implementation: Analyzer Component

Call Graph Generation

Paths Calculations

Full paths

Minimal paths, using Dijkstra's shortest path algorithm (running time O(n log n))

Cluster analysis

Hierarchical Clustering

Mark Newman's “edge betweenness clustering algorithm” (running time O(k·m·n))

Patterns detection (second iteration)

Text pattern detection

Graph pattern detection

Clone detection
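
The minimal-paths calculation named above can be sketched as follows. This is an illustrative implementation of Dijkstra's algorithm over a directed call graph, assuming unit weight per call edge and invented module names; it is not LIFT's code.

```python
import heapq

def minimal_path(graph, source, target):
    """Dijkstra's algorithm with unit edge weights on a directed call graph.

    graph maps a module name to the list of modules it calls.
    Returns the modules on one shortest path, or None if target is unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter distance was already found
        for callee in graph.get(node, []):
            nd = d + 1
            if nd < dist.get(callee, float("inf")):
                dist[callee] = nd
                prev[callee] = node
                heapq.heappush(heap, (nd, callee))
    if target not in dist:
        return None
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical call graph: a screen reaching a database file by two routes.
call_graph = {
    "MENU01": ["PGM010", "PGM030"],
    "PGM010": ["PGM020"],
    "PGM020": ["CUSTOMER"],
    "PGM030": ["CUSTOMER"],
}
path = minimal_path(call_graph, "MENU01", "CUSTOMER")
```

In this example the route through `PGM030` is returned because it needs two calls rather than three, which is the kind of interface-to-database trace the requirements ask for.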

Page 25

Implementation: Visualizer Component

Based on JUNG: Java Universal Network/Graph Framework

Visualizations of call graph and Analyzer modules

Normal visualization

Cluster visualization

Paths visualization

Patterns visualization

Uses the Polymetric Views concept
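
In a polymetric view, software metrics are mapped onto visual properties of each node, such as width, height and color. The sketch below shows one such mapping under stated assumptions: the three metrics, the pixel range and the module names are all hypothetical, and the slides do not say which metrics LIFT displays.

```python
def polymetric_nodes(metrics, min_size=10, max_size=60):
    """Map three metrics per module to a node's width, height and gray shade.

    metrics: {module_name: (width_metric, height_metric, color_metric)}
    Larger color metrics give darker nodes (gray 0 is black, 255 is white).
    """
    def scale(value, lo, hi, out_lo, out_hi):
        # Linear interpolation; a constant metric maps to the lower bound.
        if hi == lo:
            return out_lo
        return out_lo + (value - lo) * (out_hi - out_lo) / (hi - lo)

    columns = list(zip(*metrics.values()))
    bounds = [(min(col), max(col)) for col in columns]
    nodes = {}
    for name, (w, h, c) in metrics.items():
        nodes[name] = {
            "width": round(scale(w, *bounds[0], min_size, max_size)),
            "height": round(scale(h, *bounds[1], min_size, max_size)),
            "gray": round(scale(c, *bounds[2], 255, 0)),
        }
    return nodes

# Hypothetical metrics: (number of callees, lines of code, number of callers).
nodes = polymetric_nodes({
    "PGM010": (2, 100, 1),
    "PGM020": (10, 500, 5),
    "PGM030": (6, 300, 3),
})
```

With such a mapping, unusually large or heavily used modules stand out in the graph without the user reading any numbers.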

Page 26

Understanding Environment

Graphical interface

Integrates the other components

Areas for paths, graphs and details

Shows source code

Works with views concept

Isolates subgraphs

Allows comments

View comments

Module comments

Source code comments

Page 27

LIFT Usage: Initial Steps

Parser and Organizer

Called by simple menu commands

Page 28

LIFT Usage: Initial Graph

Page 29

LIFT Usage: Isolating Graphs

Page 30

LIFT Usage: Detecting Clusters
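
Cluster detection by edge betweenness can be sketched in plain Python as below. This is an independent illustration of the general technique (repeatedly remove the edge that lies on the most shortest paths, then report the connected components), not LIFT's implementation. It assumes an undirected graph without self-loops, a caller-chosen number of edges to remove, and invented module names.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph (Brandes-style).

    Each undirected edge is counted from both endpoints, so scores are
    twice the usual value; the ranking of edges is unaffected.
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma, dist, preds, order = {s: 1}, {s: 0}, {}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds.get(w, []):
                share = sigma[v] / sigma[w] * (1 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score

def components(adj):
    """Connected components of an undirected graph given as {node: set}."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        result.append(comp)
    return result

def cluster(adj, edges_to_remove):
    """Remove the highest-betweenness edge edges_to_remove times."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    for _ in range(edges_to_remove):
        score = edge_betweenness(adj)
        if not score:
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical graph: two tightly coupled groups joined by one call (C-D).
graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
clusters = cluster(graph, edges_to_remove=1)
```

Because betweenness is recomputed after every removal, the cost grows with the number of removed edges, the number of edges and the number of vertices, which matches the shape of the O(k·m·n) bound quoted for the Analyzer component.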

Page 31

Page 32

Case Study

Page 33

The Context

Pitang Software Factory

Infrastructure

Experienced staff

Real demands for reverse engineering

NATURAL/ADABAS systems of a financial institution

Previous experience with reverse engineering: Almost 2 million LOC

Company time and budget constraints

Customer dependent

Page 34

The Definition

Goal:

To analyze the reverse engineering tool for the purpose of evaluating it with respect to its efficiency, from the point of view of researchers and software engineers, in the context of software reverse engineering projects.

Questions

Does the tool provide effort reduction in reverse engineering projects?

Is the tool scalable enough to be used in large projects?

Do the subjects have difficulty using the tool?

Page 35

The Planning

Method of comparison

Comparison with two sibling projects

Same technologies: NATURAL/ADABAS

Same domain: Financial

Same customer

Same understanding process

Same number of participants

Similar engineer experience: more than 10 years

Different tools

Performed at nearly the same time (February-June 2007)

Page 36

The Planning

Null Hypothesis

H0: µ(productivity by LOC with previous tools) > µ(productivity by LOC using LIFT)

H0: µ(productivity by program modules with previous tools) > µ(productivity by program modules using LIFT)

H0: µ(productivity by recovered requirement with previous tools) > µ(productivity by recovered requirement using LIFT)

The Projects

LIFT Project: 210 KLOC NATURAL/ADABAS

Sister projects: 65 KLOC and 131 KLOC NATURAL/ADABAS systems of financial domain

Page 37

The Operation

Training

28 hours: 3 meetings with all staff, 3 lectures for the user, and 2 days using the tool with data from previous projects.

Subjects

One system engineer in each project

Costs

Planning: C.E.S.A.R.

Operation: Pitang

Period

February-June, 2007

Page 38

The Quantitative Analysis

Lines/Hour Productivity

66% higher than Project 1 and 41% higher than Project 2

Null Hypothesis:

H0: µ(productivity by LOC with previous tools) > µ(productivity by LOC using LIFT)

Page 39

The Quantitative Analysis

Modules/Hour Productivity

12% higher than Project 1 and 127% higher than Project 2

Null Hypothesis:

H0: µ(productivity by program modules with previous tools) > µ(productivity by program modules using LIFT)

Page 40

The Quantitative Analysis

High Level Requirements/Hour Productivity

The same as Project 1 and 167% higher than Project 2

Null Hypothesis:

H0: µ(productivity by recovered requirement with previous tools) > µ(productivity by recovered requirement using LIFT)

Page 41

The Quantitative Analysis: Scalability

LIFT project and “Project 2” evaluation

Pentium IV / 512MB Database Server x Dual Core 2 / 2GB Client

[Figure: chart of the increase rate (axis from 0 to 2) for Size (KLOC), Parse Time (s), Pre-Processing Time (s), Minimal Paths Time (s), and Full Analysis and Graph Creation (s)]

Page 42

The Qualitative Analysis

Based on a questionnaire

Tool effectiveness

Effort reduction of about 20%

Easy to locate system features and to generate system documentation

Weak Points

Delay to load the application (Full Analysis and Graph Creation)

Efficient Training Program

Page 43

Case Study Summary

Questions

Does the tool provide effort reduction in reverse engineering projects?

Yes

Is the tool scalable enough to be used in large projects?

Yes

Do the subjects have difficulty using the tool?

No

Page 44

Case Study - Lessons Learned

Training

Questionnaire

Management Commitment

Subjects Motivation

Parser problems

Page 45

Conclusions

Page 46

Contributions

The survey about Reverse Engineering Tools

The Requirements of an Effective Reverse Engineering tool

The Tool for Reverse Engineering

Cluster approach

Minimal-paths approach

Scalability using database approach

The Case Study

Page 47

Contributions

Brito, K. S.; Garcia, V. C.; Lucrédio, D.; Almeida, E. S.; Meira, S. R. L. LIFT: Reusing Knowledge from Legacy Systems, In the Brazilian Symposium on Software Components, Architectures and Reuse (SBCARS), Campinas, São Paulo, Brazil. August, 2007.

Brito, K. S.; Garcia, V. C.; Almeida, E. S.; Meira, S. R. L. A Tool for Knowledge Extraction from Source Code, 21st Brazilian Symposium on Software Engineering (SBES), Tools Session, João Pessoa, Paraíba, Brazil. October, 2007 (to appear).

Invited Paper

Journal of Universal Computer Science (JUCS), Special Issue: Software Components, Architectures and Reuse. April, 2008

Page 48

Other Contributions during the Course

Conferences

Brito, K. S.; Alvaro, A.; Lucrédio, D.; Almeida, E. S.; Meira, S. R. L., Software Reuse: A Brief Overview of the Brazilian Industry's Case, In the 5th ACM-IEEE International Symposium on Empirical Software Engineering (ISESE), Short Paper, Rio de Janeiro, Brazil, 2006

Invited Talk

Software Reuse: Brazilian Industry Case. In II Workshop para Introdução do Reuso em Empresas de Desenvolvimento de Software (WIRE). Porto de Galinhas, Brazil. June, 2007.

Journal

Daniel Lucredio, Kellyton dos Santos Brito, Alexandre Alvaro, Vinicius Cardoso Garcia, Eduardo Santana de Almeida, Renata Pontin de Mattos Fortes, and Silvio Romero de Lemos Meira. Software Reuse: The Brazilian Industry Scenario (to appear). Journal of Systems and Software, 2008

Page 49

Future Work

Development of Patterns Detection and Visualization modules

Plug-ins for other input languages

Automatic documents generation

Metrics Extraction and Reports Generation

More Case Studies

To be a part of a complete framework for reengineering: Processes, Methods and Tools

Page 50

References

Boehm, B. (1999). "Managing software productivity and reuse." IEEE Computer, Vol. 32, No. 9, p. 111-113.

McIlroy, M. D. (1969). "Mass Produced Software Components." NATO Software Engineering Conference Report, Garmisch, Germany, p. 79-85.

Krueger, C. W. (1992). "Software Reuse." ACM Computing Surveys, Vol. 24, No. 2, p. 131-183.

Garcia, V. C. (2005). "Phoenix: An Aspect Oriented Approach for Software Reengineering" (in Portuguese). M.Sc. Thesis, Federal University of São Carlos, São Carlos, Brazil, March 2005.

Müller, H. A., Jahnke, J. H., Smith, D. B., Storey, M.-A., Tilley, S. R. and Wong, K. (2000). "Reverse Engineering: A Roadmap." Proceedings of the 22nd International Conference on Software Engineering (ICSE'2000), Future of Software Engineering Track, Limerick, Ireland, p. 47-60.

Canfora, G. and Penta, M. D. (2007). "New Frontiers of Reverse Engineering." Future of Software Engineering (FOSE), IEEE Computer Society, p. 326-341.
