Master Thesis Software Engineering Thesis no: MSE-2010:07 April 2010

School of Computing Blekinge Institute of Technology Box 520 SE 372 25 Ronneby Sweden

A Systematic Mapping Study on Software Reuse

Bhargava Mithra Konda and Kranthi Kiran Mandava

This thesis is submitted to the School of Engineering at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:
Author(s):
Bhargava Mithra Konda
Address: Rum 10, Folkparksvägen 16, 372 40, Ronneby, Sweden
E-mail: [email protected]
Kranthi Kiran Mandava
Address: Rum 10, Folkparksvägen 16, 372 40, Ronneby, Sweden
E-mail: [email protected]

University advisor(s): Dr. Mikael Svahnberg Department of System and Software Engineering, Blekinge Institute of Technology

School of Engineering Blekinge Institute of Technology Box 520 SE 372 25 Ronneby Sweden

Internet: www.bth.se/tek
Phone: +46 457 38 50 00
Fax: +46 457 271 25

ABSTRACT

Context: Software reuse is considered key to successful software development because of its potential to reduce time to market, increase quality and reduce costs. This potential has led software organizations to envision the use of reusable software assets, which can also help in solving recurring problems. Until recently, software reuse was largely confined to the reuse of source code, in the form of code scavenging. Nowadays, software organizations are extending the concept of software reuse to other life cycle objects, having realized that reuse of source code alone does not save money. Academia has put forward a number of assets as reusable and presented methods or approaches for reusing them. For software reuse to succeed, organizations should also assess the value of reuse and keep track of their reuse programs. Another area vital to software reuse is maintenance: the maintenance of reusable software has a direct impact on the cost of the software. In this regard, academia has presented a number of techniques, methods, metrics and models for assessing the value of reuse and for maintaining reusable software.

Objectives: In this thesis, we investigate the reusable assets and the methods/approaches put forward by academia for reusing those assets. We also perform a systematic mapping study to investigate which techniques, methods, models and metrics have been proposed for assessing the value of reuse and for maintaining reused software, and we investigate their validation status.

Methods: Databases such as IEEE Xplore, ACM Digital Library, Inspec, Springer and Google Scholar were used to search for studies relevant to our systematic mapping study. We applied basic inclusion criteria along with detailed inclusion/exclusion criteria to select the appropriate articles.

Results: Through our systematic mapping study, we summarize a list of 14 reusable assets along with the approaches/methods for reusing them. A taxonomy for assessing the value of reuse and a taxonomy for maintaining reusable software are presented. We also present the methods/metrics/models/techniques for measuring reuse to assess its value and for maintaining reusable software, along with their validation status and areas in focus.

Conclusion: We conclude that there is a need to define a standard set of reusable assets that is commonly accepted by researchers in the field of software reuse. Most of the metrics/models/methods/approaches presented for assessing the value of reuse and for maintaining reused software are academically validated. Effort should now be put into validating them industrially using real data.

Keywords: Software, Reuse, Reusable Assets, Value, Measuring, Maintenance, systematic mapping study, Metrics, Models, Methods, Approaches.

ACKNOWLEDGEMENT

First and foremost, we owe a lot to Lord Shiridi Sai Baba and Durga Devi Talli for their blessings. We are greatly indebted to Dr. Mikael Svahnberg for his advice, guidance and support. We thank him for setting aside some of his valuable time for meetings despite his busy schedule. These meetings guided us throughout the thesis period and helped build our confidence, without which we would not have made it this far.

We thank our parents for their efforts in sending us to attain a quality education, and for their affection and moral support. We would like to thank Mrs. Eva Norling for her advice in designing the search terms. We are thankful to the librarians, whose support in providing the literature helped us a lot throughout our thesis. Last but not least, we are grateful to our friends, especially Vinod, for their moral support whenever it was needed and for the lasting memories shared during our stay in Sweden.

Table of Contents

1 INTRODUCTION
  1.1 SOFTWARE REUSE
  1.2 AIMS AND OBJECTIVES
  1.3 RESEARCH QUESTIONS
  1.4 THESIS STRUCTURE
  1.5 BACKGROUND AND RELATED WORK
  1.6 RESEARCH METHODOLOGY
    1.6.1 Search Strategy
    1.6.2 Search Process Execution
      1.6.2.1 Search Term identification and Search Questions framing process
      1.6.2.2 Basic Inclusion Criteria
      1.6.2.3 Detailed Inclusion/Exclusion Criteria
      1.6.2.4 Snowball sampling
      1.6.2.5 The Analysis
    1.6.3 Validity Threats

2 REUSABLE ASSETS
  2.1 METHODOLOGY EXECUTION
  2.2 RESULTS
    2.2.1 Reusable Assets
    2.2.2 Bubble Graph
    2.2.3 Results
  2.3 ANALYSIS
    2.3.1 State of Validation
      2.3.1.1 Overall Validation Status
      2.3.1.2 Validation Status for Each Asset
    2.3.2 Assets in Focus

3 VALUE OF REUSE
  3.1 METHODOLOGY EXECUTION
  3.2 RESULTS
    3.2.1 Taxonomy
    3.2.2 Bubble Graph
    3.2.3 Results
  3.3 ANALYSIS
    3.3.1 State of Validation
      3.3.1.1 Overall Validation Status
      3.3.1.2 Validation Status for Each Category
    3.3.2 Areas in Focus
    3.3.3 Representation Methods

4 MAINTENANCE OF REUSABLE SOFTWARE
  4.1 METHODOLOGY EXECUTION
  4.2 RESULTS
    4.2.1 Maintenance Taxonomy
    4.2.2 Bubble Graph
    4.2.3 Results
  4.3 ANALYSIS
    4.3.1 State of Validation
      4.3.1.1 Overall Validation Status
      4.3.1.2 Validation Status for Each Category
    4.3.2 Areas in Focus
    4.3.3 Validity Threats

5 CONCLUSION

6 REFERENCES

7 APPENDIX

List of Tables

Table 1: Population, Intervention, Context, Outcome for each Research Question
Table 2: Basic Inclusion Criteria
Table 3: Detailed Inclusion/Exclusion Criteria
Table 4: Search Terms and Search Questions for Reusable Assets
Table 5: Hits after each phase for RQ1
Table 6: Reusable Assets Table
Table 7: Search Terms and Search Questions for Value of Reuse
Table 8: Hits after each phase for RQ2
Table 9: Contribution Table for Value of reuse
Table 10: Search Terms and Search Questions for Maintenance
Table 11: Hits after each criterion for RQ3
Table 12: Contribution Table for Maintenance
Table 13: Abbreviations used in Tables and Figures
Table 14: Result Table for Algorithms Reuse
Table 15: Result Table for Architecture Reuse
Table 16: Result Table for Data Reuse
Table 17: Result Table for Design Reuse
Table 18: Result Table for Documentation Reuse
Table 19: Result Table for Estimates Reuse
Table 20: Result Table for Human Interface Reuse
Table 21: Result Table for Knowledge Reuse
Table 22: Result Table for Model Reuse
Table 23: Result Table for Modules Reuse
Table 24: Result Table for Plans Reuse
Table 25: Result Table for Requirements Reuse
Table 26: Result Table for Service Contract Reuse
Table 27: Result Table for Test Case/Test Plan Reuse
Table 28: Result Table for Cost Benefit Analysis
Table 29: Result Table for Maturity Assessment Models
Table 30: Reuse Table for Amount of Reuse Metrics
Table 31: Reuse Table for Failure Modes Model
Table 32: Result Table for Reusability Assessment
Table 33: Result Table for Reuse Library Metrics
Table 34: Result Table for Strategies
Table 35: Result Table for Change Impact Analysis
Table 36: Result Table for Software Configuration Management
Table 37: Result Table for Module Dependencies
Table 38: Result Table for Legal Issues
Table 39: Result Table for Aging Symptoms
Table 40: Reference List for Chapter 2, 3 and 4

List of Figures

Figure 1: Search Strategy
Figure 2: Detailed Analysis through Bubble Graph
Figure 3: Validation Status (Reusable Assets)
Figure 4: Assets in Focus (Reusable Assets)
Figure 5: Reuse Metrics and Models Taxonomy
Figure 6: Systematic Mapping for Value of Reuse
Figure 7: Validation status (Value of Reuse)
Figure 8: Percentage of Academic, Industrial and Survey Validations
Figure 9: Areas in Focus (Value of Reuse)
Figure 10: Taxonomy of Maintenance
Figure 11: Systematic Mapping for Maintenance
Figure 12: Validation Status (Maintenance of Reuse)
Figure 13: Percentage of Academic, Industrial and Survey Validations
Figure 14: Areas in Focus (Maintenance of Reuse)

1 INTRODUCTION

This section introduces our thesis and presents our aim and objectives, research questions, thesis structure and research methodology.

1.1 Software Reuse

Software reuse is the process of creating a new system from existing systems rather than building it from scratch. In other words, it is the reuse of existing software artifacts or software assets to build a new system. The concept of software reuse was introduced by McIlroy in 1968 [177] to overcome the software crisis, i.e., the problem of building large and reliable software systems in a cost-effective and controlled way. Initially, software reuse was limited to source code. As customer needs and market demand for sophisticated software increased, software companies started thinking beyond source code, which led to the reuse of other life cycle assets. By reusing other life cycle assets such as design, algorithms and knowledge, new software can be brought to market faster. Software reuse thus reduces not only development time but also cost [34] [177].

Although research in the field of software reuse is ongoing, the software industry is still in the initial stages of adopting it. Many sources state that research on software reuse started in 1968 [177] [96] [121]; since then, many questions have arisen and remain unanswered. Some of them are:

1) Is there a standard set of reusable assets defined?

2) Is there a standard taxonomy defined for assessing the value of reuse?

3) Is there a standard taxonomy defined for maintaining reusable software?

Through our research, we seek answers to the above questions. Source code is the most commonly reused asset, and many therefore had the misconception that software reuse is the reuse of source code alone [36]. Frakes [176] mentions that little is known about the reuse of assets other than code. However, reuse of source code alone cannot save money: some studies have shown that even when 50% of the code was reused, the cost savings on the software product were much smaller. This motivated us to look for reusable assets other than code [92]. To make software reuse successful and gain more benefit from it, industry is looking to extend the concept of reuse to reusable assets other than code. Researchers such as R. J. Leach [23], Swanson [68], Bollinger [81] and Jones [21] presented some reusable assets along with code, but the lists of reusable assets they presented do not exactly match each other. Some assets are mentioned by all of them, but there is no common understanding among researchers on a standard set of reusable assets.

Assessing the value of reuse is a major concern in the software industry. To assess its value, reuse should be measured using metrics and models. Measuring reuse helps an organization track its progress in software reuse, determine how much reuse has taken place and assess the cost benefits of software reuse. To this end, W. B. Frakes in 1996 reviewed some of the important existing models, metrics and methods. However, there are no widely accepted models, and organizations are still unsure whether using the proposed models will bring success [89]. Researchers have only just started to realize the importance of extending the concept of reuse to life cycle objects beyond code, and very few authors have worked on assessing the value of reuse in them; most authors have dealt with assessing the value of reuse in code. The metrics and models designed for assessing the value of reuse in other life cycle objects are inspired by those designed for code. For example, for amount of reuse metrics, W. B. Frakes [22] derived a formula for assessing the amount of reuse across the whole life cycle, which is inspired by the formula for code.
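As a rough illustration of what an amount of reuse metric computes, the sketch below expresses reuse as the ratio of reused items to total items, first for code and then for each life cycle asset. This is only an assumed, simplified ratio and not the formula derived in [22]; the function names, asset names and sample counts are hypothetical.

```python
from typing import Dict, Mapping, Tuple


def reuse_ratio(reused: float, total: float) -> float:
    """Fraction of items that were reused (assumed ratio-style metric)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= reused <= total:
        raise ValueError("reused must lie between 0 and total")
    return reused / total


def lifecycle_reuse_ratios(
    assets: Mapping[str, Tuple[float, float]]
) -> Dict[str, float]:
    """Apply the same ratio to every life cycle asset.

    `assets` maps a hypothetical asset name to (reused, total) counts,
    each measured in that asset's own unit (lines, pages, test cases).
    """
    return {name: reuse_ratio(r, t) for name, (r, t) in assets.items()}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sample = {
        "code (lines)": (500, 1000),
        "design (diagrams)": (2, 8),
        "test cases": (30, 40),
    }
    for name, ratio in lifecycle_reuse_ratios(sample).items():
        print(f"{name}: {ratio:.0%}")
```

Keeping one ratio per asset, rather than pooling the counts, avoids adding quantities measured in different units; how the per-asset values should be combined into a single life cycle figure is left open here.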

The other issue concerning software reuse is maintenance. Maintenance of reusable software is the most expensive part of the software life cycle. Software maintenance involves modifying a software component after it has been handed over to the client. Changes are made to the software to correct errors, improve performance or other qualities, upgrade functionality, or adapt it to a changed environment. There have been situations where more than 50% of the budget was spent on maintaining the software. Maintenance is treated as a reuse-oriented task: life cycle assets such as requirements, design and documentation from earlier versions of the system must be revisited when maintaining the software, which in turn makes it easier for maintenance programmers to understand the problem [36].

Most research works have proposed models or methods for assessing the value of reuse and for maintaining reusable software. However, there are no widely accepted methods or models, and organizations are still unsure whether using those models will bring success [89]. Literature reporting industry experiences or success stories in using a particular model is scarce. From this, we assume that industry is not fully confident of succeeding by reusing software [113].

The aim of our research is to investigate the status of the research performed since 1968 on assets, other than source code, that are mentioned in the literature as reusable. Through our systematic mapping study, we also investigate the trends and developments in software reuse research, particularly in assessing the value of reuse and maintaining reused assets. Finally, we aim to study whether industry has made any effort to validate the proposed models up to the first half of 2009.

Our research report will help industry know exactly which assets can be reused along with code, as we present a set of reusable assets found through our review, and know which metrics and models exist for measuring reuse to assess its value and which methods exist for maintaining reusable software. Our report will also show which models have been industrially validated so far, so that practitioners can feel confident in using them. It also encourages industry and researchers to further validate the non-validated models, either academically or industrially, so that they become useful to industry.

1.2 Aims and Objectives

The aim of this research is to conduct a systematic mapping study to find the reusable assets other than code that are mentioned in the literature, the models or metrics used for assessing the value of reuse, and the methods/models/metrics/approaches/tools for maintaining reusable software, and to report the results and analysis of our review.

This aim will be fulfilled by the following objectives:

To identify the assets other than code that are mentioned in the literature as reusable.

To identify the methods or approaches for reusing these assets.

To identify the metrics and models used for assessing the value of reuse.

To identify the methods for maintaining reusable software.

1.3 Research Questions

RQ1. Which assets other than code have been mentioned in the literature as reusable?

- RQ1.1. What are the methods or approaches for reusing these assets?

- RQ1.2. What is their validation status?

- RQ1.3. What are the assets in focus?

RQ2. What are the metrics and models that have been proposed for measuring reuse to assess its value?

- RQ2.1. What is the validation status?

- RQ2.2. What are the areas in focus?

RQ3. What approaches have been proposed for maintaining the reusable software?

- RQ3.1. What is their validation status?

- RQ3.2. What are the areas in focus?

1.4 Thesis structure

Chapter 1: In this chapter, we introduce our report and present our aims, objectives and research questions. In Section 1.5, we discuss the background and related work. Section 1.6 deals with the research methodology, in which we discuss our search strategy in detail and give a clear picture of the design and execution of our systematic mapping study. In Section 1.6.3, we present the validity threats.

Chapter 2: This chapter answers research question RQ1. In Section 2.1, we present the methodology execution, including the search terms and search combinations we used to find the relevant studies, the names of the databases used and the number of studies obtained for each criterion. In Section 2.2, we present the results of our systematic review, in which we list the reusable assets along with their definitions. In Section 2.3, we present the analysis of our review for RQ1 and RQ1.1.

Chapter 3: This chapter deals with research question RQ2. In Section 3.1, we present the methodology execution, including the search terms and search combinations we used to find the relevant studies, the names of the databases used and the number of studies obtained for each criterion. In Section 3.2, we present the results of the systematic mapping study, in which we present the taxonomy of reuse metrics and models for assessing the value of reuse. In Section 3.3, we present the analysis of our review for RQ2.

Chapter 4: This chapter deals with research question RQ3. In Section 4.1, we present the methodology execution, including the search terms and search combinations we used to find the relevant studies, the names of the databases used and the number of studies obtained for each criterion. In Section 4.2, we present the results of our review, in which we present the taxonomy for the maintenance of reusable software. In Section 4.3, we present the analysis of our review for RQ3.

Chapter 5: In this chapter, we present the conclusion of our report along with recommendations for future research.

Chapter 6: In this chapter, we present the references used in our research.

Chapter 7: This chapter is an appendix that defines the abbreviations used in the figures and tables of the different chapters. Sections A2, A3 and A4 contain the results of our systematic mapping study (answering RQ1, RQ1.1, RQ2 and RQ3) in the form of tables.

1.5 Background and Related Work

In the early days of software development, software was built from scratch. In today's rapidly changing world, user needs and expectations are never constant; they change from time to time, and users expect new versions as quickly as possible. Organizations have therefore focused on finding new ways to deliver products that meet user expectations to their customers in less time and at reduced cost. Product success is affected by many parameters, such as time to market, product cost, delivery of optimal quality, level of effort and engineering overhead [181, 182]. The need for faster software development and for introducing a successful product into the market led to the concept of reusing software assets from existing systems. Software reuse is not a new topic. The idea was first introduced by McIlroy in 1968 [177], and it plays a predominant role in software development. D. L. Parnas [187], in 1972, was the first to use the word "modularization". He worked on modular programming, in which a system is decomposed into smaller modules, so that smaller code modules can be developed. These modules can be reassembled or replaced by other existing modules. Prominent researchers such as McIlroy [177] and D. L. Parnas [187] regarded software reuse as nothing other than code reuse.

Researchers have given many definitions of software reuse. We quote Krueger's general definition: "Software reuse is the process of creating software systems from existing software rather than building software systems from scratch" [34, 179]. Nowadays, most software companies are moving towards software reuse, since it makes it possible to minimize redundant work, produce high quality, reduce development cost, release the product on time, minimize maintenance and training costs, reduce team size, share expertise (code) and reduce documentation [179].

Reuse of software involves taking existing assets from previous versions of the software, finding the ones that are currently needed and integrating them with the newly developed ones [60]. Many people assume that reuse of software means reuse of code (code scavenging) alone. However, several other assets can be considered apart from code, such as requirements, design, test cases, test plans, architectural framework, look and feel of the applications, knowledge generated, reasoning, templates for any asset and so on [37, 58, 19, 50].

Although research in the field of software reuse is ongoing, the software industry is still in the initial stages of software reuse. From this, we assume that organizations are not fully confident of succeeding by reusing software [113].

Regarding assets, many authors have dealt only with code reuse. To make software reuse successful and gain more benefit from it, however, industry is looking to extend the concept of reuse to reusable assets other than code. Some authors, such as Leach [23] and Jones [21], have presented reusable assets along with code, but the lists of reusable assets they presented do not exactly match each other. Some assets are mentioned by both, but there is no common understanding among researchers on a standard set of reusable assets.

Measuring or assessing the value of reuse is a major concern among organizations, and several works propose models or methods for it. However, there are no widely accepted models, and organizations are still unsure whether using the proposed models will bring success [89]. W. B. Frakes [22] in 1996 reviewed some of the important existing models, metrics and methods. Researchers have only just started to realize the importance of extending the concept of reuse to life cycle objects beyond code, and very few authors have worked on assessing the value of reuse in them; most authors have dealt with assessing the value of reuse in code. The metrics and models designed for assessing the value of reuse in other life cycle objects are inspired by those designed for code. For example, for amount of reuse metrics, W. B. Frakes [22] derived a formula for assessing the amount of reuse across the whole life cycle, which is inspired by the formula for code. Literature reporting industry experiences or success stories in using a particular model is scarce [113, 179].

Maintenance of reusable software is another area of concern, and it is an expensive task. Many factors have an impact on the maintenance of reusable software, including coupling and cohesion, software configuration management, change impacts, aging of a legacy system, and licensing, contractual and negotiation issues. Academia has proposed approaches, models and methods for each factor. However, there is no standard maintenance model that addresses the complete set of factors mentioned above, and none of the existing models is widely accepted.

Related Work

Authors such as Jones [21], Frakes [22], Leach [23] and Bollinger [81] have each presented their own list of reusable assets. The lists have some assets in common, but they differ from one another, and no author has tried to present a set of reusable assets that most researchers in the software reuse community could accept.

Regarding metrics and models for measuring reuse to assess its value, Frakes [22] did major work in 1996 by bringing different reuse metrics and models together and categorizing them based on their application to different areas of software reuse. His taxonomy of reuse metrics and models was an inspiration to later works in this field. Mohagheghi [79] in 2007 reviewed journals published between 1994 and 2005 to gather evidence of successful software reuse programs in industry. This work helped us to learn the validation status (industrial validation) of studies published before 2007. In our report, we also traced the industrially and academically validated studies up to the first half of 2009. Curry et al. [94] conducted a review study on amount-of-reuse metrics, and Frakes [95] and Mascena [112, 113] also studied amount-of-reuse metrics. In these studies we found a few additional subcategories of amount-of-reuse metrics beyond those mentioned by Frakes [22]; we added them to the taxonomy of reuse metrics and models in our review report. In 2005, Frakes [107] presented a paper on the status and future of software reuse, which gives an idea of the state of research on reuse metrics and models.
Victor Basili [120] in 1990 was the first to discuss reuse in terms of maintenance and development. A few researchers, such as Kwon [121], proposed integrated approaches that combine software reuse, maintenance and SCM for the maintenance of reusable software. Michael Jiang et al. [125] proposed integrated approaches that combine data mining, a defect tracking system and SCM for the same purpose. Reddy [197] began research on the maintenance of software reuse in 1996 and introduced another type of maintenance called reconstructive maintenance. None of the research works found so far defines a taxonomy for the maintenance of software reuse. Moreover, they discuss the maintenance of reusable software in terms of individual topic areas such as software configuration management, change impact analysis, module dependencies, legal issues and aging symptoms. Every topic area plays a vital role in maintaining reusable software, so we present these topic areas as a taxonomy. We categorized each topic area and also introduced a category named strategies, which covers the integrated approaches [121] [125] and approaches for reuse as a whole, such as the full reuse maintenance model [124] and the simple reuse model [120].

1.6 Research Methodology

In this report, we followed Kitchenham's guidelines for performing a systematic mapping study [180, 183]. A systematic mapping study, also known as a scoping study, is a type of review that complements a systematic literature review and often precedes it. It is used to identify the extent and form of the literature on a particular topic. A systematic mapping study is suitable when the initial examination of a domain, before a systematic review is executed, shows that very little evidence is available or that the topic area is too broad [183]. It helps in identifying the evidence available on a particular topic, which can be represented at a high level of granularity. The results help in identifying evidence clusters and evidence deserts, which suggest the areas that future systematic reviews should focus on and the areas where more primary studies are needed [183].

The systematic mapping study consists of finding an answer to the research questions. This involves identifying, analyzing, evaluating and interpreting all research works relevant to a particular research question, topic area or phenomenon of interest. Most important for our work is to identify and summarize the extent of current technology or treatment to date, and to identify gaps in current research in order to indicate what remains to be done as future work. The search strategy followed in finding the relevant studies is discussed in section 1.6.1, Search Strategy. The methodology discussed in this section is followed in chapter 2, chapter 3 and chapter 4.

1.6.1 Search Strategy

The search strategy followed in our systematic mapping study is presented in figure 1. It consists of four phases:

Phase 1: The first phase consists of executing the search both electronically and by citation search. This phase involves identifying the search terms and search questions, which are framed based on the research questions, the topic area and the phenomenon as a whole. The search results, including both selected and rejected documents, are documented and maintained, which makes the search process easier.

Phase 2: This phase involves applying the inclusion and exclusion criteria, which are explained in sections 1.6.2.2 and 1.6.2.3.

Phase 3: The third phase differs slightly from the first two. Here we deviated slightly from Kitchenham's guidelines and conducted snowball sampling, which is discussed in more detail in section 1.6.2.4.

Phase 4: This phase consists of the analysis of the studies obtained from the three phases.

A central database is used for the storing and retrieval of the studies for each phase.

Figure 1: Search Strategy


Initially, we found 191343 studies after the citation and electronic search. To refine these hits, we applied the basic inclusion criteria followed by the detailed inclusion/exclusion criteria. Applying the basic inclusion criteria, during which duplicates were also eliminated, left 2299 studies. The detailed inclusion/exclusion criteria then reduced these to 165 full-text studies. Snowball sampling based on the obtained full texts yielded a further 42 studies. Finally, a total of 207 studies were found for our research work. (Note: The total number of studies mentioned here is the final figure after removing the 6 duplicates; duplicates here are references/studies that fall into more than one category or chapter.) A list of references for each chapter is given in table 40 in the appendix.

In each chapter we present a table showing, for each database, the number of studies found initially, after the basic inclusion criteria, after the detailed inclusion/exclusion criteria, and through snowball sampling.

Databases used: The databases used during the systematic mapping study are:

1. IEEE Xplore

2. Inspec + Compendex

3. ACM Digital

4. Elsevier

5. Springer

These are the five major databases we used. We used Google Scholar for the citation search discussed in section 1.6.1. Through this, we could find:

- the articles that cite a given article,
- the articles containing particular keywords since a particular year,
- the articles that are similar to the current article.

We also used Google scholar for our snowball sampling which is discussed in section 1.6.2.4. Some of the research works found during snowball sampling are from the Citeseer database.

1.6.2 Search Process Execution

Search process execution consists of identifying the search terms and search questions and defining the inclusion/exclusion criteria.

1.6.2.1 Search Term identification and Search Questions framing process

The search terms are derived by considering the population, intervention, outcome, context and comparison. The population in this study is the domain of software reuse. The intervention is the application of search techniques used for the analysis of the different types of assets, the value of software reuse and the maintenance of software reuse. Comparison is not applicable in our case because the three research questions address three different topic areas in the same domain. However, comparison is performed within each research question.


Table 1: Population, Intervention, Context, Outcome for each Research Question

Other terms are obtained by identifying alternative terms and synonyms for the major search terms. Some terms are obtained from the keywords listed in research papers relevant to our topic. Search questions are framed using the Boolean operators OR and AND. Some databases, such as Inspec and Compendex, support truncation (*) and wildcards (?) in keywords, which can make the search more efficient. For example, reus* covers reuse, reusable and reusability, and wom?n covers woman and women.
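The effect of truncation and wildcards can be illustrated with a minimal sketch. It assumes that * stands for any run of characters (including none) and ? for exactly one character, which is how Python's fnmatch module interprets them; individual databases may define the symbols differently. The helper name matches and the word list are illustrative only.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, word: str) -> bool:
    """Return True if `word` matches a truncation/wildcard pattern.

    Assumption: '*' matches any run of characters and '?' exactly one
    character (fnmatch semantics); a real database may differ.
    """
    return fnmatchcase(word.lower(), pattern.lower())


# Hypothetical keyword list used only to demonstrate the two patterns.
words = ["reuse", "reusable", "reusability", "refuse", "woman", "women", "womn"]

print([w for w in words if matches("reus*", w)])
print([w for w in words if matches("wom?n", w)])
```

The first pattern keeps the three reuse variants and drops "refuse"; the second keeps "woman" and "women" and drops "womn", because ? requires exactly one character.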

1.6.2.2 Basic Inclusion Criteria

The inclusion and exclusion criteria are used during data extraction to obtain the studies most appropriate for answering the research questions. The basic inclusion criteria help in the initial refinement of the articles. We consider three basic criteria for deciding which articles to include; they are listed in table 2.

Basic inclusion criteria

1. Include article if the title matches with the topic area.
2. Include article if the abstract matches with the topic area.
3. Include only non-redundant articles.

Table 2: Basic Inclusion Criteria

By applying the basic inclusion criteria, the 191341 studies shown in figure 1 were reduced to 2299 studies. The reason for this large reduction is the removal of duplicates during the basic inclusion step itself.

1.6.2.3 Detailed inclusion/ exclusion criteria

After the basic inclusion criteria, the detailed inclusion/exclusion criteria are applied to the remaining results. They are listed in table 3.

| Research question | Population | Intervention | Context | Outcome |
|---|---|---|---|---|
| RQ 1 | Software Reuse | Review of reusable assets | Academia | Reusable Assets |
| RQ 1.1 | Software Reuse | Methods for reusing assets | Academia | Methods |
| RQ 1.2 | Software Reuse | Validation status for assets | Academia | Graph representing the percentage of validation. |
| RQ 1.3 | Software Reuse | Assets in focus | Academia | Graph showing the research contribution for each asset per year |
| RQ 2 | Software Reuse | Assessing the value of reuse | Academia | Reuse metrics and models |
| RQ 2.1 | Software Reuse | Validation status for value of reuse | Academia | Graph representing the percentage of validation. |
| RQ 2.2 | Software Reuse | Areas in focus for value of reuse | Academia | Graph showing the research contribution for each value (Reuse metrics and models) category per year |
| RQ 3 | Software Reuse | Maintenance of software reuse | Academia | Methods/Models/ Metrics/ Approach |
| RQ 3.1 | Software Reuse | Validation status for maintenance of reusable software | Academia | Graph representing the percentage of validation. |
| RQ 3.2 | Software Reuse | Areas in focus for maintenance of reusable software | Academia | Graph showing the research contribution for each maintenance category per year |


Detailed inclusion/ exclusion criteria

Inclusion criteria

1. The article must be peer reviewed.
2. The article must be available as full text.
3. The article should relate to software reuse.
4. The article should be in the topic area of assets, value or maintenance in software reuse.
5. The article should be a literature review, systematic review, systematic mapping study, case study, experiment, experience report, survey or comparative study.
6. The article should be included if it proposes a model, metrics, approach or method.
7. The article should be included if it deals with an extension to an existing model.
8. The article should be included if it deals with the validation of an existing model or of the currently proposed model.

Exclusion criteria

1. Articles which do not meet the inclusion criteria are excluded.
2. Articles that refer to management as maintenance are excluded. Management of software reuse refers to the naming, storage and retrieval of reusable assets, which is not related to our report.
3. Non-English articles are excluded.

Table 3: Detailed Inclusion/ Exclusion Criteria
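The criteria in table 3 can be read as a filter over candidate articles. The following is a minimal sketch of that reading, not the procedure we actually executed, which relied on manual judgement. It assumes that inclusion criteria 1 to 4 are mandatory and that criteria 5 to 8 are alternatives of which at least one must hold; the Article record, its field names and the function is_included are hypothetical.

```python
from dataclasses import dataclass

TOPIC_AREAS = {"assets", "value", "maintenance"}
STUDY_TYPES = {
    "literature review", "systematic review", "systematic mapping study",
    "case study", "experiment", "experience report", "survey",
    "comparative study",
}


@dataclass
class Article:
    """Hypothetical record of a reviewer's judgements about one article."""
    peer_reviewed: bool
    full_text_available: bool
    about_software_reuse: bool
    topic_area: str
    study_type: str = ""
    proposes_model_or_method: bool = False
    extends_existing_model: bool = False
    validates_model: bool = False
    language: str = "English"
    management_as_maintenance: bool = False


def is_included(a: Article) -> bool:
    # Exclusion criteria 2 and 3.
    if a.management_as_maintenance or a.language != "English":
        return False
    # Inclusion criteria 1-4 (assumed here to be mandatory).
    mandatory = (a.peer_reviewed and a.full_text_available
                 and a.about_software_reuse and a.topic_area in TOPIC_AREAS)
    # Inclusion criteria 5-8 (assumed here to be alternatives).
    contribution = (a.study_type in STUDY_TYPES or a.proposes_model_or_method
                    or a.extends_existing_model or a.validates_model)
    return mandatory and contribution


survey = Article(True, True, True, "value", study_type="survey")
no_full_text = Article(True, False, True, "assets", study_type="case study")
print(is_included(survey), is_included(no_full_text))
```

Exclusion criterion 1 needs no separate check in this sketch, since an article that fails the inclusion test is not returned as included.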

1.6.2.4 Snowball Sampling

Snowball sampling is an approach for studying a hidden population. Here the hidden population is the research works that are not found when the search process is executed. Snowball sampling is performed on the works of researchers that fit our study requirements: if we find a reference in an article, we use that reference to find further references, and so on. Kitchenham [183] suggests manually scanning the reference lists of relevant primary studies and appropriate articles to find suitable articles. Another reason for choosing snowball sampling is that it can find articles that use different terminology; many researchers present the same idea using different terms. We applied the same basic inclusion and detailed inclusion/exclusion criteria (table 2 and table 3) to the snowball search results.
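The reference-following procedure described above can be sketched as a breadth-first traversal of reference lists. This is an illustrative sketch only: the function snowball, the references mapping and the is_relevant predicate are hypothetical stand-ins for the manual scanning of reference lists and the manual application of the criteria in table 2 and table 3.

```python
from collections import deque


def snowball(seeds, references, is_relevant):
    """Follow reference lists outward from the seed studies.

    seeds       -- studies already selected by the database search
    references  -- mapping: study -> list of studies it cites (assumed input)
    is_relevant -- predicate standing in for the inclusion/exclusion criteria
    Returns the newly found relevant studies in discovery order.
    """
    found = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        study = queue.popleft()
        for cited in references.get(study, []):
            if cited in seen:
                continue
            seen.add(cited)
            if is_relevant(cited):
                found.append(cited)
                queue.append(cited)  # only relevant studies are followed further
    return found


# Hypothetical citation data: A cites B and C, B cites D, C cites E, D cites A.
refs = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["A"]}
relevant = {"B", "D", "E"}
print(snowball(["A"], refs, lambda s: s in relevant))
```

In this sketch the reference lists of rejected studies are not scanned, so study E is never reached; that is a design choice of the example, not a rule stated in the guidelines.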

1.6.2.5 The Analysis

The results obtained from the systematic mapping study are analyzed in order to answer the research questions. The analysis mainly focuses on the state of validation, areas in focus of the research work and shift in trends in the research.

1.6.3 Validity threats

It is essential to know the threats to the validity of the study results and how they affect the results of the research work. In this section, we discuss the threats noticed during the systematic mapping study. There are four types of validity threats:

1. Conclusion Validity

This type of validity relates to the statistical significance between the treatment and the outcome [205] [188]. Threats of this type are commonly noticed during search process execution. A wrongly framed search question, formed by choosing inappropriate search terms, leads to irrelevant study results. Using many appropriate search terms helps in filtering out irrelevant studies, but even one inappropriate search term may bring in many irrelevant studies. To avoid this, we framed the search terms and search questions under the supervision of the librarian. The conversation with the librarian helped us in framing the basic search terms, and by executing them we could derive a few other search terms. Using inappropriate Boolean operators (such as AND in place of OR and vice versa) would also result in many irrelevant studies. The execution of the search strategy may also yield irrelevant studies if the inclusion/exclusion criteria are badly framed. To avoid this, we consulted experts, namely the librarian and a senior researcher, while framing the inclusion/exclusion criteria. Some standard terms, when given as input to the search process, lead to too many hits. Enclosing such terms in quotes helps in finding the relevant studies. For example, searching IEEE Xplore for software AND configuration AND management returns around 2000 hits, whereas "software configuration management" in quotes returns 140.
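The difference between a conjunction of terms and a quoted phrase can be shown on a toy example. The sketch below is not how IEEE Xplore evaluates queries; it only assumes that an AND query requires every term to occur somewhere in the text, while a quoted query requires the words to occur adjacently and in order. The titles are invented for the illustration.

```python
import re


def tokens(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_and(text: str, terms: list[str]) -> bool:
    """True if every term occurs somewhere in the text (term1 AND term2 ...)."""
    words = set(tokens(text))
    return all(term.lower() in words for term in terms)


def matches_phrase(text: str, phrase: str) -> bool:
    """True if the words of the phrase occur adjacently and in order."""
    haystack = " " + " ".join(tokens(text)) + " "
    needle = " " + " ".join(tokens(phrase)) + " "
    return needle in haystack


# Hypothetical titles, not taken from the reviewed studies.
titles = [
    "Software configuration management for reusable components",
    "Management of software product lines: a configuration approach",
    "Configuration of reusable test cases",
]

and_hits = [t for t in titles if matches_and(t, ["software", "configuration", "management"])]
phrase_hits = [t for t in titles if matches_phrase(t, "software configuration management")]
print(len(and_hits), len(phrase_hits))
```

The second title satisfies the AND query but not the quoted phrase, which mirrors why the quoted search returns far fewer hits.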

2. Internal Validity

This type of validity relates to a causal relationship between the treatment and the outcome [205] [188]. Creswell [206] gives a general definition: "Internal validity threats are experimental procedures, treatments or experiences of the participants that threaten the researcher's ability to draw correct inferences from the data in an experiment". The inclusion of certain unpublished research works can introduce unwanted outcomes. Some of the literature owned by organizations and research institutes is not available as full text, which introduces a gap in the research work. Another threat is that English is not the native language of either author of this work, so there is a chance of interpreting things from different perspectives. To avoid this, cross-checking each other's work is necessary to ensure a common perspective.

We noticed two types of review studies. Some studies are pure reviews that provide suggestions or a summary. Others review other researchers' work and then validate, or leave unvalidated, that work. Usually the percentages of validated and non-validated studies should sum to 100%. In this report we also included reviews, so the percentages of validated, non-validated and review studies together should sum to 100%. When a study contains a review and also proposes a model that is either validated or non-validated, it would be marked in both corresponding columns of the result table, and the overall percentage then exceeds 100%. To avoid this threat, we introduced two columns, Extension To (E.T) and Validation Of. Whenever a study reviews a model and validates it, the Validation Of column is marked instead of the review column. Whenever a review is followed by the introduction of a new model that is not validated, the Extension To column is marked instead of the review column. This situation occurred for two studies [122, 124]. In addition, a study by Oh-Cheon Kwon [122] has a non-validated tool and a review, which made the total percentage go beyond 100%. The threat still remains for such cases.
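The percentage problem described above is simple arithmetic and can be reproduced with a small sketch. The data are invented: four hypothetical studies, one of which is both a review and a non-validated proposal. The function single_column encodes one possible reading of the rule, namely that a double-marked study collapses into exactly one of the two new columns; that reading is an assumption of the example.

```python
from collections import Counter


def column_percentages(marks):
    """Percentage of studies marked in each column of the result table."""
    counts = Counter(column for columns in marks for column in columns)
    return {column: 100 * n / len(marks) for column, n in counts.items()}


def single_column(columns):
    """Collapse a double-marked study into one column (assumed reading)."""
    if "review" in columns and "validated" in columns:
        return {"validation of"}
    if "review" in columns and "non-validated" in columns:
        return {"extension to"}
    return set(columns)


# Hypothetical marks for four studies; the last one is marked twice.
marks = [{"review"}, {"validated"}, {"non-validated"}, {"review", "non-validated"}]

before = sum(column_percentages(marks).values())
after = sum(column_percentages([single_column(m) for m in marks]).values())
print(before, after)  # 125.0 100.0
```

With every study mapped to one column, the column percentages sum to 100% again.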

3. External Validity

This type of validity relates to the generalization of results outside the scope or time span of the study [205] [188]. Creswell [206] gives a general definition: "External validity threats arise when experimenters draw incorrect inferences from the sample data to other persons, other settings and past or future situations". For example, Cyril [207] proposed an approach for requirements in August 2009. Since we limited our time span to the first half of 2009, we do not consider this study in our research work.

4. Construct Validity

This type of validity refers to the relationship between the theory and the application [205] [188]. A threat of this type arises when relevant studies are excluded. To avoid it, we used the exhaustive four-phase search strategy described in section 1.6, which helps in overcoming the construct validity threats.


2 REUSABLE ASSETS

Reusable assets are considered the building blocks of software reuse. They can be technical or managerial in nature, large grained or fine grained, simple or composite. Reusable assets can have varying degrees of leverage: an asset has leverage when reusing it leads to the reuse of a chain of assets in a downstream process. A reusable asset may consist of a single asset or of several assets nested in one [44] [63].

Ezran [44] [63] defines a reusable asset as follows: "Software assets are composed of a collection of related software work products that may be reused from one application to another." Terms such as components and work products are also used in place of reusable assets. Jacobson et al. [35] state that the terms assets and work products are also used in place of components.

We present the definitions of components and work products to distinguish between the three terms. A component is defined as an executable asset that may be integrated as-is into an application [44]. An asset is made from a set of related work products, which represent the same piece of software at different abstraction levels. These work products can be used at every step of the software life cycle [44].

When reusable assets are considered, most works mention only code. But there are other assets that can be reused, so we searched for reusable assets other than code that are mentioned in the literature. Through our systematic mapping study, we found 14 such assets. We intentionally excluded code, as we were looking for reusable assets other than code. The other reusable assets are:

1. Algorithms
2. Architecture
3. Data
4. Designs
5. Documentation
6. Estimation Templates
7. Human Interfaces
8. Knowledge
9. Models
10. Modules
11. Plans
12. Requirements
13. Service Contracts
14. Test Cases

2.1 Methodology Execution

The search terms and search combinations are formed in order to answer research question RQ1 and its sub-questions RQ1.1, RQ1.2 and RQ1.3. Very few authors have contributed in the field of reusable software assets. Initially, the search was carried out using eight search terms, numbered 1, 4, 5, 6, 20, 21, 22 and 24 in table 4. Through these eight terms we found the research works and books of a few renowned researchers, among them Jones [21], Leach [23], Frakes [22] and Swanson [68], who presented lists of reusable assets. To learn more about each asset in their lists, we then used the asset names as search terms.


Search Terms:

1. Assets
2. Algorithms
3. Architecture
4. Artifacts
5. Aspects
6. Components
7. Data
8. Designs
9. Document
10. Estimates
11. HCI
12. Human Computer Interface
13. Human Interface
14. Knowledge
15. Lifecycle
16. Models
17. Modules
18. Plans
19. Requirements
20. Reusable
21. Reuse
22. Reused
23. Service contracts
24. Software
25. Test

Search Questions:

1. asset AND reuse OR reusable AND software
2. Lifecycle AND reuse AND software
3. software AND reusable AND aspects
4. software AND reusable AND artifacts
5. software AND reusable AND components
6. {Reusable OR reuse} AND Data
7. {Reusable OR reuse} AND Documentation
8. {Reusable OR reuse} AND Estimates(Templets)
9. {Reusable OR reuse} AND plans(project plans)
10. {Reusable OR reuse} AND {Test cases OR Test designs}
11. {Reusable OR reuse} AND Service contracts
12. {Reusable OR reuse} AND Algorithms
13. {Reusable OR reuse} AND Designs
14. {HCI} AND {software AND reus*} AND {asset OR artifact OR component}
15. Knowledge AND {software AND reus*} AND {asset OR artifact OR component}
16. Requirements AND {software AND reus*} AND {asset OR artifact OR component}
17. Architecture AND {software AND reus*} AND {asset OR artifact OR component}
18. Human Interface AND {software AND reus*} AND {asset OR artifact OR component}
19. Human computer Interface AND {software AND reus*} AND {asset OR artifact OR component}
20. Models AND {software AND reus*} AND {asset OR artifact OR component}
21. Modules AND {software AND reus*} AND {asset OR artifact OR component}

Table 4: Search Terms and Search Questions for Reusable Assets
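Search questions 15 to 21 in table 4 share one template: an asset term combined with a fixed software reuse clause and a fixed clause of asset synonyms. A minimal sketch of generating such questions is given below; the function asset_query and the constant names are illustrative, and the questions in our study were written by hand.

```python
# Fixed clauses shared by search questions 15-21 in table 4.
SCOPE = "{software AND reus*}"
KINDS = "{asset OR artifact OR component}"


def asset_query(asset_term: str) -> str:
    """Combine one asset term with the two fixed clauses."""
    return f"{asset_term} AND {SCOPE} AND {KINDS}"


for term in ["Knowledge", "Requirements", "Architecture", "Models", "Modules"]:
    print(asset_query(term))
```

Keeping the fixed clauses in one place makes it harder for the questions to drift apart when a synonym is added later.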

The articles are obtained by applying the basic inclusion criteria along with the detailed inclusion/exclusion criteria, which are discussed in section 1.6. The number of hits obtained is tabulated in table 5.


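| Database | Number of hits | Basic inclusion criteria | Detailed inclusion/ exclusion criteria | Snowball Sampling |
|---|---|---|---|---|
| Google Scholar | - | - | - | 21 |
| ACM | 17761 | 24 | 13 | - |
| Inspec | 36673 | 108 | 12 | - |
| IEEE | 244 | 36 | 19 | - |
| Elsevier | 12469 | 42 | 5 | - |
| Springer | 68892 | 363 | 9 | - |
| Total | 136039 | 573 | 58 | 21 |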

Table 5: Hits after Each Phase for RQ1

NOTE: A list of references for Chapter 2 is shown in Table 40 in the Appendix.
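The totals row of table 5 can be checked by summing the per-database counts. The sketch below copies the figures from the table; the names TABLE_5, TOTALS and column_sums are illustrative, and the Google Scholar row is left out because it contributes only to the snowball sampling column.

```python
# Per-database counts from table 5:
# (number of hits, after basic inclusion, after detailed inclusion/exclusion)
TABLE_5 = {
    "ACM": (17761, 24, 13),
    "Inspec": (36673, 108, 12),
    "IEEE": (244, 36, 19),
    "Elsevier": (12469, 42, 5),
    "Springer": (68892, 363, 9),
}
TOTALS = (136039, 573, 58)  # totals row as reported in table 5


def column_sums(rows):
    """Sum each column across all databases."""
    return tuple(sum(column) for column in zip(*rows.values()))


print(column_sums(TABLE_5) == TOTALS)
```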

2.2 Results

In this section, the results found through our review are presented. We have plotted the results in a bubble graph.

2.2.1 Reusable Assets

In Table 6, we present the list of assets that are found through the review. The table also includes the number of articles found for each asset along with the author name, year of publication and the reference number of the article.

| Reusable Asset | Number of Articles | Authors | Reference Number |
|---|---|---|---|
| Algorithms | 1 | [Karsten, 97] | [25] |
| Architecture | 11 | [Krueger, 92] | [34] |
| | | [Leach, 97] | [23] |
| | | [Jacobson et al, 97] | [35] |
| | | [Sametinger, 97] | [36] |
| | | [Peter Eeles, 08] | [37] |
| | | [Li. H et al, 92] | [38] |
| | | [White et al, 98] | [39] |
| | | [Baum et al, 98] | [40] |
| | | [Gomaa, 95] | [41] |
| | | [Griss et al, 99] | [42] |
| | | [Clements et al, 01] | [65] |
| Data | 5 | [Giuseppe, 94] | [1] |
| | | [I Issenin, 04] | [2] |
| | | [Jones, C, 93] | [21] |
| | | [W Frakes, 96] | [22] |
| | | [R. Leach, 97] | [23] |
| Designs | 10 | [Kevin W. Jameson, 89] | [26] |
| | | [V Upadhyay, 92] | [29] |
| | | [G Arango, 93] | [33] |
| | | [Jones, 93] | [21] |
| | | [S Komiya, 94] | [30] |
| | | [Paul Kogut, 95] | [28] |
| | | [W Frakes, 96] | [22] |
| | | [S Channarukul, 05] | [32] |
| | | [P Gomes, 06] | [31] |
| | | [J E Ettlie, 08] | [27] |
| Documentation | 9 | [J.Sametinger, 96] | [3] |
| | | [Childs and Sametinger, 96] | [4] |
| | | [David M. Levy, 93] | [5] |
| | | [Aida Boukottaya et al, 06] | [6] |
| | | [David Barta et al, 96] | [7] |
| | | [E. Guerrieri, 98] | [8] |
| | | [Jones C, 93] | [21] |
| | | [W Frakes, 96] | [22] |
| | | [R. Leach, 97] | [23] |
| Estimation template | 2 | [Jones, C. 93] | [21] |
| | | [W Frakes 96] | [22] |
| Human Interface | 5 | [Jones C, 1993] | [21] |
| | | [Lozano-Tello et al, 02] | [66] |
| | | [Frakes, W et al, 96] | [22] |
| | | [Robert Bogue, 06] | [67] |
| | | [Swanson, 89] | [68] |
| Knowledge | 7 | [P. Gomes et al, 06] | [58] |
| | | [Parsons et al, 04] | [59] |
| | | [Kucza et al, 01] | [60] |
| | | [Von Krogh et al, 05] | [61] |
| | | [Liu Xue-Mei et al, 09] | [62] |
| | | [Wai Fong Boh, 08] | [31] |
| | | [Althoff et al, 99] | [64] |
| | | [Hall, 87] | [117] |
| | | [Yglesias, 93] | [118] |
| | | [Soundarajan, 98] | [119] |
| Models | 1 | [Larsen, 06] | [71] |
| Modules | 4 | [Isoda, 92] | [69] |
| | | [Frakes, W.B, et al, 94] | [70] |
| | | [Leach, 97] | [23] |
| | | [D. L. Parnas, 1972] | [187] |
| Plans | 7 | [Bernhard Nebel, 94] | [9] |
| | | [Subbarao Kambhampati, 94] | [10] |
| | | [Subbarao Kambhampati, 90] | [11] |
| | | [L Spalazzi, 01] | [12] |
| | | [Jones, C. 93] | [21] |
| | | [W Frakes, 96] | [22] |
| | | [R.Leach, 97] | [23] |
| Requirements | 19 | [Krueger, 92] | [34] |
| | | [Leach, 97] | [23] |
| | | [Jacobson et al, 97] | [35] |
| | | [Sametinger, 97] | [36] |
| | | [Monzon, A. 08] | [43] |
| | | [Cyril Montaberta et al, 09] | [207] |
| | | [Thais Ebling et al, 09] | [45] |
| | | [Lam, W. 97] | [46] |
| | | [Spanoudakis et al, 96] | [47] |
| | | [C. Montabert et al, 05] | [48] |
| | | [Erdvinas Perednikas 08] | [49] |
| | | [B. Keepence et al, 95] | [50] |
| | | [Lam, W et al, 97] | [51] |
| | | [Antonellis et al, 93] | [52] |
| | | [Johnson et al, 91] | [53] |
| | | [Gotzhein et al, 98] | [54] |
| | | [Lopez et al, 02] | [55] |
| | | [Philippe Massonet et al, 97] | [56] |
| | | [Moon et al, 05] | [57] |
| Service Contracts | 2 | [Haibin Zhu, 05] | [24] |
| | | [Lucas et al, 97] | [202] |
| Test Cases/ Test Designs | 10 | [Mark Folkerts, 08] | [13] |
| | | [David Binkley et al., 95] | [14] |
| | | [A V Mayrhauser, 93] | [15] |
| | | [Mikko Karinsalo et al., 04] | [16] |
| | | [D D Lonngren, 98] | [17] |
| | | [J D. McGregor, 02] | [18] |
| | | [Yunwei Dong, 08] | [19] |
| | | [J A Dallal, 08] | [20] |
| | | [Jones, C. 93] | [21] |
| | | [W Frakes, 96] | [22] |

Table 6: Reusable Assets Table

2.2.2 Bubble Graph

In figure 2, we present the systematic mapping as a bubble graph. A brief explanation of the bubble graph is given below.

Figure 2: Detailed Analysis through Bubble Graph


The size of each bubble depends on the number of studies in it. The bubbles at the intersections of the axes contain the reference numbers of the studies. The X-axis is divided into two halves. On the right half of the X-axis in figure 2, we show the validation status of the studies and the type of validation each study falls into (industrial case study, academic case study, academic experiment, industrial experiment or survey). On the left half of the X-axis, we present the studies that proposed a method, model or approach for reusing a particular asset. The Y-axis shows the asset categories: Algorithms, Architecture, Data, Design, Documentation, Estimates, Human Interfaces, Knowledge, Models, Modules, Plans, Requirements, Service Contracts and Test Cases.

2.2.3 Results

Detailed tables of the reusable assets obtained through our review are shown in appendix section A2. We briefly introduce each asset type:

1. Algorithms: Algorithmic reuse is the reuse of an algorithm as the solution whenever the same type of problem occurs. Reusable algorithms are used in software designs [25].

2. Architecture: The architecture is an organizational structure of a system or component [154].

3. Data: Reusing data in a project makes it easier to achieve continuous process improvement and to improve the development process. Data here means experience recorded during previous projects [1].

4. Design: The key to reusing design is to use the models to capture design knowledge and facilitate the early analysis of system properties [28].

5. Documentation: A document may contain important project information and can be reused for similar projects or for the next version of the project. New documents often share features of old ones, so reusing them reduces time and cost [4] [5] [7].

6. Estimation Template: When estimating a new project to forecast what it takes to complete it successfully, reusing the estimation templates of older projects is a good choice, as it makes the estimation work easier. We could find only 2 studies regarding estimation templates, and they merely mention them.

7. Human Interface: An interface enables information to be passed between a human user and hardware or software components of a computer system [154].

8. Knowledge: Knowledge generated during the software development process can be a valuable asset for a software company. But in order to take advantage of this knowledge, the company must store it for reuse. Knowledge can be obtained from all the phases of Software Development Life Cycle [86]. The knowledge may represent the experience, idea or reasoning [33] [117] [118] [119].

9. Models: A model can depict critical solutions and insights to a problem and hence it can be considered as an asset for an organization. A pattern which explains a recurring problem and solutions to those recurring problems can be expressed as a model. A model is a type of asset which may or may not implement a pattern specification [71].

10. Modules: A module is a file that contains instructions. "Module" implies a single executable file that is only a part of the application, such as a DLL [164].

11. Plans: Plans here means project plans. Parts of old plans can be reused by the planner for new versions.

12. Requirements: A condition or capability that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed documents [154].


13. Service Contracts: A service contract, also called a reuse contract, is an agreement between the developers/designers and the users of a reusable asset. It acts as an interface for the reusers of an asset and documents how and why a software asset is being reused. This information can be helpful in predicting where and how the system can be tested, what problems might occur and how to rectify them after the system has evolved [202]. A service contract should have the following properties [24]: (1) it should be concise and understandable; (2) it should be clear and unambiguous; (3) it should be concrete and easy to evaluate, so that the quality of a service can be assessed easily.

14. Test Case/ Test Design: Test cases that were developed for previous versions can be reused for the next version and so on. They can be reused many times for different versions belonging to the same family [13, 15, and 17].

Other Assets: Swanson in [68] mentioned a few other assets apart from those listed above, including user application components, system software, data resources, distributed processing components, network gateways, communication facilities, network management, application design/ development tools and end user computing facilities. Leech [23] mentioned a few other assets such as reconfiguration of flexible and reusable systems, negotiation with software vendors, classes and instances, transformation systems, filters, glue ware, negotiations with customers, interface specifications, inputs to application generators and inputs to very high level languages.

2.3 Analysis

Source code is the most commonly reused asset, which has led many to the misconception that software reuse is the reuse of source code alone [36]. But the true cost of a system depends on all the activities and assets of the software development life cycle, so reuse of source code alone cannot save money. Some studies have shown that even when 50% of the code was reused, the cost savings on the software product were much smaller. This motivated us to find the other reusable assets apart from code [92]. Through our review, we found 14 reusable assets. These 14 reusable assets are shown in section 2.2.1.

1. Algorithms
2. Architecture
3. Data
4. Designs
5. Documentation
6. Estimates
7. Human Interfaces
8. Knowledge
9. Models
10. Modules
11. Plans
12. Requirements
13. Service Contracts
14. Test Cases

Through our analysis, we found that the authors who mention or discuss reusable assets have only some assets in common. That is, each author presents his own set of reusable assets, which does not exactly match the sets proposed by other authors. Evidently, no author has tried to present a set of reusable assets that could be accepted by most researchers in the software reuse community. Through our review, we have tried to present in this report a set of reusable assets drawn from those mentioned by different authors.


2.3.1 State of Validation

We present a graph (figure 3) showing the validation status of each reusable asset. In the graph, the X-Axis represents the reusable assets and the Y-Axis represents the percentage of studies that are validated, non-validated or reviews. Because the gathered studies also contain reviews, which are neither validated nor non-validated studies, the reviews are shown in the graph alongside the validated and non-validated studies. For each asset, the validated studies, non-validated studies and reviews sum to 100 percent.

[Graph. X-Axis: Reusable Assets (Algorithms, Architecture, Data, Design, Documentation, Estimates, HCI, Knowledge, Models, Modules, Plans, Requirements, Service Contracts, Test cases). Y-Axis: Percentage of Validation (0 to 120). Legend: Validated, Non-Validated, Reviews.]

Figure 3: Validation Status (Reusable Assets)

Overall validation status

Three types of studies were found regarding RQ1 and RQ1.1. Some studies just mention that a particular asset can be reused, some propose a model, method or approach for reusing a particular asset, and some are reviews of previous studies.

Setting the reviews aside and looking at the other two types of studies, the share of validated studies is low, at about 27%.

Non-validated studies form the largest share, about 68% of all the studies found, and reviews account for the remaining 5% or so.

Among the validated studies, about 55% are academically validated, 25% are industrially validated and 20% are validated through surveys.

Figure 3 shows that industry is putting little effort into validation: most of the validated studies are validated only academically.

Validation status of studies for each reusable asset

From figure 3, more than half of the reusable assets have no validated studies. A good share of validated studies can be observed for knowledge (54%): of its 11 studies, 3 are validated industrially, 1 academically and 2 through surveys. Validated studies can also be seen for plans (43%), documentation (33%), design (30%), requirements (21%) and test cases (20%). Some assets, namely algorithms, architecture, data, estimates, human interfaces, models, modules and service contracts, are discussed but not validated, and hence the non-validated bar shows 100% for these reusable assets in figure 3. This shows that little validation work has been contributed for these reusable assets.
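The per-asset percentages quoted above are simple shares of validated studies. The following is a minimal sketch, assuming the percentage is the number of validated studies divided by the total number of studies for that asset, truncated to a whole number; the function name `validation_share` is hypothetical and only the knowledge figures (3 + 1 + 2 out of 11) come from the text.

```python
def validation_share(validated: int, total: int) -> float:
    """Percentage of an asset's studies that report some form of validation."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= validated <= total:
        raise ValueError("validated must be between 0 and total")
    return 100.0 * validated / total


# Knowledge: 3 industrial + 1 academic + 2 survey validations out of 11 studies.
knowledge_validated = 3 + 1 + 2
share = validation_share(knowledge_validated, 11)

# Truncating 54.5... gives the 54% reported in the text.
print(int(share))
```

Note that rounding to the nearest integer instead of truncating would give 55, so the truncation step is an assumption made to match the reported value.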

2.3.2 Assets in Focus

In this section, we present a surface graph (figure 4) which shows the assets in focus. Figure 4 shows the number of studies found per year.

[Graph: Assets in Focus. X-Axis: Years (1972 to 2009). Y-Axis: Number of Studies (0 to 16). Series: Test cases, Service contracts, Requirements, Plans, Modules, Models, Knowledge, Human Interfaces, Estimates/ Templates, Documentation, Design, Data, Architecture, Algorithm.]

Figure 4: Assets in Focus

From figure 4, we can see that research contributions were highest between 1991 and 1999. During this period, research focused on reusable assets such as algorithms, architecture, data, designs, documentation, estimation templates, human interfaces, knowledge, modules, plans, requirements and test cases. Requirements, documentation and architecture received more attention than the other assets. From 2000 to 2009, research extended to other reusable assets, such as models and service contracts. Overall, very little attention has been given to algorithms, service contracts, models and estimation templates. The figure also shows a fall in research contributions in 2009. The reason is that we limited our search to the first half of 2009, so only a few articles from that year were found.

Problems:

1. The lack of tool support for indexing, searching, retrieving and browsing reusable assets makes it difficult to reuse software assets [203].

2. The basic problem faced by the software reuse community is that, although it could extend into many related research areas, it remains a closed group. This can be seen as a reason why no standard set of reusable assets has been introduced to date [168].

3. Reusing assets requires a lot of capital to be invested in domain engineering, building libraries of assets, organizing these assets and training engineers to reuse them systematically [63].

4. From the organizational perspective, most organizations deal with one project at a time, whereas reuse of assets comes into the picture when an organization deals with a series of projects [63].


3 VALUE OF REUSE

In this section, we introduce reuse metrics and models for measuring reuse to assess its value. Nowadays, organizations are interested in implementing reuse programs. As reuse grows in the software industry, there is a growing need to assess the value of reuse by measuring it, which helps organizations judge the success of their reuse programs. According to Frakes [107], software reuse is based on science and engineering and so it must be treated as an empirical discipline. As concepts like reuse and reusability emerged, the question arose of how to measure them in order to succeed with reuse. For measuring reuse, reuse metrics and models have been defined for many areas of software reuse and grouped into 6 categories [22]. (A Metric is a quantifiable measurement of an attribute of a software product [107] [22]. A Model is a stated relationship among metric variables [107] [22].) The six categories are discussed in section 3.2.

3.1 Methodology Execution

The search terms and search combinations are formed in order to answer the research questions RQ2, RQ2.1 and RQ2.2. They are presented in table 7. Search terms 1, 2, 3, 4, 5, 6, 14, 15 and 23 in table 7 were used as the initial search terms. Applying them yielded a few studies, among them a study by Frakes W B [22] that groups reuse metrics and models into 6 categories. To discuss these 6 categories in detail, we then formed further search terms for each category, as shown in table 7. For example, Frakes' taxonomy contains a category named Maturity Assessment. To get detailed information on maturity assessment, we used "Maturity" and "Assessment" as search terms.

Search Terms

1. Software 2. Reuse 3. Amount 4. Cost 5. Benefit 6. Investment 7. Assessment 8. Maturity 9. Reusability 10. Failure 11. Modes 12. Models 13. Metrics 14. Measurement 15. Value 16. Library 17. Quality 18. Business 19. Level 20. Frequency 21. Ratio 22. Density 23. Economics 24. Reused


Search Questions

1. software AND reuse AND metrics AND measurement
2. Amount AND reuse
3. cost AND benefit AND analysis
4. Quality AND investment AND reuse AND {software AND reuse AND metrics AND measurement}
5. return AND investment AND {software AND reuse AND metrics AND measurement}
6. business AND reuse AND metrics
7. reusability AND assessment
8. Maturity AND assessment AND reuse
9. Failure AND modes AND models
10. Reuse AND library AND metrics
11. reuse AND value
12. reuse AND level
13. reuse AND frequency
14. reuse AND ratio
15. reuse AND density

Table 7: Search Terms and Search Questions for Value of Reuse
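The search questions in table 7 are conjunctions of search terms, some of which wrap the base query in braces. As an illustration only, the sketch below shows how such strings could be assembled programmatically. The helper names `and_query` and `with_base` are hypothetical, and the exact query syntax accepted by each database is not covered by the text, so the brace notation is simply copied from the table.

```python
def and_query(*terms: str) -> str:
    """Join search terms with AND, as in the search questions of table 7."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        raise ValueError("at least one search term is required")
    return " AND ".join(cleaned)


def with_base(extra: str, base: str) -> str:
    """Append a braced base query to a more specific query."""
    return f"{extra} AND {{{base}}}"


# Base query (search question 1).
base = and_query("software", "reuse", "metrics", "measurement")

# Search question 8, formed for the Maturity Assessment category.
maturity = and_query("Maturity", "assessment", "reuse")

# Search question 5, which narrows the base query.
roi = with_base(and_query("return", "investment"), base)

print(maturity)
print(roi)
```

Building the strings from a list of terms keeps the category-specific queries consistent with the base query when the base terms change.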

The articles are obtained by applying the basic inclusion criteria along with the detailed inclusion/ exclusion criteria. These criteria are discussed in section 1.6. The number of hits obtained after each phase is tabulated in table 8.

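| Database | Number of Hits | Basic inclusion criteria | Detailed inclusion/ exclusion criteria | Snowball sampling |
|---|---|---|---|---|
| Google scholar | - | - | - | 9 |
| ACM | 10253 | 156 | 13 | - |
| Inspec | 2279 | 333 | 8 | - |
| IEEE | 30 | 30 | 15 | - |
| Elsevier | 428 | 86 | 5 | - |
| Springer | 12116 | 58 | 2 | - |
| Total | 25106 | 663 | 43 | 9 |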

Table 8: Hits after Each Phase for RQ2

NOTE: The list of references for Chapter 3 is shown in Table 40 in the Appendix.

3.2 Results

In this section, the results which were found through our review are presented.

3.2.1 Taxonomy

We present a taxonomy of reuse metrics and models, with the categories and sub-categories of reuse metrics/models/methods. This taxonomy is based on the taxonomy defined by Frakes in [22]. Frakes [22] does not show subcategories in his taxonomy, but a closer reading of the report shows that some categories do have subcategories. To his taxonomy we have added further subcategories, not mentioned by Frakes [22], under cost benefit analysis models and amount of reuse metrics. For the amount of reuse metrics category, these come from other studies by Jorge Mascena [113], Frakes [95] and Suri [111], which mention subcategories in addition to those mentioned by Frakes [22].


Figure 5: Reuse Metrics and Models Taxonomy

The reuse metrics and models are divided into 6 categories, as shown in figure 5 [22, 107].

1. Cost Benefit Analysis Models

2. Maturity Assessment Models

3. Amount of Reuse Metrics

4. Failure Modes Models

5. Reusability Assessment

6. Reuse Library Metrics

The resultant tables for value of reuse obtained through our review are shown in Appendix section A3. Here we also consider the articles which deal with reuse metrics and models for reusable code.

1. Cost Benefit Analysis Models

Cost benefit analysis helps to determine the cost benefits of implementing reuse. These models include economic cost benefit analysis, return on investment, quality of investment and productivity pay-offs. They assist the organization in estimating the cost, effort and time involved in systematic reuse.

The value of software reuse refers to whether it is more cost effective, in terms of time, money, or personnel, to reuse software as opposed to developing it from scratch each time it is needed (Frakes [102]).

The cost benefit analysis models category is subdivided into 4 types [22, 107]:

1. Economic Cost Benefit Analysis: Economic Cost Benefit Analysis helps in assessing the costs of reusing a reusable component.

2. Return on Investment Analysis: It is one of several approaches to evaluating and comparing investments. It helps us to know the benefits. A good Return on Investment means that the investment returns compare favorably to investment costs. This analysis is crucial for reuse investments.


3. Quality of Investment: Quality of investment helps in making a good reuse investment.

4. Business Reuse Metrics: These metrics help in assessing or estimating the effort saved by reuse.
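To make the idea of returns comparing favorably to costs concrete, here is a minimal sketch of a return on investment calculation. It uses the generic textbook definition, (benefit - cost) / cost, which is an assumption for illustration and not the specific models proposed in [115] or [116]; the function name and the monetary figures are hypothetical.

```python
def return_on_investment(benefit: float, cost: float) -> float:
    """Generic ROI: net gain of an investment relative to its cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# Hypothetical reuse program: 100,000 invested in building reusable assets,
# 150,000 of development cost avoided by reusing them.
roi = return_on_investment(benefit=150_000.0, cost=100_000.0)

print(f"{roi:.0%}")  # prints 50%
```

A positive value means the reuse investment returned more than it cost; a negative value means the cost of making assets reusable was not recovered.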

2. Maturity Assessment Models

Maturity assessment is needed by an organization in assessing the degree of maturity of its reuse implementation process. Reuse maturity assessment models will help the organizations in estimating how advanced the reuse programs are in implementing systematic reuse [22]. This helps the organizations to know their progress in implementing reuse programs.

3. Amount of Reuse Metrics

Amount of reuse metrics are used to estimate how much reuse is done in a given life cycle object. According to [22], amount of reuse metrics are used for assessing and monitoring a reuse improvement effort by tracking the percentages of reuse for life cycle objects. The amount of reuse metrics are subdivided into six types:

1. Reuse level: Reuse level is the ratio of number of reused items to the total number of items [198,112] [186].

2. Reuse percent: Reuse percent is the ratio of number of reused lines of code to the total number of lines of code [115, 112] [186].

3. Reuse frequency: Reuse frequency is the ratio of number of references to the reused items to the total number of references [198,112].

4. Reuse ratio: Reuse ratio is the same as reuse percent, except that it also counts partially changed items as reused [200,112].

5. Reuse density: Reuse density is the ratio of number of reused parts to the total number of lines of code [112, 201].

6. Reuse size and frequency: Reuse size and frequency is similar to reuse frequency and considers the size of items in number of lines of code (LOC) [200, 112].
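The ratio definitions above can be sketched directly in code. The following is an illustrative sketch, not the formulation of any cited study: the class `ReuseCounts`, its field names and the sample counts are hypothetical, "reused parts" in reuse density is assumed to mean reused items, and all functions return fractions rather than percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReuseCounts:
    """Raw counts for one life cycle object (all values hypothetical)."""
    reused_items: int      # items taken from elsewhere
    total_items: int       # all items in the object
    reused_loc: int        # lines of code that were reused
    total_loc: int         # all lines of code
    refs_to_reused: int    # references made to reused items
    total_refs: int        # all references


def _ratio(numerator: int, denominator: int) -> float:
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def reuse_level(c: ReuseCounts) -> float:
    """Reused items over total items."""
    return _ratio(c.reused_items, c.total_items)


def reuse_percent(c: ReuseCounts) -> float:
    """Reused lines of code over total lines of code (as a fraction)."""
    return _ratio(c.reused_loc, c.total_loc)


def reuse_frequency(c: ReuseCounts) -> float:
    """References to reused items over total references."""
    return _ratio(c.refs_to_reused, c.total_refs)


def reuse_density(c: ReuseCounts) -> float:
    """Reused parts (assumed here to be items) over total lines of code."""
    return _ratio(c.reused_items, c.total_loc)


counts = ReuseCounts(reused_items=4, total_items=10,
                     reused_loc=300, total_loc=1000,
                     refs_to_reused=12, total_refs=20)

print(reuse_level(counts), reuse_percent(counts),
      reuse_frequency(counts), reuse_density(counts))
```

Reuse ratio would use the same computation as `reuse_percent` with partially changed items included in the reused count, so it is not given a separate function here.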

4. Failure Modes Models

Failure modes analysis is used to identify and order the obstacles to reuse in an organization. It provides an approach for measuring and improving the reuse process, based on a model of the ways a reuse process can fail [108, 22].

5. Reusability Assessment

Reusability metrics indicate the likelihood that an artifact is reusable, or the readiness of an artifact or asset to be reused. Here, the attributes of a component which indicate its reusability are measured [22].

6. Reuse Library Metrics

Reuse library metrics are used for managing and tracking the usage of the reuse repository. The indexing schemes in the reuse library are evaluated using these metrics. The reuse library metrics for evaluating indexing schemes, and their definitions, are [22]:

1. Indexing costs: Measures the cost of creating, maintaining and updating a classification scheme.

2. Searching effectiveness: Assesses how well the classification schemes help users to search effectively for reusable components.


3. Support for understanding: Measures how well a classification scheme helps the users to understand the components.

4. Efficiency: Measures the efficiency of the reuse library in terms of memory use, speed, etc.

In addition, the quality of the assets is also a reuse library metric, which was derived by Frakes in 1987.

3.2.2 Bubble graph

The size of each bubble depends on the number of studies it contains. The bubbles at the intersections of the axes contain the reference numbers of the studies. The X-Axis is divided into two halves, left and right. On the right half of the X-Axis in figure 6, we show the validation status of the studies and indicate which type of validation each study falls into (industrial case study, academic case study, academic experiment, industrial experiment or survey). On the left half of the X-Axis, we present the studies which proposed a method, model, metric or approach for measuring reuse to assess its value. The Y-Axis has the six categories of reuse metrics and models (Cost benefit analysis models, Maturity assessment models, Amount of reuse metrics, Failure modes models, Reusability assessment, and Reuse library metrics).

Figure 6: Systematic Mapping for Value of Reuse (X-Axis: Study category; Y-Axis: Reuse metric categories)


3.2.3 Results

Table 37 shows the categories and their subcategories along with the studies and their contributions. In the "Contribution name (or) study name" column, we give the contribution name if the author named the contribution; otherwise we give the study name itself. In this column, text in quotation marks is a study name and text without quotation marks is a contribution name. The categories and subcategories presented in table 37 are based on the Frakes [22] taxonomy.

Note: Not all the studies belonging to a category are included in this table; review papers, for example, are not presented. Only the studies that make a contribution to a particular category are presented. For further details regarding the review papers in each category, see Appendix section A3 (result tables).

| Category | Sub-category | Contribution | Contribution name or Study title name | Author name and Reference number |
|---|---|---|---|---|
| Cost benefit analysis models | Economic cost benefit analysis | Model | Model for Economics of reuse | Barnes, B et al [74] |
| | | Model | Cost of Development Model | Gaffney, J E et al [72] |
| | | Validation of previous model in [74] | "What price reusability? A case study" | Favor, J [73] |
| | | Validation of previous model in [72] | "Software reuse economics: cost benefit analysis on a large scale Ada project". | Margono, J et al [75] |
| | | Model | Development phase model | Malan, R et al [76] |
| | | A study of existing industrial case studies | Quality, productivity and economic benefits of software reuse: a review of industrial studies | P Mohagheghi et al [79] |
| | | Model | Cost estimation model | Jasmine, K S et al [80] |
| | Return on Investment | Metric | "A reuse metrics and return on investment model" | J S Poulin et al [115] |
| | | Formulae | "Return on invest models" | K El Emam [116] |
| | Quality of investment | Metric | An analytical approach for making good reuse investments | Barns, B H et al [81] |
| | Business Reuse metrics | Metric | "A reuse metrics and return on investment model" | |
| | | Metric | Metrics to estimate the effort saved by reuse (used by IBM) | |
| Maturity assessment models | No subcategories | Model | Koltun and Hudson reuse maturity model | Koltun, P et al [83] |
| | | Model | STARS reuse maturity model | Davis, M J [84] |
| | | Model | A reuse capability model | Davis, T [85] |
| | | Validation of model in [85] | "Investments in reusable software, A study of software reuse investment success factors" | Rine, D C et al [106] |
| | | Model | "A phased reuse adoption model" | S Wartik et al [86] |
| | | Model | Reuse reference model | D C Rine et al [87] |
| | | Model | RiSE maturity model | Almeida, E S et al [88] |
| | | Extension to a model | "Towards a maturity model for reuse incremental adoption" (Extension to RiSE Maturity model) | Garcia, V C et al [89] |
| Amount of reuse metrics | Reuse level | Model | Reuse level metrics | Frakes, W B [109] |
| | | Metric | Reuse level metric | Frakes, W B et al [198] |
| | | Model | Reuse level metrics | Terry, C [110] |
| | | Model | "Modeling reuse across the software life cycle" | W B Frakes et al [91] |
| | | Method | "Methods of measuring software reuse for the prediction of maintenance effort" | Leach, R J [92] |
| | Reuse level | Metric | "Software reuse: metrics and models" | W B Frakes |
