Building Cost Estimation Models
Using Homogeneous Data
Rahul Premraj, Saarland University, Germany
Thomas Zimmermann, University of Calgary, Canada
Cross versus Within-Company Cost Estimation Studies: A Systematic Review
Barbara A. Kitchenham, Member, IEEE Computer Society, Emilia Mendes, and Guilherme H. Travassos
Abstract—The objective of this paper is to determine under what circumstances individual organizations would be able to rely on cross-company-based estimation models. We performed a systematic review of studies that compared predictions from cross-company models with predictions from within-company models based on analysis of project data. Ten papers compared cross-company and within-company estimation models; however, only seven presented independent results. Of those seven, three found that cross-company models were not significantly different from within-company models, and four found that cross-company models were significantly worse than within-company models. Experimental procedures used by the studies differed, making it impossible to undertake formal meta-analysis of the results. The main trend distinguishing study results was that studies with small within-company data sets (i.e., < 20 projects) that used leave-one-out cross validation all found that the within-company model was significantly different (better) from the cross-company model. The results of this review are inconclusive. It is clear that some organizations would be ill-served by cross-company models whereas others would benefit. Further studies are needed, but they must be independent (i.e., based on different databases or at least different single-company data sets) and should address specific hypotheses concerning the conditions that would favor cross-company or within-company models. In addition, experimenters need to standardize their experimental procedures to enable formal meta-analysis, and recommendations are made in Section 3.
Index Terms—Cost estimation, management, systematic review, software engineering.
1 INTRODUCTION
Early studies of cost estimation models (e.g., [12], [8]) suggested that general-purpose models such as COCOMO [1] and SLIM [24] needed to be calibrated to specific companies before they could be used effectively. Taking this result further, and following the proposals made by DeMarco [4], Kok et al. [14] suggested that cost estimation models should be developed only from single-company data. However, three main problems can occur when relying on within-company data sets [3], [2]:

1. The time required to accumulate enough data on past projects from a single company may be prohibitive.
2. By the time the data set is large enough to be of use, technologies used by the company may have changed, and older projects may no longer be representative of current practices.
3. Care is necessary as data needs to be collected in a consistent manner.

These problems motivated the use of cross-company models (models built using cross-company data sets, which are data sets containing data from several companies) for effort estimation and productivity benchmarking, and, subsequently, several studies compared the prediction accuracy between cross-company and within-company models. In 1999, Maxwell et al. [18] analyzed a cross-company benchmarking database by comparing the accuracy of a within-company cost model with the accuracy of a cross-company cost model. They claimed that the within-company model was more accurate than the cross-company model, based on the same holdout sample. In the same year, Briand et al. [2] found that cross-company models could be as accurate as within-company models. The following year, Briand et al. [3] reanalyzed the data set employed by Maxwell et al. [18] and concluded that cross-company models were as good as within-company models. Two years later, Wieczorek and Ruhe [26] confirmed this same trend using the same data set employed by [2]. Three years later, Mendes et al. [20] also confirmed the same trend using yet another data set.

These results seemed to contradict the results of the earlier studies and pave the way for improved estimation methods for companies that did not have their own project data. However, other researchers found less encouraging results. Jeffery et al. undertook two studies, both of which suggested that within-company models were superior to cross-company models [6], [7]. Two years later, Lefley and Shepperd claimed that the within-company model was more accurate than the cross-company model, using the same data set employed by Wieczorek and Ruhe [26] and Briand et al. [2]. Finally, a year later Kitchenham and
316 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 33, NO. 5, MAY 2007
. B.A. Kitchenham is with the School of Computing and Mathematics, University of Keele, Keele Village, Staffordshire, ST5 5BG, UK. E-mail: [email protected].
. E. Mendes is with the Computer Science Department, University of Auckland, Private Bag 92019, Auckland, New Zealand. E-mail: [email protected].
. G.H. Travassos is with UFRJ/COPPE, Systems Engineering and Computer Science Program, PO Box 68511, 21941-972 Rio de Janeiro, RJ, Brazil. E-mail: [email protected].
Manuscript received 6 June 2006; revised 27 Nov. 2006; accepted 2 Jan. 2007; published online 20 Feb. 2007. Recommended for acceptance by A. Mockus. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TSE-0129-0606. Digital Object Identifier no. 10.1109/TSE.2007.1001.
0098-5589/07/$25.00 © 2007 IEEE Published by the IEEE Computer Society
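The abstract above notes that studies with small within-company data sets used leave-one-out cross validation. A minimal sketch of that procedure, assuming a simple power-law cost model (Effort = a · Size^b, fitted in log-log space) as the estimator; the data format and model form are illustrative assumptions, not the studies' actual setups:

```python
import numpy as np

def loocv_mre(sizes, efforts):
    """Leave-one-out cross validation of a cost model: each project is
    predicted by a model fitted to all the other projects. The model here
    is a power law fitted by least squares on log-transformed data.
    Returns the magnitude of relative error (MRE) per held-out project."""
    sizes = np.asarray(sizes, dtype=float)
    efforts = np.asarray(efforts, dtype=float)
    n = len(sizes)
    mres = []
    for i in range(n):
        keep = np.arange(n) != i                       # hold project i out
        b, log_a = np.polyfit(np.log(sizes[keep]),
                              np.log(efforts[keep]), 1)
        pred = np.exp(log_a) * sizes[i] ** b           # predict held-out project
        mres.append(abs(efforts[i] - pred) / efforts[i])
    return np.array(mres)
```

With very small data sets (under 20 projects, as in the review), each fold leaves the model with little training data, which is one reason such studies may report large within- versus cross-company differences.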
Systematic Review: Barbara Kitchenham, Emilia Mendes, and Guilherme Travassos, May 2007
[Figure: the review's ten studies grouped by outcome: Company-Specific Models better (four studies), Cross-Company Models as good (four studies), No Trend (two studies). Researchers involved include Barbara Kitchenham, Emilia Mendes, Katrina Maxwell, Lionel Briand, Martin Shepperd, and Isabella Wieczorek.]
Meet Erica. She works here; she is a metrics consultant.
Her job: Erica's boss has a new project for her.
"What are my options?"
Company-Specific Models · Cross-Company Models · Business-Specific Models
Why business sector?
An Empirical Analysis of Software Productivity Over Time
Rahul Premraj, Bournemouth University, UK
Martin Shepperd, Brunel University, UK
Barbara Kitchenham*, National ICT Australia ([email protected])
Pekka Forselius, STTF Oy, Finland
Abstract
OBJECTIVE - the aim is to investigate how software project productivity has changed over time. Within this overall goal we also compare productivity between different business sectors and seek to identify major drivers.
METHOD - we analysed a data set of more than 600 projects that have been collected from a number of Finnish companies since 1978.
RESULTS - overall, we observed a quite pronounced improvement in productivity over the entire time period, though this improvement is less marked since the 1990s. However, the trend is not smooth. We also observed productivity variability between company and business sector.
CONCLUSIONS - whilst this data set is not a random sample, so generalisation is somewhat problematic, we hope that it contributes to an overall body of knowledge about software productivity and thereby facilitates the construction of a bigger picture.
Keywords: project management, projects, software productivity, trend analysis, empirical analysis.
1. Introduction
Given the importance and size of the software industry it is no surprise that there is a great deal of interest in productivity trends and in particular whether the industry, as a whole, is improving over time. Obviously this is a complex question for at least three reasons.
First, productivity is difficult to measure because the traditional definition, i.e., the ratio of outputs to inputs, requires that we have objective methods of measuring both commodities. Unfortunately, for software the notion of output is not straightforward. Lines of code are problematic due to issues of layout, differing languages and the fact that most software engineering activity does not directly involve code. An alternative is Function Points (FPs), in its various flavours, which although subject to some criticism [?] are in quite widespread use and so in a sense represent the least bad alternative. In our analysis the output (or size) measure collected is Experience Points 2.0 [?], a variant of FPs.
(* Barbara Kitchenham is also with Keele University, UK.)
Second, productivity is impacted by a very large number of factors, many of which are inherently difficult to assess, e.g., task difficulty, skill of the project team, ease of interaction with the customer/client and the level of non-functional requirements imposed, such as dependability and performance.
Third, there are clear interactions between many of these factors; for instance, it is easier to be productive if quality can be disregarded.
Despite these caveats, this paper seeks to analyse software project productivity trends from 1978-2003 from a data set of more than 600 projects from Finland. The projects are varied in size (6 - 5000+ FPs), business sector (e.g., Retail) and type (New Development or Maintenance). However, we believe there are sufficient data to draw some preliminary conclusions.
The remainder of the paper is organised as follows. The next section very briefly reviews some related work, including a similar, earlier study by Maxwell and Forselius [?]. Next we describe the data set used for our analysis. We then give the results of our analysis, first overall and then after splitting the data set into groups of more closely related projects. We conclude with a discussion of the significance of the results and some comments on the actual process of analysing the data.
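The paper's core measure, productivity as the ratio of output (size) to input (effort), can be sketched with a per-year median to expose a trend; the record format below is a hypothetical illustration, not the actual Finnish data set layout:

```python
import numpy as np

def yearly_median_productivity(records):
    """Median productivity per year, where productivity is delivered
    size per unit effort (e.g., Function Points per person-hour).
    `records` is a list of (year, size, effort) tuples -- a hypothetical
    format for illustration only."""
    by_year = {}
    for year, size, effort in records:
        by_year.setdefault(year, []).append(size / effort)
    return {year: float(np.median(p)) for year, p in sorted(by_year.items())}
```

The median rather than the mean is a reasonable choice here because productivity distributions in project data are typically skewed by a few extreme projects.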
METRICS, 2005
Business-Specific Models
[Chart: the Finnish data set, 788 projects in all, 395 after data cleaning (axis 0-800).]
Regression model: Effort = α · Size^β (effort regressed on size).
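The regression model Effort = α · Size^β is typically fitted by ordinary least squares after a log transform, since log(Effort) = log(α) + β·log(Size) is linear. A minimal sketch (function names are my own, not from the paper):

```python
import numpy as np

def fit_power_law(size, effort):
    """Fit Effort = alpha * Size^beta by ordinary least squares on
    log-transformed data; returns (alpha, beta)."""
    beta, log_alpha = np.polyfit(np.log(size), np.log(effort), 1)
    return np.exp(log_alpha), beta

def predict_effort(alpha, beta, size):
    """Predicted effort for a given size under the fitted model."""
    return alpha * np.asarray(size, dtype=float) ** beta
```

Fitting in log space means the model minimizes relative rather than absolute error, which suits effort data spanning several orders of magnitude.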
[Diagram: test sets are formed per company; the cleaned data are split across the five companies A, B, C, D, and E.]
Research Objectives
I. To develop company-specific cost models for comparisons against other models.
II. To develop cross-company cost models to compare against company-specific cost models.
III. To develop business-specific models to compare their accuracy against company-specific and cross-company cost models.
IV. To develop business-specific cost models to determine if they can be used by companies from other business sectors.
[Each objective has its own training-data/testing-data split.]
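One plausible reading of the four training-set selections can be sketched as a filter over project records; the dict-based record format is a hypothetical assumption, and in a full experiment objectives I and III would additionally exclude the held-out test projects from training:

```python
def training_set(scheme, target_company, target_sector, projects):
    """Select training projects for each research objective.
    `projects` is a list of dicts with 'company' and 'sector' keys --
    an illustrative format, not the study's actual data layout."""
    if scheme == "company-specific":     # objective I
        return [p for p in projects if p["company"] == target_company]
    if scheme == "cross-company":        # objective II
        return [p for p in projects if p["company"] != target_company]
    if scheme == "business-specific":    # objective III
        return [p for p in projects if p["sector"] == target_sector]
    if scheme == "cross-business":       # objective IV
        return [p for p in projects if p["sector"] != target_sector]
    raise ValueError(f"unknown scheme: {scheme}")
```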
[Charts: accuracy measures used throughout, Pred(25) and Pred(50) (higher is better) and MMRE and MdMRE (lower is better), all on a 0-100 scale; the same measures are used for all models for comparability.]
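The four accuracy measures shown in the charts are standard in cost estimation and can be computed from actual versus predicted effort as follows (a straightforward sketch using their usual definitions):

```python
import numpy as np

def accuracy_metrics(actual, predicted):
    """MMRE, MdMRE, Pred(25) and Pred(50) from actual vs. predicted
    effort. MMRE/MdMRE: lower is better. Pred(25)/Pred(50): percentage
    of predictions within 25% / 50% of the actual value; higher is better."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mre = np.abs(actual - predicted) / actual   # magnitude of relative error
    return {
        "MMRE": float(mre.mean()),
        "MdMRE": float(np.median(mre)),
        "Pred(25)": 100.0 * float((mre <= 0.25).mean()),
        "Pred(50)": 100.0 * float((mre <= 0.50).mean()),
    }
```

MdMRE is reported alongside MMRE because the mean MRE is easily dominated by a few badly predicted projects, while the median is robust to them.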
[Charts: Company-Specific Cost Models, training and testing within each company; Pred(25)/Pred(50) (higher is better) and MMRE/MdMRE (lower is better) for companies A-E, axis 0-100.]
[Charts: Cross-Company Cost Models, training on the other companies and testing on each company; Pred(25)/Pred(50) and MMRE/MdMRE for companies A-E, axis 0-100.]
[Charts: Business-Specific Cost Models, training and testing within each business sector; Pred(25)/Pred(50) and MMRE/MdMRE for companies A-E, axis 0-100.]
Cross-Business Cost Models
[Diagram: training on one business sector, testing on the others.]
• Projects from some sectors could be used to predict for projects from other sectors.
• For example, Retail-sector projects predicted with high accuracy (Pred(50) > 50%).
• But projects from a sector are best used to predict projects from that same sector.
Threats to Validity
• External: projects originated from Finland only.
• Internal: data cleaning removed nearly half the projects; only Size was used as an independent variable.
Conclusions
[Recap of the systematic review's ten studies: Company-Specific Models better (four studies), Cross-Company Models as good (four studies), No Trend (two studies). Erica's options now include Business-Specific Models alongside Company-Specific and Cross-Company Models.]
• No model performed consistently well across all experiments.
• Business-specific models performed comparably to company-specific models.
• Business-specific models performed better than cross-company models.
• Reducing heterogeneity in data may increase their applicability to problems...
• ... and lead to better prediction models.
Open Questions
• Can we use other algorithms such as decision trees and statistical clustering?
• What are the commonalities amongst projects?
• Does heterogeneity in data sets impact other software engineering areas?
Thank you!