1
Software Process and Management
Gu Qing, Nanjing University
2006-2-28
2
Chapter 3. Resource Estimation
Software Size Estimating
Duration and Cost Estimating
3
1. Software Size Estimating
Size prediction of the product deliverables needed to fulfill the project requirements
Sizing, estimating and scheduling are intertwined during the project planning process
All begin with the best possible understanding of the project breakdown, as shown in the WBS
The 5 useful techniques for sizing are: LOC, function point, feature point, blitz modeling, and wideband Delphi
Account for reuse when doing size estimation
4
Process of Sizing and Estimating
5
The Estimating Steps in Detail
Establish estimation objectives
  For funding requests, decisions, or planning
Plan the estimation activities, allocate the resources
  Domain experts, tools, detailed WBS
Clarify software requirements, document any assumptions
Explore as much detail as feasible
  The more that is explored, the more accurate the estimating results, and the less likely functions are to be missed
Use several independent techniques
  Strengths and weaknesses of various methods are complementary
Compare, understand and iterate estimates
Review estimate accuracy with actual data
6
Inputs to Size Estimating
Project proposal, project goal and scope, or statement of work
Statement of requirements
  Performance, specific features, results evaluation
Constraints, contract for services, procedures and standards to be used
Prior experience on similar tasks, historical estimating and actual data of the organization
System design information, concepts of software architecture
Reusable software information, programming languages to be used
7
Different Size Measures
Lines of code (LOC)
Function points
Feature points
Number of bubbles on a data-flow diagram
Number of entities on an entity-relationship diagram
Count of process / control boxes on a structure chart
Number of objects, attributes and services on an object diagram
Number of “shall” vs. “will” in a government specification or contract
Amount of documentation
It does not matter which measure is chosen, as long as it is used consistently
8
Estimation Techniques
Lines of Code
Function Points
Feature Points
Blitz Modeling
Wideband Delphi
9
Lines of Code
Why
  Over the last 20 years, the average programmer productivity rate has remained at about 3,000 LOC per programmer-year
Bottom-Up Summations
  Develop the most complete WBS
  Estimate the size of each lowest-level component (work package)
  Sum the sizes upward until the size of the whole product is obtained
10
Estimate the Lowest-level Size
Expert opinion or Analogy
Standard Component
Fuzzy Logic
11
Expert opinion or Analogy
Ask experts who have developed similar components, or base the estimate on similar old components
Provide optimistic, pessimistic, and realistic size estimates
Compute the final estimate: (optimistic + pessimistic + 4×realistic)/6
An example
  For a widget object, the size is estimated at 200~400 LOC, with a belief that it will be closer to 200 (realistic estimate: 250)
  The result will be: (200 + 400 + 250×4)/6 = 267 LOC
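The weighted average above can be sketched in a few lines of Python. The function name and parameter names are illustrative assumptions, not part of the original slides; the numbers are the widget example.

```python
def three_point_estimate(optimistic, realistic, pessimistic):
    """Weighted mean from the slide: (optimistic + pessimistic + 4*realistic) / 6."""
    return (optimistic + pessimistic + 4 * realistic) / 6


# Widget example: between 200 and 400 LOC, believed to be closer to 200.
widget_loc = three_point_estimate(optimistic=200, realistic=250, pessimistic=400)
print(round(widget_loc))  # 267
```

The realistic value carries four times the weight of either extreme, so a skewed belief ("closer to 200") pulls the result below the midpoint of the range.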
12
Standard Component (1)
Make standard components based on previous projects
Judge how many of these components are likely to be in the new program (work package)
  The smallest, likely, and largest number
Estimate the number of components
  Estimated number = (smallest + 4×likely + largest)/6
Compute the total LOC
13
Standard Component (2)
An illustration (S = smallest, M = likely, L = largest number of components)

Standard component      SLOC per component    S    M    L   X=(S+4×M+L)/6     SLOC
SLOC                                     1
Object instruction                    0.28
Files                                2,535    3    6   10            6.17   15,633
Modules                                932   11   18   22            17.5   16,310
Subsystems                           8,175
Screens                                818    5    9   21            10.3    8,453
Reports                                967    2    6   11            6.17    5,963
Interactive programs                 1,769
Batch programs                       3,214
Total                                                                       46,359
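A minimal sketch of the standard-component calculation, using the four component types that carry counts in the illustration. The dictionary layout and function names are assumptions made for this example.

```python
# (SLOC per component, smallest, likely, largest) taken from the illustration.
COMPONENTS = {
    "Files": (2535, 3, 6, 10),
    "Modules": (932, 11, 18, 22),
    "Screens": (818, 5, 9, 21),
    "Reports": (967, 2, 6, 11),
}


def expected_count(smallest, likely, largest):
    """Estimated number of components: (S + 4*M + L) / 6."""
    return (smallest + 4 * likely + largest) / 6


def total_sloc(components):
    """Sum of SLOC-per-component times expected count over all component types."""
    return sum(
        sloc_each * expected_count(s, m, l)
        for sloc_each, s, m, l in components.values()
    )


for name, (sloc_each, s, m, l) in COMPONENTS.items():
    print(name, round(expected_count(s, m, l), 2))
print("Total SLOC (unrounded rows):", round(total_sloc(COMPONENTS)))
```

Because this sketch sums unrounded rows, its total can differ by about one line from the slide's 46,359, which adds already rounded row values.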
14
Fuzzy Logic (1)
Get the fuzzy-logic size ranges (by Putnam)
  Get the expected range of program sizes, from smallest to largest
  Divide the range into 5 equal categories on a logarithmic scale
  Subdivide each category into 5 equal sets on the logarithmic scale
Make estimation
  Decide which category the new program falls in
  Compare the old programs in this category, place the new one into a sub-range
15
Fuzzy Logic (2)
Fuzzy-logic size ranges (in LOC); rows are categories, columns are sub-ranges

              very small     small    medium     large   very large
very small         1,148     1,514     2,000     2,630        3,467
small              4,570     6,025     8,000    10,471       13,804
medium            18,197    23,988    32,000    41,687       54,954
large             72,444    95,499   128,000   165,958      218,776
very large       288,403   380,189   512,000   660,693      870,964

Fuzzy-logic size ranges (in log(LOC))

              very small     small    medium     large   very large
very small          3.06      3.18      3.30      3.42         3.54
small               3.66      3.78      3.90      4.02         4.14
medium              4.26      4.38      4.50      4.62         4.74
large               4.86      4.98      5.10      5.22         5.34
very large          5.46      5.58      5.70      5.82         5.94
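The two tables are related by a base-10 logarithm with a constant step of 0.12 between neighbouring cells. The following sketch regenerates the grid from the smallest and largest log values; the function name and defaults are assumptions for illustration. The LOC values printed on the slide differ slightly in places from this computation (for example 32,000 on the slide versus about 31,623 here for the centre cell), because the log table is rounded to two decimals.

```python
def fuzzy_size_table(smallest_log=3.06, largest_log=5.94, categories=5, subranges=5):
    """Build a categories x subranges grid of sizes, equally spaced on a log10 scale."""
    cells = categories * subranges
    step = (largest_log - smallest_log) / (cells - 1)
    logs = [smallest_log + i * step for i in range(cells)]
    return [
        [round(10 ** value) for value in logs[row * subranges:(row + 1) * subranges]]
        for row in range(categories)
    ]


table = fuzzy_size_table()
for row in table:
    print(row)
```

Each row is one category (very small to very large) and each column one sub-range inside it, matching the layout of the tables above.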
16
Guidelines for Counting LOC
Ensure that each LOC counted contains only 1 source statement
Count all delivered, executable statements, including support utilities not used by the customer
Count data definitions once
Count each invocation, call or inclusion of a macro as part of the source (only once)
Do not count lines that contain only comments
Do not count temporary code such as debug or test code
Translate the LOC number to assembly language equivalent LOCs, to allow cross-project comparisons
17
LOC Conversion Table

Language                      Assembler SLOC   Average SLOC per Function Point
Basic Assembler                 1              320
C                               2.5            128~150
COBOL                           3              105~107
Pascal                          3.5            91
FORTRAN 95                      4.5            71
BASIC (ANSI)                    5              64
C++                             6              53
Java                            6              53
Ada 95                          6.5            49
Delphi                         11              29
UNIX Shell Scripts (PERL)      15              21
CORBA                          16              20
SQL                            25              13~16
HTML 3.0                       22              15
Excel                          50              6
18
Advantages of LOC
It is widely used and universally accepted
It permits comparison of size and productivity metrics between diverse projects
It measures software from the developer’s point of view
It directly relates to the end product, allowing for continuous improvement by post-project analysis
19
Disadvantages of LOC
LOC is difficult to estimate for new software early in the life cycle
There are no industry standards for counting LOC
LOC is hard to relate to functional requirements, and varies with the type of platform, design method and programmer style
LOC counts should distinguish between generated code and hand-crafted code
LOC cannot account for other costs such as requirement specifications and design documents
LOC emphasizes volume, but the essence of software is functionality and performance
20
Function Points
Put forward by A. J. Albrecht of IBM in the late 1970s, expanded by Capers Jones
In 1986, the International Function Point Users Group (IFPUG) was formed
In 1987, the British government adopted a modified FP as the standard software productivity metric
In 1994, IFPUG published the “Function Point Counting Practices Manual” v4.0
21
The Function Point Process
Count the functions in each category of end-user business functions
Establish the complexity of each function, assign weights for each complexity
Multiply each function by its weight and sum up to get a raw total FP
Apply environmental factors, calculate the complexity adjustment factor (CAF)
Compute the adjusted FPs
Convert the FPs to LOCs if needed
22
Category of Functions
Count only software requirements functions
  Outputs
  Inputs
  Inquiries
  Data Structures (Files)
  Interfaces
23
Illustration of Functions
[Figure: diagram of the function categories, labeled User (Person or Application), Business Process, Interfaces, System Boundary, and Files]
24
Counting Outputs
External things produced by the software that go outside the system
Units of business information produced for the end user
Each unit should be unique, i.e. have a different format or require different processing logic
Outputs can be counted using context or source / sink nodes in a data-flow diagram
Examples include screen data, report data, error messages, etc.
25
Counting Inputs
External things received by the software from outside the system
Units of business information input by the user for processing or storage
Each unit of input should be unique
26
Counting Inquiries (Input/Output)
External commands or requests generated from outside that cause a software response
e.g. Direct accesses to a database that are real-time, use simple keys, retrieve specific data, and perform no update
Each inquiry should be unique, i.e. have a different format in input / output portions, or require different processing logic
Count both the input and output portions, but with different complexity weighting factors
27
Associated Complexity Weighting
                           1~5 data items   6~19 data items   ≥20 data items
                           referenced       referenced        referenced
0 or 1 file referenced     Simple           Simple            Average
2 or 3 files referenced    Simple           Average           Complex
≥4 files referenced        Average          Complex           Complex
28
Counting Data Structures
Internal logical files within the software
Primary logical groups of user data permanently stored entirely within the boundary of the software system
Available to users via inputs, outputs, inquiries, or interfaces
29
Counting Interfaces
External machine generated files used by the software
Data (control) stored outside the boundary of the software system
Data shared between systems are counted as both interfaces and data structures
Count data and control flow in each direction as a unique interface
30
Associated Complexity Weighting
                                        1~19 data items   20~50 data items   ≥51 data items
                                        referenced        referenced         referenced
1 logical record format / relation      Simple            Simple             Average
2~5 logical record formats / relations  Simple            Average            Complex
≥6 logical record formats / relations   Average           Complex            Complex
31
Environmental Factors (1)
Environmental factors, with examples of high-scoring software
Data Communications: Software for a multinational bank that must handle electronic monetary transfers from financial institutions around the world
Distributed Computing: A Web search engine in which the processing is performed by more than a dozen servers working in tandem
Performance Requirements: An air-traffic-control system that must continuously provide accurate, timely positions of aircraft from radar data
Constrained Configuration: A university system in which hundreds of students register for classes simultaneously
Transaction Rate: Banking software that must perform millions of transactions overnight to balance all books before the next business day
Online Inquiry / Data Entry: Mortgage approval software for which clerical workers enter data interactively into a computer from paper applications
End-User Efficiency: A system with touch screens by which consumers at a subway station can purchase tickets using their credit cards
32
Environmental Factors (2)
Environmental factors, with examples of high-scoring software
Online Update: Airline system in which travel agents can book flights and obtain seat assignments
Complex Processing: Medical software that takes a patient’s various symptoms and performs extensive logical decisions to arrive at a preliminary diagnosis
Reusability: A word processor that is designed so that its menu bars can be incorporated into other applications such as a spreadsheet
Ease of Conversion / Install: An equipment-control application that non-specialists will install on an offshore oil rig
Ease of Operation: Software for analyzing historical financial records that minimizes the number of times the operators have to unload and reload different tapes
Used at Multiple Sites: Payroll software for a multinational corporation that must take into account different currencies, languages, and income tax rules
Potential for Function Change: Financial forecasting software that can issue monthly, quarterly, or yearly projections with different formats tailored to a particular business manager
33
Environmental Factors (3)
14 suggested environmental factors, each weighted on a scale of 0~5
CAF (complexity adjustment factor) is calculated by
  $\mathrm{CAF} = 0.65 + 0.01 \times \sum_{n=1}^{14} F_n$
Then CAF is between 0.65 and 1.35, i.e. ±35%
34
An Example (1)
Count the raw FPs

                   Simple        Average       Complex       Function Points
Outputs            12×4 = 48     11×5 = 55     5×7 = 35      138
Inputs             8×3 = 24      9×4 = 36      6×6 = 36      96
Inquiry Outputs    5×4 = 20      7×5 = 35      3×7 = 21      76
Inquiry Inputs     5×3 = 15      8×4 = 32      4×6 = 24      71
Files              12×7 = 84     3×10 = 30     2×15 = 30     144
Interfaces         9×5 = 45      6×7 = 42      4×10 = 40     127
Total raw FPs                                                652
35
An Example (2)

Factor                          Rating   Factor                       Rating
Data communications             5        Distributed computing        5
Performance requirements        3        Constrained configuration    0
Transaction rate                5        On-line inquiry / entry      4
End-user efficiency             5        On-line update               4
Complex processing              2        Reusability                  2
Ease of conversion / install    3        Ease of operation            4
Multiple sites                  5        Function change              4
Sum of influence factors (N)                                          51

CAF = 0.65 + (0.01×N) = 0.65 + (0.01×51) = 1.16
Adjusted FP = Raw FP × CAF = 652 × 1.16 = 756.32
LOC for C language = Adjusted FP × LOC per FP = 756.32 × 128 = 96,808.96 LOC
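The whole function point example can be traced in a short Python sketch. The dictionary names and function names are assumptions for illustration; the counts, weights and ratings are the ones shown in the two example slides, and 128 LOC per FP is the lower C figure from the conversion table.

```python
# (simple, average, complex) weights and counts per function category.
WEIGHTS = {
    "outputs": (4, 5, 7), "inputs": (3, 4, 6),
    "inquiry_outputs": (4, 5, 7), "inquiry_inputs": (3, 4, 6),
    "files": (7, 10, 15), "interfaces": (5, 7, 10),
}
COUNTS = {
    "outputs": (12, 11, 5), "inputs": (8, 9, 6),
    "inquiry_outputs": (5, 7, 3), "inquiry_inputs": (5, 8, 4),
    "files": (12, 3, 2), "interfaces": (9, 6, 4),
}
# The 14 environmental factor ratings, each on a 0~5 scale.
RATINGS = [5, 5, 3, 0, 5, 4, 5, 4, 2, 2, 3, 4, 5, 4]


def raw_function_points(counts, weights):
    """Multiply each count by its complexity weight and sum over all categories."""
    return sum(
        n * w for key in counts for n, w in zip(counts[key], weights[key])
    )


def complexity_adjustment_factor(ratings):
    """CAF = 0.65 + 0.01 * sum of the environmental factor ratings."""
    return 0.65 + 0.01 * sum(ratings)


raw_fp = raw_function_points(COUNTS, WEIGHTS)
caf = complexity_adjustment_factor(RATINGS)
adjusted_fp = raw_fp * caf
loc_in_c = adjusted_fp * 128  # LOC per FP for C, taken from the conversion table

print(raw_fp, round(caf, 2), round(adjusted_fp, 2), round(loc_in_c, 2))
```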
36
Advantages of Function Points
FP can be applied early in the software development life cycle, independent of programming language, application area and techniques
FP provides a reliable relation to effort; creating more FP per week or month is a desirable productivity goal
FP is requirements oriented, users can readily understand and relate it to software size
FP provides a mechanism to track and monitor scope creep, by counting FPs at the end of each stage
37
Disadvantages of Function Points
FP requires subjective evaluations, with much judgment involved
Many effort and cost models depend on LOC, so FPs must be converted
FP is not well-suited to non-MIS applications
FP counting is hard to automate
38
Feature Points
An extension of the FP method dealing with non-MIS applications
  e.g. embedded, system, real-time, mathematical, AI software, etc.
Those applications are usually heavy in algorithmic complexity but light on inputs and outputs
An algorithm is a bounded set of rules required to solve a definite problem with single entry and exit points
Count each input, output, inquiry, file, interface, and algorithm
39
Modified Environmental Factors
Logic complexity
  1 – Simple algorithms and calculations
  2 – Majority of simple algorithms
  3 – Average complexity of algorithms
  4 – Some difficult algorithms
  5 – Many difficult algorithms
Data complexity
  1 – Simple data
  2 – Numerous variables but simple relationships
  3 – Multiple fields, files, and interactions
  4 – Complex file structures
  5 – Very complex files and data relationships
Sum the 2 values and convert to CAF (0.6~1.4)
40
An Example
Functions     Average   Feature Points   Functions     Average   Feature Points
Inputs        12×4      48               Outputs       15×5      75
Files         22×7      154              Inquiries     17×4      68
Interfaces    8×7       56               Algorithms    43×3      129
Total raw Feature Points                                         530

Logic Complexity: Some difficult algorithms – 4
Data Complexity: Multiple fields, files, and interactions – 3
Complexity Adjustment Factor: sum = 7, CAF = 1.1
Adjusted Feature Points = Raw Feature Points × CAF = 530×1.1 = 583
LOC for the Java language = Adjusted Feature Points × 53 = 30,899
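A small sketch of the feature point example. The slides only state that the sum of the two complexity values is converted to a CAF between 0.6 and 1.4; the linear mapping used here (0.6 for a sum of 2, rising by 0.1 per point) is an assumption that reproduces the 1.1 shown for a sum of 7. All names are illustrative.

```python
# (count, average weight) per function type, from the example table.
FEATURES = {
    "inputs": (12, 4), "outputs": (15, 5), "files": (22, 7),
    "inquiries": (17, 4), "interfaces": (8, 7), "algorithms": (43, 3),
}


def raw_feature_points(features):
    """Sum of count times average weight over all function types."""
    return sum(count * weight for count, weight in features.values())


def feature_point_caf(logic_complexity, data_complexity):
    """Assumed linear mapping: sum 2 -> 0.6, sum 10 -> 1.4 (0.1 per point)."""
    return 0.6 + 0.1 * (logic_complexity + data_complexity - 2)


raw = raw_feature_points(FEATURES)
caf = feature_point_caf(logic_complexity=4, data_complexity=3)
adjusted = raw * caf
java_loc = adjusted * 53  # Java LOC per function point from the conversion table

print(raw, round(caf, 1), round(adjusted), round(java_loc))
```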
41
Advantages and Disadvantages
Advantages
  Same as FP, but also handles algorithmically intensive systems
Disadvantages
  Same as FP, plus a subjective classification of algorithmic complexity
42
Blitz Modeling
Based on Tom DeMarco’s bang metric
Count the component pieces of the system, then multiply the count by a productivity factor
The components may be: process bubbles, data flows, data repositories, entities, relationships, objects, attributes, etc.
The productivity factor may be based on historical data
43
Examples of Blitz Modeling
For a structured program described using a data-flow diagram
  Estimated size = Number of process bubbles × Average number of modules per bubble × Average module size
                 = 7 bubbles × 4 modules per bubble × 350 LOC (SQL) per module
                 = 9,800 LOC (SQL)
For an object-oriented program described using a class diagram
  Estimated size = Number of object classes × Average number of methods per class × Average method size
                 = 20 object classes × 5 methods per class × 75 LOC (C) per method
                 = 7,500 LOC (C)
44
Combined with Fuzzy Logic
C++ object size in LOC per method

Category       very small   small   medium   large   very large
calculation    2.34         5.13    11.25    24.66   54.04
data           2.60         4.79    8.84     16.31   30.09
I/O            9.01         12.06   16.15    21.62   28.93
logic          7.55         10.98   15.98    23.25   33.83
set-up         3.88         5.04    6.56     8.53    11.09
text           3.75         8.00    17.07    36.41   77.66

Pascal object size in LOC per method

Category       very small   small   medium   large   very large
control        4.24         8.68    17.79    36.46   74.71
display        4.72         6.35    8.55     11.50   15.46
file           3.30         6.23    11.74    22.14   41.74
logic          6.41         12.42   24.06    46.60   90.27
print          3.38         5.86    10.15    17.59   30.49
text           4.63         8.73    16.48    31.09   58.62
45
A Complicated Example

Component class   Number of methods in C++                                              Total LOC
A. Logic          12 large + 5 medium + 2 small = 12×23.25 + 5×15.98 + 2×10.98          380.86
B. Calculation    3 very large + 2 large + 3 medium = 3×54.04 + 2×24.66 + 3×11.25       245.19
C. I/O            5 large + 6 medium + 7 small = 5×21.62 + 6×16.15 + 7×12.06            289.42
D. Set-up         2 large + 1 medium + 2 small = 2×8.53 + 1×6.56 + 2×5.04               33.7
E. Text           3 very large + 6 large + 7 medium = 3×77.66 + 6×36.41 + 7×17.07       570.93
Sum of Total                                                                            1,520 (C++)
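The example combines blitz modeling (count the methods) with the fuzzy-logic table of C++ LOC per method. A minimal sketch, with the table rows that the example uses and hypothetical variable names:

```python
SIZES = ("very small", "small", "medium", "large", "very large")

# C++ LOC per method by category, from the table on the previous slide.
CPP_LOC_PER_METHOD = {
    "logic": (7.55, 10.98, 15.98, 23.25, 33.83),
    "calculation": (2.34, 5.13, 11.25, 24.66, 54.04),
    "I/O": (9.01, 12.06, 16.15, 21.62, 28.93),
    "set-up": (3.88, 5.04, 6.56, 8.53, 11.09),
    "text": (3.75, 8.00, 17.07, 36.41, 77.66),
}

# Number of methods of each size per component class, from the example.
METHOD_COUNTS = {
    "logic": {"large": 12, "medium": 5, "small": 2},
    "calculation": {"very large": 3, "large": 2, "medium": 3},
    "I/O": {"large": 5, "medium": 6, "small": 7},
    "set-up": {"large": 2, "medium": 1, "small": 2},
    "text": {"very large": 3, "large": 6, "medium": 7},
}


def class_loc(category, counts):
    """LOC for one component class: sum of method count times LOC per method."""
    per_method = dict(zip(SIZES, CPP_LOC_PER_METHOD[category]))
    return sum(n * per_method[size] for size, n in counts.items())


total_loc = sum(class_loc(cat, counts) for cat, counts in METHOD_COUNTS.items())
print(round(total_loc))  # 1520
```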
46
Advantages and Disadvantages
Advantages
  It is easy to use along with structured and object-oriented methodologies
  Accuracy increases with the use of historical data, allowing continuous improvement of estimation techniques
Disadvantages
  It requires the use of a design methodology, and needs historical data
  It does not evaluate environmental factors
47
Wideband Delphi
A group of experts (3~5 in high-risk areas) are each given the program’s specification and an estimation form
They meet to discuss the software (document) and any estimation issues
They each anonymously list their rationale and estimate the size, including a min, expected, and max value
The estimates are given to the coordinator, who tabulates the results
Results are given to each expert, with his/her own estimate and the mean value identified
The experts then meet to discuss the results, review the rationales and revise their estimates
The cycle continues until the estimates converge to an acceptable range (or 2 consecutive ones remain unchanged)
48
An Estimation Form
Project:
Estimator:                    Date:
Here is the range of estimates from the round:

  0      20      40      60      80      100
  X      X*      X!      X       X

  X – estimates
  X* – your estimate
  X! – median estimate
Please enter your estimate for the next round: ______ SLOC.
Please explain any rationale for your estimate.
49
Advantages of Wideband Delphi
The implementation is easy, and takes advantage of the expertise of several people
All participants become better educated about the software
It does not require historical data
It can be applied to both high-level and detailed estimation
50
Disadvantages of Wideband Delphi
It is difficult to repeat with a different group of experts
Experts may be all biased in the same subjective direction, and develop a false sense of confidence and an incorrect estimate
Experts may fail to reach a consensus
51
Problems with Estimating
"Software engineers are notoriously poor estimators" (Tom DeMarco)
The requirements are not well understood by developers / customers, with facts either missing or distorted by unsubstantiated opinions or biases
There is little or no historical data upon which to base future estimates
Management uses estimates as performance or motivational goals, and is reluctant to re-estimate
Developers are optimistic and desire to please their management, peers, and customers
52
Estimating Accuracy
[Figure: relative cost range of estimates across the phases Feasibility, Plans and Requirements, Product Design, Detailed Design, and Development and Test. Axis ticks: 4x, 2x, 1.5x, 1.25x, x, 0.8x, 0.67x, 0.5x, 0.25x. A marker indicates the Accepted Software Requirements Specification.]
53
Mitigate the Estimating Risks
Produce a WBS: divide and conquer, decomposing to the lowest level possible
Review assumptions with all stakeholders, including operations, maintenance and support departments
Wherever possible, research past organizational experience instead of just guessing
Stay in close communication with developers working on other parts of the system
Update estimates at frequent intervals; their accuracy improves over the course of the life cycle
Use multiple estimating methods to increase confidence
Educate software development staff in estimating methods
54
The Effect of Reuse
Reuse terminology
  New code – totally new, or with a large amount of modification
  Modified code – with a modest amount of modification
  Reused code – without any kind of change
Count the different types of LOC
  Examine the smallest level of unit, e.g. module (typically about 100 LOC)
    No change: Reused
    ≤ 50% changed: Modified
    > 50% changed: New
55
Count the Different LOC
Components New LOC Modified LOC Reused LOC Total LOC
A 1,233 0 0 1,233
B 0 988 0 988
C 0 0 781 781
D 560 245 0 805
E 345 549 420 1,314
Total 2,138 1,782 1,201 5,121
Count the different kinds of modification

Components   New LOC   Modified to Fix Bugs   Modified to Add Enhancements   Reused LOC   Total LOC
A            1,233     0                      0                              0            1,233
B            0         988                    0                              0            988
C            0         0                      0                              781          781
D            560       0                      245                            0            805
E            345       302                    247                            420          1,314
Total        2,138     1,290                  492                            1,201        5,121
56
Convert to New Code
Calculate the factors based on the actual data from the organization

                  New LOC   Modified LOC   Reused LOC   Total LOC
Delivered         2,138     1,782          1,201        5,121
Factor            100%      60%            30%
Equivalent Net    2,138     1,069          360          3,567

To be more accurate

Delivered   Process Step   Requirements   Design   Code   Integration   Equivalent Net
            Percent        18%            25%      25%    32%
2,138       New            100%           100%     100%   100%          2,138
1,782       Modified       20%            40%      70%    100%          1,124
1,201       Reused         10%            0%       0%     100%          406
5,121                                                                   3,668
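Both conversions to equivalent new code can be sketched as follows. The single-factor version multiplies each kind of code by one percentage; the more accurate version weights a per-step percentage by the share of effort in each process step. Names are illustrative assumptions; all figures come from the two tables.

```python
DELIVERED = {"new": 2138, "modified": 1782, "reused": 1201}

# One factor per kind of code.
SIMPLE_FACTORS = {"new": 1.00, "modified": 0.60, "reused": 0.30}

# Share of effort per process step, and per-step factors for each kind of code.
STEP_SHARE = {"requirements": 0.18, "design": 0.25, "code": 0.25, "integration": 0.32}
STEP_FACTORS = {
    "new": {"requirements": 1.0, "design": 1.0, "code": 1.0, "integration": 1.0},
    "modified": {"requirements": 0.2, "design": 0.4, "code": 0.7, "integration": 1.0},
    "reused": {"requirements": 0.1, "design": 0.0, "code": 0.0, "integration": 1.0},
}


def equivalent_simple(delivered, factors):
    """Equivalent new LOC using a single factor per kind of code."""
    return sum(loc * factors[kind] for kind, loc in delivered.items())


def equivalent_by_step(delivered, share, step_factors):
    """Equivalent new LOC using effort-weighted factors per process step."""
    return sum(
        loc * sum(share[step] * step_factors[kind][step] for step in share)
        for kind, loc in delivered.items()
    )


print(round(equivalent_simple(DELIVERED, SIMPLE_FACTORS), 1))
print(round(equivalent_by_step(DELIVERED, STEP_SHARE, STEP_FACTORS), 1))
```

The slide totals (3,567 and 3,668) add rounded row values, so the unrounded sums printed here can differ from them by less than one line.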
57
2. Duration and Cost Estimating
Effort means the amount of person-effort required to perform a task
Usually measured in person-hours, person-days or person-months
A useful standard for the person-month (staff-month) may be:
  19 person-days or 152 person-hours, as used by COCOMO for the USA
58
Typical Inputs to Effort Estimating
WBS – tasks to be performed
  Development tasks – system, requirements, design, code, test
  Support tasks – CM, QA, documents, management
Additional dollar costs
  Travel, equipment
Size estimates
Historical data on effort and productivity
High-level schedule
Process and methods
Programming language, tools used, target OS
Staff experience level
59
A Simple Example
Historical data
  For complex software: 4 (2~8) SLOC per person-day
  For simple software: 8 (6~10) SLOC per person-day
Size estimated for new software: 12,000 SLOC
  For complex software: 3,000 (1,500~6,000) person-days
  For simple software: 1,500 (1,200~2,000) person-days
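The arithmetic behind this example is a single division of size by productivity, repeated for the nominal rate and both ends of the historical range. A minimal sketch with an assumed function name:

```python
def effort_range(size_sloc, nominal_rate, low_rate, high_rate):
    """Return (nominal, minimum, maximum) effort in person-days.

    Rates are SLOC per person-day; the highest rate gives the minimum effort.
    """
    return (
        size_sloc / nominal_rate,
        size_sloc / high_rate,
        size_sloc / low_rate,
    )


print(effort_range(12_000, nominal_rate=4, low_rate=2, high_rate=8))   # complex software
print(effort_range(12_000, nominal_rate=8, low_rate=6, high_rate=10))  # simple software
```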
60
Productivity Factor (1)
LOC accounting for a module developed in 60 hours
  Added: +500
  Deleted: -200
  Modified: 100
  Reused: +600
[Figure: bar chart of the running module size, with axis levels 0, 300, 500 and 900]
Total LOC = 500 - 200 + 600 = 900
New & Changed LOC = 500 + 100 = 600
61
Productivity Factor (2)
LOC Productivity
Option LOC Productivity (LOC/Hour)
Added 500 8.33
Added + Modified 600 10.00
Added + Modified + Deleted 800 13.33
Added + Modified + Reused 1200 20.00
Added + Modified + Deleted + Reused 1400 23.33
Finished Product 900 15.00
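The table shows how strongly the productivity figure depends on which kinds of LOC are counted. A short sketch that recomputes each option from the 60-hour module; the dictionary and function names are assumptions.

```python
HOURS = 60
LOC = {"added": 500, "modified": 100, "deleted": 200, "reused": 600}


def loc_per_hour(kinds, loc=LOC, hours=HOURS):
    """Productivity when only the given kinds of LOC are counted."""
    return sum(loc[kind] for kind in kinds) / hours


# The finished product contains what was added and reused, minus what was deleted.
finished_product = LOC["added"] - LOC["deleted"] + LOC["reused"]

options = [
    ("added",),
    ("added", "modified"),
    ("added", "modified", "deleted"),
    ("added", "modified", "reused"),
    ("added", "modified", "deleted", "reused"),
]
for kinds in options:
    print(" + ".join(kinds), round(loc_per_hour(kinds), 2))
print("finished product", finished_product / HOURS)
```

The spread from 8.33 to 23.33 LOC per hour for the same work is why an organization needs one counting convention before comparing productivity figures.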
62
A Realistic Example (1)
Historical Data in IBM
Size in KCSI (thousands of instructions)

Product Class      <10    10-50   >50
Language           1.8    3.9     4.0
Control            1.6    1.8     2.4
Communications     1.0    1.6     2.0

Base: 3.9, means 400 LOC per person-month
Versus percent of new & changed code (% New or Changed)

Product Class      <20%   20-40%  >40%
Language           3.0    6.0     6.6
Control            1.5    2.3     2.3
Communications     1.4    1.8     1.9
63
A Realistic Example (2)
Size Estimates
Date: 5/25/87 Estimator: WSH Program: Satellite
Base Contingency Total
Component KLOC % KLOC KLOC
A Executive 9 100 9 18
B Function Calculation 15 100 15 30
C Control / Display 12 100 12 24
D Network Control 12 200 24 36
E Time Base Calculation 18 200 36 54
Total 66 145 96 162
64
A Realistic Example (3)
Effort Estimates

                                            Size   Productivity
Program Name              Program Class     KLOC   Base   Adjust.   LOC/PM   PM
A Executive               Control           18     400    1.8/3.9   185      97
B Function Calculation    Language          30     400    3.9/3.9   400      75
C Control / Display       Control           24     400    1.8/3.9   185      130
D Network Control         Communication     36     400    1.6/3.9   164      220
E Time Base Calculation   Language          54     400    4.0/3.9   410      132
Total                                       162                     248      654
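A sketch of how the effort column follows from the historical factors: each component's productivity is the 400 LOC/PM base scaled by its class factor relative to the 3.9 base factor, and effort is size divided by that productivity. Names are illustrative; the factors are read from the IBM size table for each component's class and size band.

```python
BASE_LOC_PER_PM = 400   # productivity that corresponds to the base factor
BASE_FACTOR = 3.9

# (component, size in KLOC including contingency, factor for its class and size band)
COMPONENTS = [
    ("A Executive", 18, 1.8),
    ("B Function Calculation", 30, 3.9),
    ("C Control / Display", 24, 1.8),
    ("D Network Control", 36, 1.6),
    ("E Time Base Calculation", 54, 4.0),
]


def adjusted_productivity(factor):
    """LOC per person-month after scaling the base by factor / base factor."""
    return BASE_LOC_PER_PM * factor / BASE_FACTOR


def effort_pm(kloc, factor):
    """Person-months needed for a component of the given size."""
    return kloc * 1000 / adjusted_productivity(factor)


total_pm = sum(effort_pm(kloc, factor) for _, kloc, factor in COMPONENTS)
for name, kloc, factor in COMPONENTS:
    print(name, round(adjusted_productivity(factor)), round(effort_pm(kloc, factor), 1))
print("Total PM:", round(total_pm, 1))
```

The slide rounds the productivity of each component before dividing, so its per-component and total figures (654 PM) differ from the unrounded values here by a fraction of a person-month.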
65
Project Duration and Staff Size
[Figure: Project Duration (1 Day, 1 Month, 3 Months, 6 Months, > 2 Years) plotted against Staff Size (1, 2, 4, 8, 16, >32), with regions labeled Quick response, Typical new product or service, Major asset, Slow response to market, Uninteresting, Almost impossible, Difficult to manage, and Unwieldy]
Too little staff makes the project stretch out too long
Too many staff working on an “urgent” project is often impossible to manage
66
The Estimation Paradigm
[Figure: diagram relating Budget, Resources and Schedule, contrasting the Achievable with the Desired]
67
COCOMO
COnstructive COst MOdel
A regression-based model developed by Barry W. Boehm
In the early 1970s, he analyzed 63 software projects to observe the relation between LOC and effort expended / schedule duration
68
3 COCOMO Modes (1)
Organic
  Systems such as payroll, inventory, and scientific calculation
  The project team is small, little innovation is required
  Constraints and deadlines are few, development environment is stable
Semidetached
  Systems such as compilers, database systems, and editors
  The project team is medium-size, some innovation is required
  Constraints and deadlines are moderate, development environment is somewhat fluid
69
3 COCOMO Modes (2)
Embedded
  Real-time systems such as air traffic control, ATMs, or weapon systems
  The project team is large, a great deal of innovation is required
  Constraints and deadlines are tight, development environment is complex
70
3 COCOMO Levels (1)
Basic
  Use only size and mode to determine the effort and schedule
  Useful for fast, rough estimates of small to medium-size projects
Intermediate
  Apply 15 additional variables to determine effort
  The environmental adjustment factor (EAF) relates to product, personnel, computer, and project attributes
71
3 COCOMO Levels (2)
Detailed
  Introduce the additional phase-sensitive effort multipliers and a 3-level product hierarchy
  Effort multiplier
    e.g. memory constraints for coding or testing phases, but not for analysis phase
  3-level product hierarchy
    System, subsystem, and module
    e.g. language experience may apply at the module level, analyst’s capability at the subsystem level, required reliability at the system level
72
Basic COCOMO
Basic Effort Formula
  $E = a \times (\mathrm{Size})^{b}$
  $\mathrm{TDEV} = c \times E^{d}$
a, b, c, d – constants derived from regression analysis
Size – KLOC
E – person-months
TDEV – months
73
Basic COCOMO Formulas
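Mode            Effort                                   Development Time
Organic         $E = 2.4 \times (\mathrm{Size})^{1.05}$    $\mathrm{TDEV} = 2.5 \times E^{0.38}$
Semidetached    $E = 3.0 \times (\mathrm{Size})^{1.12}$    $\mathrm{TDEV} = 2.5 \times E^{0.35}$
Embedded        $E = 3.6 \times (\mathrm{Size})^{1.20}$    $\mathrm{TDEV} = 2.5 \times E^{0.32}$

Average Staff = Effort ÷ TDEV
Productivity = Size ÷ Effort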
74
2 Examples
                Example 1                          Example 2
Project Mode    Organic                            Semidetached
Project Size    7.5 KLOC                           55 KLOC
Effort (PM)     $2.4 \times 7.5^{1.05} = 20$         $3.0 \times 55^{1.12} = 267$
TDEV (Month)    $2.5 \times 20^{0.38} = 8$           $2.5 \times 267^{0.35} = 17.67$
Staff           20÷8 = 2.5                         267÷17.67 = 15.11
Productivity    7,500÷20 = 375 LOC/PM              55,000÷267 = 206 LOC/PM
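The two examples can be reproduced with a small sketch of the Basic COCOMO formulas. The coefficient table is the one from the formulas slide; the function name and return layout are assumptions.

```python
# mode -> (a, b, c, d) for E = a * KLOC**b and TDEV = c * E**d
BASIC_COCOMO = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode):
    """Return effort (PM), schedule (months), average staff, productivity (LOC/PM)."""
    a, b, c, d = BASIC_COCOMO[mode]
    effort = a * kloc ** b
    tdev = c * effort ** d
    return effort, tdev, effort / tdev, kloc * 1000 / effort


for kloc, mode in [(7.5, "organic"), (55, "semidetached")]:
    effort, tdev, staff, productivity = basic_cocomo(kloc, mode)
    print(mode, round(effort), round(tdev, 1), round(staff, 1), round(productivity))
```

The slide rounds effort to whole person-months before computing schedule, staff and productivity, so the later columns here can differ from the slide in the last digit.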
75
Intermediate COCOMO
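Introduce 15 additional variables called cost drivers to estimate effort
The effort formula is changed
  Organic mode: $E = 3.2 \times (\mathrm{KLOC})^{1.05} \times \mathrm{EAF}$
  Semidetached mode: $E = 3.0 \times (\mathrm{KLOC})^{1.12} \times \mathrm{EAF}$
  Embedded mode: $E = 2.8 \times (\mathrm{KLOC})^{1.20} \times \mathrm{EAF}$
Where EAF (Effort Adjustment Factor) is
  $\mathrm{EAF} = C_1 \times C_2 \times \dots \times C_{15}$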
76
Cost Drivers (1)
4 categories and others
Product attributes
  Required reliability (L)
  Database size (L)
  Product complexity (L)
Computer attributes
  Execution time constraint (≥1)
  Main storage constraint (≥1)
  Virtual machine volatility (L)
  Computer turnaround time (L)
Project attributes
  Modern programming practices (H)
  Modern programming tools (H)
  Schedule compression (≥1)
77
Cost Drivers (2)
Personnel attributes
  Analyst capability (H)
  Application experience (H)
  Programmer capability (H)
  Virtual machine experience (H)
  Programming language experience (H)
Other cost drivers
  Requirements volatility (L)
  Development machine volatility (L)
  Security requirements (L)
  Access to data (L)
  Impact of standards and imposed methods (H)
  Impact of physical surroundings (H)
78
Product Complexity
Defined differently for 4 different applications
Control Operations
Computational Operations
Device-Dependent Operations
Data Management Operations
79
An Example (1)
Cost Driver Situation Rating Effort Multiplier
Required reliability Local use, no serious recovery problems Nominal 1.00
Database size 30,000 bytes Low 0.94
Product complexity Communication processing Very high 1.30
Time constraint Use 70% of available time High 1.11
Storage constraint Use 45M of 64M store (70%) High 1.06
Machine volatility Commercial microprocessor hardware Nominal 1.00
Turnaround time 2-hour average Nominal 1.00
Analyst capability Good senior analysts High 0.86
Application experience 3 years Nominal 1.00
Programmer capability Good senior programmers High 0.86
Machine experience 6 months Low 1.10
Language experience 12 months Nominal 1.00
Modern practices Most techniques in use over 1 year High 0.91
Modern tools Basic minicomputer tool level Low 1.10
Required schedule 10 months Nominal 1.00
EAF = 1.17
80
An Example (2)
An embedded-mode software with a size of 10 KLOC
  En = 2.8 × 10^1.20 = 44 PM
  E = En × EAF = 44 × 1.17 = 51.5 PM
Relax the capable personnel to normal staff
  Analyst and programmer capability multipliers will be 1
  Staff cost will decrease from $6,000 to $5,000 per PM
  EAF = 1.17 ÷ 0.86 ÷ 0.86 = 1.58
  E = En × EAF = 44 × 1.58 = 69.5 PM
  Total cost = 69.5 × 5,000 = $347,500
Compared with 51.5 × 6,000 = $309,000
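The trade-off in this example can be checked with a sketch of the Intermediate COCOMO calculation. The multiplier values are the ones from the cost-driver table of the example; the dictionary keys and function names are assumptions. Because this sketch keeps unrounded intermediate values, its person-month and cost figures come out slightly higher than the slide's, but the conclusion is the same: the cheaper, less capable staff costs more in total.

```python
from math import prod

MULTIPLIERS = {
    "required_reliability": 1.00, "database_size": 0.94, "product_complexity": 1.30,
    "time_constraint": 1.11, "storage_constraint": 1.06, "machine_volatility": 1.00,
    "turnaround_time": 1.00, "analyst_capability": 0.86, "application_experience": 1.00,
    "programmer_capability": 0.86, "machine_experience": 1.10, "language_experience": 1.00,
    "modern_practices": 0.91, "modern_tools": 1.10, "required_schedule": 1.00,
}


def effort_adjustment_factor(multipliers):
    """EAF is the product of all 15 cost-driver multipliers."""
    return prod(multipliers.values())


def embedded_effort(kloc, eaf):
    """Intermediate COCOMO, embedded mode: E = 2.8 * KLOC**1.20 * EAF."""
    return 2.8 * kloc ** 1.20 * eaf


capable_cost = embedded_effort(10, effort_adjustment_factor(MULTIPLIERS)) * 6000

# Replace the capable analysts and programmers with nominal staff (multiplier 1.0).
relaxed = dict(MULTIPLIERS, analyst_capability=1.0, programmer_capability=1.0)
relaxed_cost = embedded_effort(10, effort_adjustment_factor(relaxed)) * 5000

print(round(effort_adjustment_factor(MULTIPLIERS), 2), round(capable_cost))
print(round(effort_adjustment_factor(relaxed), 2), round(relaxed_cost))
```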
81
Detailed COCOMO (1)
The software product is decomposed into sub-products and components
  3-level hierarchy: system, subsystem, module
The project development activities are partitioned into phases
  For development: Requirements, Product design, Detailed design, Coding & unit test
  For maintenance: Integration & testing, Maintenance
All with different cost drivers (and value assignments) and specific coefficients
82
Detailed COCOMO (2)
Product hierarchy and applicable cost drivers
System: Required reliability, Machine volatility, Turnaround time, Modern practices, Modern tools, Required schedule
Subsystem: Add Database size, Time constraint, Storage constraint, Analyst capability, Application experience
Module: Further add Product complexity, Programmer capability, Machine experience, Language experience, KLOC, Adaptation of existing modules
83
Distribution of Effort and Schedule (1)
                          Plans and      Product                  Integration
                          Requirements   Design      Coding       and Test
% of total effort         1%             21%         50.5%        27.5%
Effort of each phase      6.9 PM         145.3 PM    349.5 PM     190.3 PM
% of total time           4%             33%         38%          25%
Schedule                  0.8 Months     6.6 Months  7.6 Months   5 Months
Average number of staff   8.6            22.0        46.0         38.0

Estimated: Total effort = 692 PM, TDEV = 20 Months
84
Distribution of Effort and Schedule (2)
                          Plans and      Product                  Integration
                          Requirements   Design      Coding       and Test
% of total effort         17%            25%         25%          33%
Effort of each phase      117.6 PM       173 PM      173 PM       228.4 PM
% of total time           30%            30%         15%          25%
Schedule                  6 Months       6 Months    3 Months     5 Months
Average number of staff   19.6           28.8        57.7         45.7

Current distribution: Total effort = 692 PM, TDEV = 20 Months
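Each row of the distribution tables follows from the totals and the two percentage rows: phase effort is a share of 692 PM, phase schedule is a share of 20 months, and average staff is their ratio. A minimal sketch with assumed names, using the current distribution:

```python
PHASES = ("plans_requirements", "product_design", "coding", "integration_test")
EFFORT_SHARE = dict(zip(PHASES, (0.17, 0.25, 0.25, 0.33)))
TIME_SHARE = dict(zip(PHASES, (0.30, 0.30, 0.15, 0.25)))


def phase_plan(total_effort_pm, tdev_months, effort_share, time_share):
    """Return {phase: (effort in PM, schedule in months, average staff)}."""
    plan = {}
    for phase in effort_share:
        effort = total_effort_pm * effort_share[phase]
        months = tdev_months * time_share[phase]
        plan[phase] = (effort, months, effort / months)
    return plan


for phase, (effort, months, staff) in phase_plan(692, 20, EFFORT_SHARE, TIME_SHARE).items():
    print(phase, round(effort, 1), round(months, 1), round(staff, 1))
```

Compressing coding into 15% of the schedule while it takes 25% of the effort is what drives the staffing peak of about 58 people in that phase.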
85
Tailoring of COCOMO
Each organization and project is unique, the model should be tailored for a given environment
Cost drivers may be added, modified, and deleted, different values may be assigned to each rating
Exponent and coefficient constants can be calibrated using actual data
At least 5 projects are required to re-calibrate the coefficient, 10 are needed to re-calibrate the exponent
86
Advantages of COCOMO
It is a repeatable process, allowing the addition of specific adjustment factors
The model can be refined with linear regression using historical data
It works well on projects that are not dramatically different in size, complexity, or process
It is easy to use with thorough documentation and many supporting tools
87
Disadvantages of COCOMO
It needs factors for requirements volatility, customer attributes, security issues, documentation issues, and many others
It depends on LOC estimates, which are hard to obtain at early stages but critical for COCOMO accuracy
It primarily represents development effort; maintenance, rework, porting and reuse issues don’t fit cleanly into the model
It assumes a very basic level of effort for SCM and SQA, only 5% of the total budget
It assumes a basic waterfall process model
88
COCOMO II
Allows the estimation of
Object-oriented software
Spiral or evolutionary software life cycle models
Software developed from COTS (commercial-off-the-shelf) software
89
3 Models of COCOMO II (1)
The application composition model
  For software built with GUI builder tools using rapid prototyping
  Suitable during the conceptual stages of a project
  Use object point estimates as size inputs
The early design model
  Suitable during the early design stages, before the entire architecture has been determined
  Use raw function point estimates as size inputs
90
3 Models of COCOMO II (2)
The post-architecture model
  Suitable after the development of the project’s overall architecture
  Use KLOC estimates as size inputs
91
5 Scaling Factors of COCOMO II
Used to replace the COCOMO modes
Precedent: if the product is familiar
Flexibility: if the requirements can be relaxed
Risk elimination: if major interfaces are specified and significant risks eliminated
Team: if the team is highly cooperative
Process Maturity: if the organization has a high CMM level
92
COCOMO II for Model 3
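Effort estimation
  $\mathrm{PM} = 2.45 \times A \times (\mathrm{KLOC})^{B} \times \mathrm{Sced}$
Schedule estimation
  $\mathrm{Tdev} = 3.67 \times \mathrm{PM}^{\,0.28 + 0.2 \times (B - 1.01)} \times \mathrm{Sced}$
Where
  $A = \prod_i \mathrm{EM}_i$
  $B = 0.91 + 0.01 \times \sum_j \mathrm{SF}_j$
  Sced – an overall schedule factor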
93
SLIM
In the 1960s, Peter V. Norden of IBM observed a Rayleigh distribution between project effort and time
In 1978, Lawrence H. Putnam applied Norden’s observation at his company Quantitative Software Management (QSM), and obtained his Software Lifecycle Management (SLIM) methodologies
Including SLIM-Estimate, SLIM-Control, and SLIM-Metrics
SLIM is used by US Army projects, suitable for large projects with >70KLOC and lifetime in years
SLIM has 4 phases: design & code, test & validation, maintenance, management, each with specific Rayleigh curves
94
SLIM Curves
[Figure: Staff Level plotted against Time, with curves labeled Requirements, Design & Code, Test & Validation, Maintenance, Management, and Project]
95
SLIM Equations (1)
Basic equation
  $S = C \times E^{1/3} \times t_d^{4/3}$
S – software size in LOC
E – total effort for the overall project (including maintenance) in person-years
$t_d$ – delivery time constraint in years
C – environmental and technology factor, rated from 610 up to 57,314
Typical rating of C
  Real-time embedded: 1,500
  Batch development: 4,894
  Supported and organized: 10,040
96
SLIM Equations (2)
Productivity Index or Manpower Acceleration
  $D_0 = E / t_d^{3}$
Then E can be calculated by
  $E = (S/C)^{9/7} \times D_0^{4/7}$
Typical values of $D_0$

Project Type                         D0 Value
Scientific systems                   14.8
Telecommunications                   11.4
Stand-alone systems                  15
Re-implementation of systems         27
Real-time systems                    8.3
Microcode systems                    6.3
Business systems                     17.3
New software with many interfaces    12.3
97
An Example
A software with estimated size 200,000 LOC, C is assigned to be 4,000, time constraint is 2 years
  $E = (1/t_d)^{4} \times (S/C)^{3} = (1/16) \times 50^{3} = 7{,}812.5$ PY
  Development effort $E_d = 39.45\% \times E = 3{,}082$ PY
Effort change when the time constraint varies

td     Ed      E
2      3,082   7,813
2.5    1,262   3,200
3      609     1,543

Note: a 10% decrease in the time constraint results in a 52% increase in total life-cycle effort
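Solving the basic SLIM equation for E gives the form used in this example, E = (S/C)^3 / td^4, and the table is just that formula evaluated at three delivery times. A short sketch with assumed function names; the 39.45% development share is the figure quoted in the example.

```python
def slim_total_effort(size_loc, c_factor, td_years):
    """Total life-cycle effort in person-years: E = (S / C)**3 / td**4."""
    return (size_loc / c_factor) ** 3 / td_years ** 4


def slim_development_effort(total_effort, development_share=0.3945):
    """Development effort as a fixed share of total life-cycle effort."""
    return development_share * total_effort


for td in (2, 2.5, 3):
    total = slim_total_effort(200_000, 4_000, td)
    print(td, round(slim_development_effort(total)), round(total, 1))

# Sensitivity: shortening the schedule by 10% multiplies effort by (1 / 0.9)**4.
increase = slim_total_effort(200_000, 4_000, 1.8) / slim_total_effort(200_000, 4_000, 2) - 1
print(f"{increase:.0%}")
```

Because delivery time enters to the fourth power, small schedule changes dominate the effort estimate, which is the point of the note above.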
98
Advantages of SLIM
Provide a comprehensive set of software development management tools, offer value-added planning for large projects
Provide an optimal staffing policy in the context of the development environment
It is a repeatable process, allowing refinements using historical data
Can apply a sensitivity analysis showing how cost and effort varies when time constraint changed
99
Disadvantages of SLIM
The Rayleigh distribution may not be suitable for the size-time-effort relationship
SLIM is based on many non-software projects, which makes it suspect for estimating software development
It works best for large projects measured in years, and may not be suitable for small projects
It assumes a waterfall life cycle, and does not map to the incremental, iterative or spiral processes suitable for modern software
SLIM is complex and sensitive; many important factors must be modeled and are difficult to determine
100
Summary
Software Size Estimating
  Estimating process, problems and risks; effect of reuse
  Sizing techniques: LOC, Function Points, Feature Points, Blitz modeling, Wideband Delphi
Duration and Cost Estimating
  Estimation paradigm
  COCOMO, COCOMO II, SLIM