Experience with Empirical Studies in Industry: Building...

5/20/13 USC-CSSE 1

Experience with Empirical Studies in Industry: Building Parametric Models

Barry Boehm, USC!boehm@usc.edu!

!CESI 2013!

May 20, 2013!!!

Outline •  Types of empirical studies with Industry

–  Types, benefits, challenges –  Comparative methods, emerging technologies,

parametric modeling •  Experiences with parametric modeling

–  Range of software engineering parametric models and forms

–  Goals: Model success criteria –  8-step model development process

•  Examples from COCOMO family of models

•  Conclusions 5/20/13 USC-CSSE 2

Types of Empirical Studies •  Comparative Methods: Inspection, Testing, Pair

Programming –  Benefits: Cost-effectiveness, Sweet spot insights –  Challenges: Representative projects, personnel, environment

•  Emerging Technologies: Agile, Model-Driven, Value-Based –  Benefits: Maturity, Cost-effectiveness, Sweet spot insights –  Challenges: Baselining, learning curve, subject skills

•  Parametric Modeling: Cost, Schedule, Quality Estimation –  Benefits: Budget realism, Progress monitoring, Productivity,

quality improvement areas –  Challenges: Community representativeness, Proprietary data,

data consistency 5/20/13 USC-CSSE 3

Value-Based Testing: Qi Li at Galorath, Inc. Business value of tests completed

5/17/13 Qi Li _Defense 4

APBIE-‐1 70.99%

APBIE-‐2 10.08%

APBIE-‐3 32.10%

–  H-‐t1: the value-‐based prioriAzaAon does not

increase APBIE –  reject H-‐t1

–  Value-‐based prioriAzaAon can improve the cost-‐

effecAveness of tesAng

74.07%

77.78%

83.95%

90.12% 93.83%

95.06%

100.00%

4.94% 6.17% 9.88%

16.05% 22.22%

25.93%

58.02%

35.80%

39.51%

45.68%

51.85%

58.02% 61.73%

87.65%

10.00%

20.00%

30.00%

40.00%

50.00%

60.00%

70.00%

80.00%

90.00%

100.00%

8 10 12 14 16 18 20 22 24 26 28 30

PBIE-1

PBIE-2

PBIE-3

Stop Testing

Value-based

Value-neutral

Worst Case

5/20/13 USC-CSSE 5

Range of SE Parametric Models •  Outcome = f (Outcome-driver parameters) •  Most frequent outcome families

–  Throughput, response time; workload –  Reliability, defect density; usage –  Project cost, schedule; sizing –  Other costs: facilities, equipment, services,

licenses, installation, training –  Benefits: sales, profits, operational savings –  Return on investment = (Benefits-Costs)/Costs

COQUALMO 1998

COCOMO 81 1981

COPROMO 1998

COSoSIMO 2007

Legend: Model has been calibrated with historical project data and expert (Delphi) data

Model is derived from COCOMO II Model has been calibrated with expert (Delphi) data

COCOTS 2000

COSYSMO 2005

CORADMO 1999,2012

iDAVE 2004

COPLIMO 2003

COPSEMO 1998

COCOMO II 2000

DBA COCOMO 2004

COINCOMO 2004,2012

COSECMO 2004

Software Cost Models

Software Extensions

Other Independent Estimation Models

Dates indicate the time that the first paper was published for the model

COTIPMO 2011

AGILE C II 2003

COCOMO Family of Cost Models

5/20/13 USC-CSSE

5/20/13 USC-CSSE 7

Parametric Model Forms

•  Analogy: Outcome = f(previous outcome, differences) –  Example: yesterday’s weather

•  Unit Cost: Outcome = f(unit costs, unit quantities) –  Example: computing equipment

•  Activity-Based: Outcome = f(activity levels, durations) –  Examples: operational cost savings, training costs

•  Relationship-Based: Outcome = f(parametric relationships) –  Examples: queuing models, size & productivity cost models

5/20/13 USC-CSSE 8

Goals: Model Success Criteria •  Scope: Covers desired range of situations? •  Granularity: Level of detail sufficient for needs? •  Accuracy: Estimates close to actuals? •  Objectivity: Inputs repeatable across estimators? •  Calibratability: Sufficient calibration data available? •  Contructiveness: Helps to understand job to be done? •  Ease of use: Parameters easy to understand, specify? •  Prospectiveness: Parameters values knowable early? •  Parsimony: Avoids unnecessary parameters, features? •  Stability: Small input changes mean small output

changes? •  Interoperability: Easy to compare with related models?

5/20/13 USC-CSSE 9

Outline!•  Range of software engineering parametric

models and forms •  Goals: Model success criteria •  8-step model development process

–  Example from COCOMO family of models •  Conclusions

5/20/13 USC-CSSE 10

Determine Model Needs!!Step 1!!

USC-CSE Modeling Methodology!

Analyze existing literature!!Step 2!!

Perform Behavioral analyses!!Step 3! Define relative

significance,data, ratings!Step 4!!

Perform expert-judgment Delphi assessment, formulate a priori model!Step 5!!

Gather project data!!Step 6! !

Determine Bayesian A-Posteriori model!Step 7! Gather more data;

refine model!!Step 8!!

- concurrency and feedback implied!

5/20/13 USC-CSSE 11

Step 1: Determine Model Needs

•  Similar to software requirements determination –  Identify success-critical stakeholders

•  Decision-makers, users, data providers –  Identify their model needs (win conditions) –  Identify their ability to provide inputs, calibration data –  Negotiate best achievable (win-win) model capabilities

•  Prioritize capabilities for incremental development •  Use Model Success Criteria as checklist

5/20/13 USC-CSSE 12

Major Decision Situations Helped by COCOMO II

•  Software investment decisions!–  When to develop, reuse, or purchase!–  What legacy software to modify or phase out!

•  Setting project budgets and schedules!•  Negotiating cost/schedule/performance

tradeoffs!•  Making software risk management decisions!•  Making software improvement decisions!

–  Reuse, tools, process maturity, outsourcing!

5/20/13 USC-CSSE 13

Step 2: Analyze Existing Literature •  Understand underlying phenomenology

–  Sources of cost, defects, etc. •  Identify promising or unsuccessful model

forms using Model Success Criteria –  Narrow scope, inadequate detail –  Linear, discontinuous software cost models –  Model forms may vary by source of cost,

defects, etc. –  Invalid assumptions (queuing models)

•  Identify most promising outcome-driver parameters

5/20/13 USC-CSSE 14

Nonlinear Reuse Effects!

0.25! 0.5! 0.75! 1.0!

0.046!

Usual Linear!Assumption!

Data on 2954!NASA modules![Selby, 1988]!

Cost fraction!

Fraction modified!

5/20/13 USC-CSSE 15

Reuse Cost Increment for Software Understanding!

Very Low Low Nom High Very HighStructure Very low

cohesion, highcoupling,

spaghetti code.

Moderately lowcohesion, high

coupling.

Reasonablywell -

structured;some weak

areas.

High cohesion,low coupling.

Strongmodularity,information

hiding indata/controlstructures.

ApplicationClarity

No matchbetween

program andapplication

world views.

Somecorrelation

betweenprogram andapplication .

Moderatecorrelation

Goodcorrelation

Clear matchbetween

world views.Self-

DescriptivenessObscure code;documentation

missing,obscure orobsolete.

Some codecommentary andheaders; some

usefuldocumentation.

Moderate levelof code

commentary,headers,

documentation.

Good codecommentaryand headers;

usefuldocumentation;

some weakareas.

Self-descriptive

code;documentationup-to-date,

well-organized,with designrationale.

SU Increment toESLOC

50 40 30 20 10

5/20/13 USC-CSSE 16

Step 3: Perform Behavioral Analysis

•  Behavior Differences: Required Reliability Levels

Rating Rqts and Product Design Integration and Test

Very Low • Little detail • Many TBDs • Little Verification • Minimal QA, CM, draft user manual, test plans • Minimal PDR

• No test procedures • Many requirements untested • Minimal QA, CM • Minimal stress, off-nominal tests • Minimal as-built documentation

Very High • Detailed verification, QA, CM, standards, PDR, documentaion • IV&V interface • Very detailed test plans, procedures

• Very detailed test procedures, QA, CM, standards, documentaion • Very extensive stress, off-nominal tests • IV&V interface

5/20/13 USC-CSSE 17

Determine Model Needs!!Step 1!!

USC-CSE Modeling Methodology!

Analyze existing literature!!Step 2!!

Perform Behavioral analyses!!Step 3! Define relative

significance,data, ratings!Step 4!!

Perform expert-judgment Delphi assessment, formulate a priori model!Step 5!!

Gather project data!!Step 6! !

Determine Bayesian A-Posteriori model!Step 7! Gather more data;

refine model!!Step 8!!

- concurrency and feedback implied!

5/20/13 USC-CSSE 18

Step 4: Relative Significance: COSYSMO Rate each factor H, M, or L depending on its relatively high, medium, or low influence on system engineering effort. Use an equal number of H’s, M’s, and L’s.!

N=6!3.0!2.5!2.3!1.5!1.7!1.7!1.5!1.2!!!!

1.5!2.7!2.7!3.0!2.0!1.5!2.0!1.3!

Application Factors!__H___Requirements understanding!_M - H_Architecture understanding!_L - H_ Level of service rqts. criticality, difficulty!_L - M_ Legacy transition complexity!_L – M COTS assessment complexity!_L - H_ Platform difficulty!_L – M_Required business process reengineering!_L – M_Database size !______ TBD!!Team Factors!_L - M_Number and diversity of stakeholder communities!_M - H_Stakeholder team cohesion!_M - H_Personnel capability/continuity!__ H__ Personnel experience!_L - H_ Process maturity!_L - M_Multisite coordination!_L - H_Degree of system engineering ceremony!_L - M_Tool support!______ TBD!______ TBD!

5/20/13 USC-CSSE 19

Step 4: Define Relations, Data, Rating Scales!PM estimated

= 3.67 i

× ( Size ) ( SF ) × EM i ∏ ⎛

⎜ ⎜ ⎜ ⎜

⎠ ⎟ ⎟ ⎟ ⎟

SF = . 0.91 + 0 . 01 × w i ∑

Scale Factors ( Wi ) Very Low Low Nominal High Very High Extra High

PREC thoroughly unprecedented largely

unprecedented somewhat unprecedented generally

familiar largely familiar throughly familiar

FLEX rigorous occasional relaxation some

relaxation general conformity

some conformity

general goals RESL little (20%) some (40%) often (60%) generally

(75%) mostly (90%) full (100%) TEAM very difficult

interactions some difficult interactions basically

cooperative interactions

largely cooperative

highly cooperative

seamless interactions

PMAT weighted sum of 18 KPA achievement levels

5/20/13 USC-CSSE 20

Step 5: Initial Delphi Assessment •  Data definitions and rating scales established for

significant parameters •  Convene experts, use wideband Delphi process

–  Individuals estimate each parameter’s outcome-influence value

•  E.g, ratio of highest to lowest effort multiplier –  Summarize results; group discussion of differences

•  Usually draws out significant experience –  Individuals re-estimate outcome-influence values –  Can do more rounds, but two generally enough

•  Produces mean, standard deviation of outcome-influence values

•  Often uncovers overlaps, changes in outcome drivers

5/20/13 USC-CSSE 21

Req Und

COSYSMO Requirements Understanding Delphi !

5/20/13 USC-CSSE 22

Step 6: Gather, Analyze Project Data

•  Best to pilot data collection with early adopters –  Identifies data definition ambiguities –  Identifies data availability problems –  Identifies need for data conditioning

•  Best to collect initial data via interviews –  Avoids misinterpretations

•  Endpoint milestones; activities included/excluded; size definitions

–  Uncovers hidden assumptions •  Schedule vs. cost minimization; overtime effort reported

5/20/13 USC-CSSE 23

Initial Data Analysis May Require Model Revision

•  Initial COCOTS model adapted from COCOMO II, with different parameters –  Effort = A* (Size)B* ∏(Effort Multipliers)

•  Amount of COTS integration glue code used for Size

•  Data analysis showed some projects with no glue code, much effort –  Effort devoted to COTS assessment, tailoring

5/20/13 USC-CSSE 24

COCOTS Effort Distribution: 20 Projects

Mean % of Total COTS Effort by Activity ( +/- 1 SD )

49.07% 50.99% 61.25%

20.27% 20.75% 21.76% 31.06%

11.31%

-7.57% -7.48% 0.88% 2.35%

-20.00% -10.00%

0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00%

assessment tailoring glue code system volatility

5/20/13 USC-CSSE 25

Revised COCOTS Model •  COCOMO-like model for glue code effort •  Unit cost approach for COTS assessment

effort –  Number of COTS products to assess –  Number of attributes to assess, weighted by

complexity •  Activity-based approach for COTS

tailoring effort –  COTS parameters setting, script writing,

reports layout, GUI tailoring, protocol definitions

5/20/13 USC-CSSE 26

Step 7: Bayesian Calibration •  Multiple regression analysis of project data

points (model inputs, actual outputs) produces outcome-influence values –  Mean, variance, statistical significance

•  For COCOMO II, 161 data points produced mostly statistically significant parameters values –  Productivity ranges of cost drivers –  One with wrong sign, low significance (RUSE)

•  Bayesian approach favors experts when they agree, data where results are significant –  Result: RUSE factor with correct sign

5/20/13 USC-CSSE 27

Results of Bayesian Update: Using Prior and Sampling Information

Literature,behavioral analysis

A-prioriExperts’ Delphi

Noisy data analysis

A-posteriori Bayesian update

Productivity Range =Highest Rating /Lowest Rating

1.451.51

Language and Tool Experience (LTEX)

5/20/13 USC-CSSE 28

Step 8 Example: Software Understanding Increment Too Large!

•  Needed to add a Programmer Unfamiliarity factor!Very Low Low Nom High Very High

Structure Very lowcohesion, high

coupling,spaghetti code.

Moderately lowcohesion, high

coupling.

Reasonablywell -

structured;some weak

areas.

High cohesion,low coupling.

Strongmodularity,information

hiding indata/controlstructures.

ApplicationClarity

No matchbetween

world views.

Somecorrelation

Moderatecorrelation

Goodcorrelation

Clear matchbetween

world views.Self-

DescriptivenessObscure code;documentation

missing,obscure orobsolete.

Some codecommentary andheaders; some

usefuldocumentation.

Moderate levelof code

commentary,headers,

documentation.

Good codecommentaryand headers;

usefuldocumentation;

some weakareas.

Self-descriptive

code;documentationup-to-date,

well-organized,with designrationale.

SU Increment toESLOC

50 40 30 20 10

5/20/13 USC-CSSE 29

Some Ways to Get Started •  Build on small empirical-study homework assignments

to local-industry students •  Assign empirical studies in industry short courses •  Look for industry pain points

–  COSYSMO: Need to be CMMI Level 3 in systems engineering

•  Have your grad students do empirical studies as summer interns –  Or do a similar internship yourself

Experience with Empirical Studies in Industry: Building...

Documents

Dewey and the Empirical Unity of Opposites · Dewey and the . Empirical Unity of Opposites i This paper concentrates on a single paragraph in John Dewey's Experience and Nature, one

Experiential marketing by attributes of experience design ...827658/FULLTEXT01.pdf · Experiential marketing by attributes of experience design for hotel APPs An empirical study from

Sur les collectifs humains - shs-conferences.org

The Relationship Between Tourism Experience Co …2018/04/01 · little empirical research regarding the relationship between tourism experience value co creation at the destination

MYSTICAL EXPERIENCE AS FUTURE SCIENCE · Mystical experience is the empirical (albeit subjective) proof of this complementary view. It is the experiential basis for the Vedantic expression,

Antecedents of Online Shopping Experience: An Empirical Study · An Empirical Study Arijit Bhattacharya¹ Manjari Srivastava² ... broadband service in both big metros and smaller

Empirical Research of European Experience and Russian Trends

Empirical Experience and Experimental Evaluation of

Value in experience - VTTThe starting point for this dissertation was the finding that the empirical user experience research data provided new insights with regard to user’s subjective,

An Empirical Study of Customer Satisfaction in Online ... Empirical Study of Customer Satisfaction in Online Shopping Experience of Tourism Products in India Dr. Shuchi Singhal1, Shashi

Exq an empirical contruct of customer experience quality (cranfield school of management)

Research-Based Innovation with Industry: Project …2013.icse-conferences.org/documents/publicity/MiSE-WS...Research-Based Innovation with Industry: Project Experience and Lessons

Political Judgemertt between Empirical Experience and ...hg0090/pdfe/Brakensiek_Political Judgement.… · Empirical Experience and Scfiolarty Tradition: EngeCbert Kaempfer's

Introduction · adlerian Therapy The inclusion of Adlerian theory is based more on clinical wisdom and experience than empirical evidence, although some empirical support exits. Adlerian

Capital Flight: China’s Experience* - Buffalo State …faculty.buffalostate.edu/qianx/index_files/China'sKF_oct...Capital Flight: China’s Experience Abstract We study the empirical

Re-epithelization and density of ... - bio-conferences.org

Experience Report - ISSTAissta2015.cs.uoregon.edu/slides/dahse-issta15.pdfExperience Report: Experience Report: An Empirical Study of PHP Security Mechanism Usage Johannes Dahse and

Antichresis Leases: Theory and Empirical Evidence from …Dr Geoffrey Turnbull).pdf · 1 Antichresis Leases: Theory and Empirical Evidence from the Bolivian Experience * Ignacio Navarro

An Empirical Study of a CS 1 Studio Experience Designing, Visual

Empirical investigation of how user experience is …publications.lib.chalmers.se/records/fulltext/250229/250229.pdf · Empirical investigation of how user experience is affected