An Internally Replicated Quasi-Experimental Comparison of Checklist and Perspective-Based Reading of Code Documents, Laitenberger et al., IEEE TOSE May 2001

An Internally Replicated Quasi-Experimental Comparison of

Checklist and Perspective-Based Reading of Code Documents

Laitenberger et al.

IEEE TOSE May 2001

Exp Design Guidelines - TTYP

D1: Identify the population from which the subjects and objects are drawn.

D2: Define the process by which the subjects and objects were selected.

D3: Define the process by which subjects and objects are assigned to treatments.
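
As one illustration of D3, an assignment process is easier to report when it is reproducible. The following is a minimal sketch, assuming subjects are assigned at random and the seed is recorded; the subject IDs, the treatment labels, and the `assign_to_treatments` helper are hypothetical and are not taken from either paper. A quasi-experiment, by definition, does not assign subjects at random, so this only shows what a documented random assignment might look like.

```python
import random


def assign_to_treatments(subjects, treatments, seed):
    """Randomly assign subjects to treatments in near-equal group sizes.

    Hypothetical helper for illustration; recording `seed` makes the
    assignment reproducible and therefore reportable.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    # Deal shuffled subjects round-robin so group sizes differ by at most one.
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


if __name__ == "__main__":
    subjects = [f"S{n:02d}" for n in range(1, 11)]  # made-up subject IDs
    groups = assign_to_treatments(subjects, ["checklist", "perspective"], seed=2001)
    for name, members in groups.items():
        print(name, sorted(members))
```

Reporting the seed together with the procedure lets a reader regenerate the exact groups.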

Experimental Design

Conduct and collect – DC5 and 6?

DC5: For observational studies and experiments, record data about subjects who drop out from the studies.

DC6: For observational studies and experiments, record data about other performance measures that may be affected by the treatment, even if they are not the main focus of the study.

Analysis

A1: Specify any procedures used to control for multiple testing.

A2: Consider using blind analysis.

A3: Perform sensitivity analyses.
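
For A1, one commonly used control for multiple testing is the Holm-Bonferroni step-down procedure. The sketch below is an illustration only; neither paper is claimed to use it, and the function name `holm_bonferroni` and the p-values in the usage lines are made up.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Holm's step-down procedure: sort the p-values in ascending order and
    compare the k-th smallest (k = 0, 1, ...) against alpha / (m - k),
    stopping at the first comparison that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - k):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject


if __name__ == "__main__":
    # Invented p-values for four hypothetical tests.
    print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

Whatever procedure is chosen, the guideline asks only that it be stated explicitly.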

Analysis

A4: Ensure that the data do not violate the assumptions of the tests used on them.

A5: Apply appropriate quality control procedures to verify your results.

A3: Perform sensitivity analyses.
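
A simple form of the sensitivity analysis in A3 is to recompute a statistic with each observation left out in turn and see how far the result moves. This is a minimal sketch under that assumption; the `leave_one_out` helper and the sample values are hypothetical.

```python
import statistics


def leave_one_out(values, statistic=statistics.mean):
    """Recompute `statistic` once per observation, omitting that observation."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    return [statistic(values[:i] + values[i + 1:]) for i in range(len(values))]


if __name__ == "__main__":
    data = [1, 2, 3, 100]  # invented data with one extreme value
    results = leave_one_out(data)
    print("full mean:", statistics.mean(data))
    print("leave-one-out means:", results)
    print("spread:", max(results) - min(results))
```

A large spread indicates that the conclusion depends heavily on a single data point.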

Presentation

P1: Describe or cite a reference for all statistical procedures used.

P2: Report the statistical package used.

P3: Present quantitative results as well as significance levels. Quantitative results should show the magnitude of effects and the confidence limits.
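
To make P3 concrete, the sketch below reports a magnitude (difference in group means) together with confidence limits. It assumes a percentile bootstrap, which is only one of several ways to obtain such limits and is not claimed to be the method of either paper; the function name `mean_difference_with_ci` and the data are invented.

```python
import random
import statistics


def mean_difference_with_ci(group_a, group_b, level=0.95, n_resamples=2000, seed=0):
    """Return (difference in means, lower limit, upper limit).

    The limits come from a percentile bootstrap: resample each group with
    replacement, recompute the difference, and read off the percentiles.
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1 + level) / 2 * n_resamples))]
    return observed, lower, upper


if __name__ == "__main__":
    # Invented scores for two hypothetical groups.
    a = [10, 12, 11, 13, 12, 14]
    b = [7, 8, 9, 8, 7, 9]
    diff, low, high = mean_difference_with_ci(a, b)
    print(f"difference = {diff:.2f}, 95% limits = [{low:.2f}, {high:.2f}]")
```

Reporting the interval alongside the p-value shows how large the effect plausibly is, not just whether it is nonzero.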

Presentation

P5: Provide appropriate descriptive statistics.

P4: Present the raw data whenever possible. Otherwise, confirm that they are available for confidential review by the reviewers and independent auditors.

P6: Make appropriate use of graphics.

Interpretation

I1: Define the population to which inferential statistics and predictive models apply.

I2: Differentiate between statistical significance and practical importance.

I3: Define the type of study.

I4: Specify any limitations of the study.
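
For I2, a standardized effect size is one conventional way to talk about practical importance separately from a p-value. The sketch below computes Cohen's d with a pooled standard deviation; this choice of measure is an assumption for illustration and is not prescribed by the guideline, and the data are made up.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


if __name__ == "__main__":
    # Invented measurements for two hypothetical groups.
    print(cohens_d([2, 4, 6], [1, 3, 5]))
```

A result can be statistically significant with a very small d, or fail to reach significance with a large d in a small sample, which is the distinction I2 asks authors to draw.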

Results?

Detection Effectiveness

Cost per defect

An Experiment Measuring the Effects of Personal Software

Process (PSP) Training

Prechelt and Unger

IEEE TOSE May 2001

What did you learn? Two Issues

PSP

Experimentation

What are goals of PSP?

PSP success

These data show, for instance, that the estimation accuracy increases considerably, the number of defects introduced per 1,000 lines of code (KLOC) decreases by a factor of two, the number of defects per KLOC to be found late during development (i.e., in test) decreases by a factor of three or more, and productivity is not reduced.

What is problem/issue?

C2: If a specific hypothesis is being tested, state it clearly prior to performing the study and discuss the theory from which it is derived, so that its implications are apparent.

The experiment investigated the following hypotheses (plus a few less important ones not discussed here, see [11]):

Reliability. PSP-trained programmers produce a more reliable program for the phoneword task than non-PSP-trained programmers.

Estimation Accuracy. PSP-trained programmers estimate the time they need for solving the phoneword task more accurately than non-PSP-trained programmers.

Exp Context Guidelines

C1: Be sure to specify as much of the industrial context as possible. In particular, clearly define the entities, attributes, and measures that are capturing the contextual information.

Exp Context Guidelines

C3: If the research is exploratory, state clearly and, prior to data analysis, what questions the investigation is intended to address and how it will address them.

C4: Describe research that is similar to, or has a bearing on, the current research and how current work relates to it.

Exp Design Guidelines

D1: Identify the population from which the subjects and objects are drawn.

D2: Define the process by which the subjects and objects were selected.

D3: Define the process by which subjects and objects are assigned to treatments.

Exp Design Guidelines

D4: Restrict yourself to simple study designs or, at least, to designs that are fully analyzed in the statistical literature. If you are not using a well-documented design and analysis method, you should consult a statistician to see whether yours is the most effective design for what you want to accomplish.

Exp Design Guidelines

D8: If you cannot avoid evaluating your own work, then make explicit any vested interests (including your sources of support) and report what you have done to minimize bias.

D7: Use appropriate levels of blinding.

D9: Avoid the use of controls unless you are sure the control situation can be unambiguously defined.

Exp Design Guidelines

D11: Justify the choice of outcome measures in terms of their relevance to the objectives of the empirical study.

D10: Fully define all treatments (interventions).

Conduct and collect

DC1: Define all software measures fully, including the entity, attribute, unit and counting rules.

DC2: For subjective measures, present a measure of interrater agreement, such as the kappa statistic or the intraclass correlation coefficient for continuous measures.

DC3: Describe any quality control method used to ensure completeness and accuracy of data collection.
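
The kappa statistic named in DC2 can be sketched for the two-rater case (Cohen's kappa) as follows. This is an illustration only: the `cohens_kappa` helper, the category labels, and the ratings are hypothetical, and it covers categorical ratings, not the intraclass correlation used for continuous measures.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented classifications: "d" = defect, "n" = not a defect.
    print(cohens_kappa(["d", "d", "n", "n"], ["d", "n", "n", "n"]))
```

Kappa corrects raw agreement for the agreement two raters would reach by chance, which is why the guideline prefers it to a simple percentage.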

Conduct and collect

DC4: For surveys, monitor and report the response rate and discuss the representativeness of the responses and the impact of nonresponse.

DC5: For observational studies and experiments, record data about subjects who drop out from the studies.

DC6: For observational studies and experiments, record data about other performance measures that may be affected by the treatment, even if they are not the main focus of the study.

Analysis

A1: Specify any procedures used to control for multiple testing.

A2: Consider using blind analysis.

A3: Perform sensitivity analyses.

Analysis

A4: Ensure that the data do not violate the assumptions of the tests used on them.

A5: Apply appropriate quality control procedures to verify your results.

A3: Perform sensitivity analyses.

Presentation

P1: Describe or cite a reference for all statistical procedures used.

P5: Provide appropriate descriptive statistics.

P2: Report the statistical package used.

P3: Present quantitative results as well as significance levels. Quantitative results should show the magnitude of effects and the confidence limits.

P4: Present the raw data whenever possible. Otherwise, confirm that they are available for confidential review by the reviewers and independent auditors.

P6: Make appropriate use of graphics.

Interpretation

I1: Define the population to which inferential statistics and predictive models apply.

I2: Differentiate between statistical significance and practical importance.

I3: Define the type of study.

I4: Specify any limitations of the study.

Results?

Output reliability

Surprising data (no digit)

Effort estimation error

Estimated/actual productivity

Productivity

Productivity

effort

For Thurs 9/19

Smith, Hale and Parrish, “An Empirical Study Using Task Assessment to Improve the Accuracy of Software Effort Estimation”, IEEE TOSE Mar 2001