Yasutaka Kamei, Shinsuke Matsumoto, Akito Monden, Ken-ichi Matsumoto, Bram Adams, Ahmed E. Hassan
Revisiting Common Bug Prediction Findings Using Effort-Aware Models
Effort-aware Models*
* T. Mende and R. Koschke, “Effort-aware defect prediction models,” in Proc. of European Conference on Software Maintenance and Reengineering (CSMR’10), 2010, pp. 109–118.
File A: # BUGS: 5, Effort: 1, bugs per unit of effort: 5 / 1 = 5
File B: # BUGS: 6, Effort: 20, bugs per unit of effort: 6 / 20 = 0.30
Effort is measured in SLOC.
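The File A / File B comparison can be sketched in a few lines of Python. This is only an illustration of ranking by bugs per unit of effort; the tuple layout and the function name `bugs_per_effort` are assumptions made for this sketch, not something taken from the slides or from Mende and Koschke's paper.

```python
# Module records from the slide example: (name, number of bugs, effort).
modules = [("File A", 5, 1), ("File B", 6, 20)]


def bugs_per_effort(bugs, effort):
    """Return the number of bugs per unit of effort (hypothetical helper)."""
    return bugs / effort


# Rank modules so that the one with the most bugs per unit of effort comes first.
ranked = sorted(modules, key=lambda m: bugs_per_effort(m[1], m[2]), reverse=True)

for name, bugs, effort in ranked:
    print(f"{name}: {bugs} / {effort} = {bugs_per_effort(bugs, effort):.2f}")
```

File B has more bugs in total, yet File A ranks first because inspecting it costs far less effort per bug.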
Major Findings in Prediction Studies
Process metrics are better defect predictors than product metrics (RQ1): [Graves2000TSE], [Nagappan2006ICSE], [Moser2008ICSE], …
Package-level prediction has higher precision and recall than file-level prediction (RQ2): [Schroter2006ISESE], [Zimmermann2007PROMISE], …
…
Cross-release Prediction
Training data: measure metrics and bugs on Platform 3.1, then build a prediction model.
Testing data: measure metrics and bugs on Platform 3.2, then use the model to predict bugs (example per-module values: BUGS: 5, BUGS: 7, BUGS: 0).
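The cross-release setup can be sketched as follows: fit a model on data measured for the earlier release, then apply it to the later one. The sketch swaps in a one-variable least-squares fit purely for illustration; the slides do not state the model used at this point, and the metric values below are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up training data measured on the earlier release:
# number of revisions and number of bugs per file.
train_revisions = [2, 4, 6, 10]
train_bugs = [1, 2, 3, 5]

# Made-up testing data measured on the later release (metrics only).
test_revisions = [10, 14, 0]

# Build the model on the earlier release, then predict for the later one.
intercept, slope = fit_line(train_revisions, train_bugs)
predicted_bugs = [intercept + slope * r for r in test_revisions]
print(predicted_bugs)
```

The point of the split is that the model never sees bug counts from the release it is evaluated on.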
Cumulative Lift Chart
All modules are ordered by decreasing predicted RR(x).
[Figure: cumulative lift chart. x-axis: KSLOC (= Effort); y-axis: # Bugs. Marked point: 20% of the effort, 54% of the bugs.]
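Reading a value off a cumulative lift chart can be sketched as below. The sketch assumes that RR(x) is the predicted number of bugs per unit of effort, that whole modules are inspected in decreasing order of RR(x), and that inspection stops at the first module that no longer fits the effort budget. The module data and the function name are hypothetical.

```python
def bugs_found_at_effort(modules, effort_fraction):
    """Fraction of all actual bugs found within an effort budget.

    Each module is a tuple (predicted_risk, effort, actual_bugs).
    Modules are visited in decreasing order of predicted risk, and the
    walk stops at the first module that would exceed the budget.
    """
    total_effort = sum(m[1] for m in modules)
    total_bugs = sum(m[2] for m in modules)
    budget = effort_fraction * total_effort

    spent = 0.0
    found = 0
    for _, effort, bugs in sorted(modules, key=lambda m: m[0], reverse=True):
        if spent + effort > budget:
            break
        spent += effort
        found += bugs
    return found / total_bugs


# Hypothetical modules: (predicted RR(x), effort in KSLOC, actual bugs).
modules = [(5.0, 1, 5), (0.3, 20, 6), (2.0, 1, 2), (0.1, 18, 1)]

share = bugs_found_at_effort(modules, 0.20)
print(f"{share:.0%} of the bugs found with 20% of the effort")
```

A steeper curve near the origin means more bugs are found for the same inspection effort.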
Research Questions
RQ1: Are process metrics still more effective than product metrics in effort-aware models?
RQ2: Are package-level predictions still more effective than file-level predictions?
RQ1: Process vs. Product Metrics
Compare prediction models based on process and product metrics at the file-level
Process Metrics are still more effective than Product Metrics
[Figure: cumulative lift chart. x-axis: KSLOC (= Effort); y-axis: # Bugs. At 20% of the effort: 74% of the bugs with process metrics, 29% with product metrics, a ratio of 2.6 (= 74/29).]
Impact of Process and Product Metrics
The top five metrics are all process metrics
[Figure: variable importance (IncNodePurity) for model rf1. Metrics plotted: NSM, NSC, Refactorings, NORM, LCOM, SIX, DIT, NSF, NOF, NOM, WMC, VG, PAR, MLOC, NBD, SLOC, LOCAdded, Codechurn, LOCDeleted, Age, BugFixes, Revisions. Labeled as process metrics: Revisions, BugFixes, LOCDeleted, Age, Codechurn, LOCAdded, Refactorings. The remaining metrics are product metrics.]
Research Questions
RQ1: Are process metrics still more effective than product metrics in effort-aware models? YES
RQ2: Are package-level predictions still more effective than file-level predictions?
Model Building Approach
B1 Package-level metrics
B2 Lift file-level metrics to package-level
B3 Lift file-level predictions to package-level
B1 Package-level Metrics
Martin metrics -> Build a model -> Package-level predictions
B2 Lift File-level Metrics to Package-level
File-level metrics -> Lift metrics -> Package-level metrics -> Build a model -> Package-level predictions
Example: Package A contains File a, File b, and File c with complexity 9, 4, and 5. The lifted package-level complexity is 6 = (9 + 4 + 5) / 3.
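The Package A example for B2 can be sketched as follows. The slide example averages the file-level values; whether every metric is lifted by averaging is not stated, so treat the aggregation and the helper name `lift_metric` as assumptions of this sketch.

```python
from statistics import mean

# File-level complexity values for the files of Package A, as on the slide.
file_complexity = {"File a": 9, "File b": 4, "File c": 5}


def lift_metric(file_values):
    """Lift a file-level metric to package level by averaging over files."""
    return mean(file_values)


package_complexity = lift_metric(list(file_complexity.values()))
print(package_complexity)  # 6
```

The lifted value then serves as an input metric for a model built at the package level.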
B3 Lift File-level Predictions to Package-level
File-level metrics -> Build a model -> File-level predictions -> Lift predictions -> Package-level predictions
Example: Package A contains File a, File b, and File c with predicted #bugs 6, 3, and 2 and KSLOC 1.0, 0.5, and 1.5. The lifted package-level prediction is (6 + 3 + 2) / (1.0 + 0.5 + 1.5) = 11 / 3.0 ≈ 3.7 (the slide shows 2.9 for this expression).
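The Package A example for B3 can be sketched as below. The sketch assumes the lifted value is the sum of the predicted file-level bugs divided by the sum of the file sizes in KSLOC, which is one reading of the formula on the slide; the function name is hypothetical.

```python
# Predicted bugs and size in KSLOC for each file of Package A, as on the slide.
files = [("File a", 6, 1.0), ("File b", 3, 0.5), ("File c", 2, 1.5)]


def lift_predictions(files):
    """Lift file-level predictions to a package-level bugs-per-KSLOC value."""
    total_bugs = sum(bugs for _, bugs, _ in files)
    total_ksloc = sum(ksloc for _, _, ksloc in files)
    return total_bugs / total_ksloc


package_prediction = lift_predictions(files)
print(f"{package_prediction:.2f} predicted bugs per KSLOC")
```

Unlike B2, the model here is built on file-level data, and only its output is aggregated per package.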
Summary of Model Building Approaches
B1 Martin Metrics: Package-level metrics -> Build a model at package-level -> Package-level predictions
B2 LiftUp(Input): File-level metrics -> Lift metrics -> Package-level metrics -> Build a model at package-level -> Package-level predictions
B3 LiftUp(Pred): File-level metrics -> Build a model at file-level -> File-level predictions -> Lift predictions -> Package-level predictions
Lifting Predictions yields the Best Performance at Package-level
[Figure: cumulative lift chart. x-axis: KSLOC (= Effort); y-axis: # Bugs. At 20% of the effort: 62% of the bugs for B3 LiftUp(Prediction), 57% for B2 LiftUp(Input), 19% for B1 Martin Metrics.]
Impact of Martin Metrics
[Figure: variable importance (IncNodePurity) for model rf3. Labeled as Martin metrics: Ce, D, I, NC, Ca, NA, A. Labeled as process metrics: Refactorings, Revisions, BugFixes, LOCDeleted, Codechurn, LOCAdded, Age. The remaining metrics are product metrics.]
File-level Predictions are more effective than Package-level
[Figure: cumulative lift chart. x-axis: KSLOC (= Effort); y-axis: # Bugs. At 20% of the effort: 74% of the bugs at file level, 62% at package level with B3 LiftUp(Pred.).]
Research Questions
RQ1: Are process metrics still more effective than product metrics in effort-aware models? YES
RQ2: Are package-level predictions still more effective than file-level predictions? NO
Why is RQ2 not Supported?
The larger the package, the more likely a bug is introduced.
Package-level: # BUGS: 8, SLOC: 20 (8 / 20 = 0.4 bugs per unit of effort)
File-level: # BUGS: 2, SLOC: 0.5 (2 / 0.5 = 4 bugs per unit of effort)
Example of Counting the Number of Bugs
[Figure: two timelines, A and B, each spanning the v3.0, v3.1, and v3.2 releases and marking a bug introduction and a bug fix.]