
ICSM 2011: You Can't Control the Unfamiliar


Page 1: Icsm 2011 you can't control the unfamiliar

12-04-2023/ W&I / MDSE PAGE 1

Metrics are usually computed at a low level: classes, methods, …

Page 2: Icsm 2011 you can't control the unfamiliar

A multitude of data values obscures the general picture of system maintainability

Page 3: Icsm 2011 you can't control the unfamiliar


That we are actually interested in!


Page 4: Icsm 2011 you can't control the unfamiliar

You Can't Control the Unfamiliar: A Study on the Relations Between Aggregation Techniques for Software Metrics

Bogdan Vasilescu

Alexander Serebrenik

Mark van den Brand

Page 5: Icsm 2011 you can't control the unfamiliar


Two kinds of aggregation

Same artifact, different metrics

Same metrics, different artifacts


Page 6: Icsm 2011 you can't control the unfamiliar


Various techniques can be found in the literature

Same metrics, different artifacts


Traditional: mean, median, sum, …

Econometric inequality indices: Gini, Theil, Hoover, Kolm, Atkinson
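
As a rough illustration of the two families listed above, the following Python sketch computes a few traditional aggregates and the five inequality indices for the class-level SLOC values of a single package. The formulas are common textbook definitions; the parameter choices (Atkinson with ε = 1, Kolm with α = 1) and the sample values are assumptions made for this example, not settings taken from the slides.

```python
import math
from statistics import mean, median

def gini(xs):
    # Mean absolute difference over all ordered pairs, normalized by 2 * mean.
    n, mu = len(xs), mean(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def theil(xs):
    # Theil index; terms with x == 0 contribute 0 by convention.
    mu = mean(xs)
    return sum((x / mu) * math.log(x / mu) for x in xs if x > 0) / len(xs)

def hoover(xs):
    # Share of the total that would have to be redistributed to reach equality.
    mu = mean(xs)
    return sum(abs(x - mu) for x in xs) / (2 * len(xs) * mu)

def atkinson(xs):
    # Atkinson index for epsilon = 1 (assumed): 1 - geometric mean / mean.
    # Requires strictly positive values.
    geometric_mean = math.exp(sum(math.log(x) for x in xs) / len(xs))
    return 1 - geometric_mean / mean(xs)

def kolm(xs, alpha=1.0):
    # Kolm index (alpha = 1 assumed), computed with a log-sum-exp shift
    # so that large SLOC values do not overflow math.exp.
    mu = mean(xs)
    shifted = [alpha * (mu - x) for x in xs]
    top = max(shifted)
    log_avg = top + math.log(sum(math.exp(s - top) for s in shifted) / len(xs))
    return log_avg / alpha

# Hypothetical class-level SLOC values of one package.
class_sloc = [120, 80, 40, 300, 60]

print("mean  ", mean(class_sloc))
print("median", median(class_sloc))
print("sum   ", sum(class_sloc))
for index in (gini, theil, hoover, atkinson, kolm):
    print(index.__name__, index(class_sloc))
```

Gini, Theil, Hoover and Atkinson are relative indices (scaling all values leaves them unchanged), whereas Kolm is an absolute index measured in the units of the metric itself.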

Page 7: Icsm 2011 you can't control the unfamiliar

Which aggregation technique should we use?

Page 8: Icsm 2011 you can't control the unfamiliar


Questions

1. Which aggregation techniques agree, and to what extent?

2. What is the nature of the relation between the various aggregation techniques?

3. How does the correlation coefficient change as the systems evolve?


Page 9: Icsm 2011 you can't control the unfamiliar

Qualitas Corpus 20101126r

• 106 systems
• FitJava v1.1: 2 packages, 2240 SLOC
• NetBeans v6.9.1: 3373 packages, 1890536 SLOC

Page 10: Icsm 2011 you can't control the unfamiliar

1) Agreement between different techniques

• Agreement:
  • Aggregation: class SLOC → package
  • Techniques agree if they rank the packages similarly

We use a rank-based correlation coefficient: Kendall's τ
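
A minimal sketch of this notion of agreement, assuming made-up package data: class-level SLOC is aggregated per package with two techniques (mean and Gini here), and the two resulting package rankings are compared with Kendall's rank correlation. The package names and values are hypothetical, and the tie-corrected tau-b variant is an assumption of this example; the slides do not say which variant was used.

```python
import math
from itertools import combinations
from statistics import mean

def gini(xs):
    n, mu = len(xs), mean(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def kendall_tau_b(xs, ys):
    # Count concordant and discordant pairs; pairs tied in x (or y) are
    # excluded from the corresponding normalization term (tau-b).
    concordant = discordant = untied_x = untied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx != 0:
            untied_x += 1
        if dy != 0:
            untied_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    denom = math.sqrt(untied_x * untied_y)
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical class-level SLOC values per package (illustrative only).
packages = {
    "pkg.a": [10, 12, 11],
    "pkg.b": [5, 200, 7],
    "pkg.c": [40, 45, 300, 50],
    "pkg.d": [100, 100],
}

by_mean = [mean(v) for v in packages.values()]
by_gini = [gini(v) for v in packages.values()]
print("tau(mean, Gini) =", kendall_tau_b(by_mean, by_gini))
```

A value near 1 means the two techniques order the packages almost identically, a value near -1 means they order them in reverse, and a value near 0 means the rankings are unrelated.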

Page 11: Icsm 2011 you can't control the unfamiliar

1) Agreement: different inequality indices?

• Gini, Theil, Hoover, Atkinson agree
  • the aggregates obtained convey the same information
• Kolm does not!

Page 12: Icsm 2011 you can't control the unfamiliar

1) Agreement: traditional and inequality indices?

• mean
  • Kolm: strong (0.8) and statistically significant (92%)
  • median, standard deviation, and variance
• sum
  • does not correlate with any other aggregation technique

Page 13: Icsm 2011 you can't control the unfamiliar


2) Nature of the relation: Typical patterns

• Theil is known to be more sensitive to the rich

• Theil increases faster when Gini increases


• Linear relation with a “fat” head
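
To make the sensitivity of Theil to large values a little more concrete, the sketch below tracks Gini and Theil for a synthetic package in which one class keeps growing while the others stay small. The data are invented for this example and say nothing about the systems studied; the code only shows how the two indices can be compared on one series.

```python
import math

def gini(xs):
    n, mu = len(xs), sum(xs) / len(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def theil(xs):
    mu = sum(xs) / len(xs)
    return sum((x / mu) * math.log(x / mu) for x in xs if x > 0) / len(xs)

# Hypothetical package: four classes of 10 SLOC plus one class of growing size.
outlier_sizes = [10, 50, 200, 1000]
series = []
for size in outlier_sizes:
    values = [10, 10, 10, 10, size]
    series.append((gini(values), theil(values)))

for size, (g, t) in zip(outlier_sizes, series):
    print(f"largest class = {size:5d} SLOC  Gini = {g:.3f}  Theil = {t:.3f}")
```

In this series Gini is bounded by 1 - 1/n while Theil is bounded by ln n, so once the largest class dominates, further growth shows up more strongly in Theil than in Gini.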

Page 14: Icsm 2011 you can't control the unfamiliar

Which aggregation technique? (1)

• Theil, Hoover, Gini and Atkinson agree
  • any can be chosen from the correlation point of view
• Some might be “better” in each specific case
  • easy to interpret: Gini [0,1]
  • provide additional insights: Theil (explanation)
  • negative values: Gini, Hoover
    − affects the domain!
  • sensitive to high values: Theil, Atkinson
  • deviations from uniformity: Gini, Hoover
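
One reading of the remark about negative values is that Gini and Hoover can still be computed when a metric takes negative values, but the result is no longer confined to [0,1]. The sketch below illustrates this with invented numbers, using the common textbook definitions of both indices.

```python
def gini(xs):
    n, mu = len(xs), sum(xs) / len(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def hoover(xs):
    n, mu = len(xs), sum(xs) / len(xs)
    return sum(abs(x - mu) for x in xs) / (2 * n * mu)

non_negative = [1, 2, 3]      # hypothetical metric values, all >= 0
with_negative = [-2, 1, 2]    # hypothetical metric values, one negative

for label, values in (("non-negative", non_negative),
                      ("with negative", with_negative)):
    print(label, "Gini =", gini(values), "Hoover =", hoover(values))
```

With the negative value present the mean shrinks toward zero, so both normalized indices leave the [0,1] range and lose their usual interpretation.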

Page 15: Icsm 2011 you can't control the unfamiliar

Which aggregation technique? (2)

• Kolm and mean agree
  • Kolm is reliable for skewed distributions
    − better alternative (“by no means”)
• Not in the paper:
    − agreement observed for NOC
    − but not for DIT!

Page 16: Icsm 2011 you can't control the unfamiliar


Conclusions
