TRANSFER LEARNING AND SE
[email protected], WVU, JULY 2013

Page 1

TRANSFER LEARNING AND SE [email protected]

WVU, JULY 2013

Page 2

SOUND BITES

•  Ye olde worlde SE

•  “The” model of SE (defects, effort, etc)

•  21st century SE

•  Models (plural)
•  No generality in models
•  But, perhaps, generality in how we find those models

•  Transfer learning


Page 3


Page 4

WHAT IS TRANSFER LEARNING?

•  Source = old = Domain1 = <Eg1, P1>

•  Target = new = Domain2 = <Eg2, P2>

•  If we move from Domain1 to Domain2, do we have to start afresh?

•  Or can we learn faster in “new” …
•  … using lessons learned from “old”?

•  NSF funding (2013-2017):

•  Transfer learning in Software Engineering
•  Menzies, Layman, Shull, Diep
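A toy sketch of the <Eg1, P1> / <Eg2, P2> framing above (illustrative names only, not from the talk): each domain is a set of examples plus a prediction task, and "transfer" asks whether a target-domain learner can reuse source-domain examples rather than starting afresh.

# Toy framing of transfer as <Eg, P> pairs. All names are hypothetical;
# the stand-in "learner" just predicts the mean class value.
from typing import Callable, List, Tuple

Example = Tuple[List[float], float]   # (features, class value)

def learn(examples: List[Example]) -> Callable[[List[float]], float]:
    mean = sum(y for _, y in examples) / len(examples)
    return lambda x: mean             # stand-in model P

source_eg: List[Example] = [([1.0], 10.0), ([2.0], 12.0)]   # Domain1 = <Eg1, P1>
target_eg: List[Example] = [([1.5], 11.0)]                  # Domain2 = <Eg2, P2>

p2_fresh = learn(target_eg)                  # start afresh in "new"
p2_transfer = learn(target_eg + source_eg)   # reuse lessons from "old"
print(p2_fresh([1.5]), p2_transfer([1.5]))   # 11.0 vs 11.0 on this toy data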


Page 5

WHO CARES? (WHAT’S AT STAKE?)

•  “Transfer” is a core scientific issue

•  Lack of transfer is the scandal of SE

•  Replication in Empirical SE is rare

•  Conclusion instability
•  It all depends.

•  The full stop syndrome

•  The result?

•  A funding crisis


Page 6

MANUAL TRANSFER (WAR STORIES)

•  Brazil, SEL, 2002: need domain knowledge (but now gone)?

•  NSF, SEL, 2006: need better automatic support

•  Kitchenham, Mendes et al., TSE 2007: for = against

•  Zimmermann, FSE 2009: cross-project prediction works in 4/600 cases


Page 7

WAR STORIES (EFFORT ESTIMATION)

Effort = a * loc^x * y

•  learned using Boehm’s methods

•  20 repeats * 66% samples of NASA93
•  COCOMO attributes
•  Linear regression (log pre-processor)
•  Sort the coefficients found for each member of x, y


Page 8

WAR STORIES (DEFECT ESTIMATION)


Page 9

BUT THERE IS HOPE

•  Maybe we’ve been looking in the wrong direction

•  SE project data = surface features of an underlying effect
•  Go beneath the surface


Page 10

Focused too much on what we can see at first glance

Did not check the nuances of the hidden structure beneath


BUT THERE IS HOPE

Page 11

With new data mining technologies, the true picture emerges and we can see what is going on


BUT THERE IS HOPE

Page 12

ESEM, 2011: How to Find Relevant Data for Effort Estimation

TIM MENZIES, EKREM KOCAGUNELI

Page 13

THERE IS HOPE

•  Maybe we’ve been looking in the wrong direction

•  SE project data = surface features of an underlying effect
•  Go beneath the surface


Page 14

US DOD MILITARY PROJECTS (LAST DECADE)


You must segment to find relevant data

Page 15


DOMAIN SEGMENTATIONS


Q: What to do about rare zones?

A: Select the nearest ones from the rest. But how?

Page 16

IN THE LITERATURE: WITHIN VS CROSS = ??

BEFORE THIS WORK


Kitchenham et al., TSE 2007

•  Within-company learning (just use local data)

•  Cross-company learning (just use data from other companies)

Results mixed:
•  No clear win from cross or within

Cross vs. within are not rigid boundaries:
•  They are soft borders
•  And we can move a few examples across the border
•  And after making those moves, “cross” is the same as “local”

Page 17

SOME DATA DOES NOT DIVIDE NEATLY ON EXISTING DIMENSIONS


Page 18

THE LOCALITY(1) ASSUMPTION


Data divides best on one attribute:
1.  development centers of developers;
2.  project type, e.g. embedded, etc.;
3.  development language;
4.  application type (MIS, GNC, etc.);
5.  targeted hardware platform;
6.  in-house vs. outsourced projects;
7.  etc.

If Locality(1): hard to use data across these boundaries
•  Then harder to build effort models
•  Need to collect local data (slow)

Page 19

THE LOCALITY(N) ASSUMPTION


Data divides best on a combination of attributes.

If Locality(N):
•  Easier to use data across these boundaries
•  Relevant data spread all around
•  Little diamonds floating in the dust

Page 20

HOW TO FIND RELEVANT TRAINING DATA?


              independent attributes
              w    x    y    z   class
similar 1     0    1    1    1       2
similar 2     0    1    1    1       3
different 1   7    7    6    2       5
different 2   1    9    1    8       8
different 3   5    4    2    6      10
alien 1      74   15   73   56      20
alien 2      77   45   13    6      40
alien 3      35   99   31   21      60
alien 4      49   55   37    4      80

Use similar?

Use more variant?

Use aliens?

Page 21

VARIANCE PRUNING


              independent attributes
              w    x    y    z   class
similar 1     0    1    1    1       2
similar 2     0    1    1    1       3
different 1   7    7    6    2       5
different 2   1    9    1    8       8
different 3   5    4    2    6      10
alien 1      74   15   73   56      20
alien 2      77   45   13    6      40
alien 3      35   99   31   21      60
alien 4      49   55   37    4      80

1) Sort the clusters by “variance”
2) Prune those high-variance things
3) Estimate on the rest

“Easy path”: cull the examples that hurt the learner

PRUNE! (the aliens)
KEEP! (the rest)
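A minimal sketch of those three steps (toy clusters; the class values echo the table above):

# Variance pruning: rank clusters by the variance of their class values,
# drop the high-variance clusters, estimate from what remains.
import statistics

clusters = {
    "similar":   [2, 3],
    "different": [5, 8, 10],
    "alien":     [20, 40, 60, 80],
}

# 1) sort clusters by variance of the class variable
ranked = sorted(clusters.items(), key=lambda kv: statistics.variance(kv[1]))

# 2) prune the highest-variance half
keep = ranked[: max(1, len(ranked) // 2)]
print("kept:", [name for name, _ in keep])        # -> ['similar']

# 3) estimate on the rest (here: median of the surviving class values)
survivors = [v for _, vals in keep for v in vals]
print("estimate:", statistics.median(survivors))  # -> 2.5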

Page 22

TEAK: CLUSTERING + VARIANCE PRUNING (TSE, JAN 2011)


•  TEAK is a variance-based instance selector
•  It is built via GAC trees
•  TEAK is a two-pass system
•  First pass selects low-variance relevant projects
•  Second pass retrieves projects to estimate from
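Not the published TEAK algorithm (which builds GAC trees); below is a much-simplified two-pass selector in the same spirit: pass one keeps the low-variance regions of the training data, pass two estimates a test case from its nearest surviving neighbors.

import math
import statistics

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def two_pass_estimate(train, test_x, k=3):
    """train: list of (features, effort) pairs."""
    # pass 1: pair each project with its nearest neighbor and keep
    # only the pairs whose efforts have below-median variance
    scored = []
    for i, (x, y) in enumerate(train):
        j = min((j for j in range(len(train)) if j != i),
                key=lambda j: dist(x, train[j][0]))
        scored.append((i, j, statistics.variance([y, train[j][1]])))
    cutoff = statistics.median(v for _, _, v in scored)
    keep = set()
    for i, j, v in scored:
        if v <= cutoff:
            keep.update((i, j))
    survivors = [train[i] for i in sorted(keep)]
    # pass 2: estimate from the k nearest survivors
    survivors.sort(key=lambda t: dist(test_x, t[0]))
    return statistics.median(y for _, y in survivors[:k])

train = [([1, 1], 10), ([1, 2], 11), ([9, 9], 90), ([9, 8], 15), ([5, 5], 40)]
print(two_pass_estimate(train, [1, 1.5], k=2))   # -> 10.5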

Page 23

ESSENTIAL POINT


TEAK finds local regions important to the estimation of particular cases

TEAK finds those regions via locality(N)

•  Not locality(1)

Page 24

WITHIN AND CROSS DATASETS


Note: all Locality(1) divisions

Page 25

EXPERIMENT 1: PERFORMANCE COMPARISON OF WITHIN AND CROSS-SOURCE DATA


•  TEAK on within & cross data for each dataset group (lines separate groups)
•  LOOCV used for runs
•  20 runs performed for each treatment
•  Results evaluated w.r.t. MAR, MMRE, MdMRE and Pred(30) (sketched below); but see http://goo.gl/6q0tw

•  If within data outperforms cross, the dataset is highlighted with gray

•  Note: only 2 datasets are highlighted
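A sketch of those evaluation measures, using their standard definitions and toy numbers:

# MAR = mean absolute residual; MRE = |actual - predicted| / actual;
# MMRE/MdMRE = mean/median MRE; Pred(30) = fraction with MRE <= 0.30.
import statistics

def measures(actual, predicted):
    ar = [abs(a - p) for a, p in zip(actual, predicted)]
    mre = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return {
        "MAR": statistics.mean(ar),
        "MMRE": statistics.mean(mre),
        "MdMRE": statistics.median(mre),
        "Pred(30)": sum(m <= 0.30 for m in mre) / len(mre),
    }

print(measures([100, 200, 50], [120, 150, 55]))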

Page 26

EXPERIMENT 2: RETRIEVAL TENDENCY OF TEAK FROM WITHIN AND CROSS-SOURCE DATA


Page 27

EXPERIMENT 2: RETRIEVAL TENDENCY OF TEAK FROM WITHIN AND CROSS-SOURCE DATA


Diagonal (WC) vs. Off-Diagonal (CC) selection percentages sorted

Percentiles of diagonals and off-diagonals

Page 28

HIGHLIGHTS


1.  Don’t listen to everyone
•  When listening to a crowd, first filter the noise

2.  Once the noise clears: bits of me are similar to bits of you
•  Probability of selecting cross or within instances is the same

3.  Cross-vs-within is not a useful distinction
•  Locality(1) not informative
•  Enables “cross-company” learning

Page 29

SO, THERE IS HOPE

•  Maybe we’ve been looking in the wrong direction

•  SE project data = surface features of an underlying effect
•  Go beneath the surface

•  Assuming locality(N), not locality(1)

•  No cross-, no within-
•  It’s all data we can learn from


Page 30

TSE, 2013: LOCAL VS. GLOBAL MODELS FOR EFFORT ESTIMATION AND DEFECT PREDICTION

TIM MENZIES, ANDREW BUTCHER (WVU), ANDRIAN MARCUS (WAYNE STATE), THOMAS ZIMMERMANN (MICROSOFT), DAVID COK (GRAMMATECH)

Page 31

Do not focus on what we can see at first glance

Check the nuances of the hidden structure beneath


THERE IS HOPE

Page 32


Cluster, then learn (using envy)

Page 33

•  Seek the fence where the grass is greener on the other side.

•  Learn from there

•  Test on here

•  Cluster to find “here” and “there”


ENVY = THE WISDOM OF THE COWS

Page 34


@attribute recordnumber real
@attribute projectname {de,erb,gal,X,hst,slp,spl,Y}
@attribute cat2 {Avionics, application_ground, avionicsmonitoring, … }
@attribute center {1,2,3,4,5,6}
@attribute year real
@attribute mode {embedded,organic,semidetached}
@attribute rely {vl,l,n,h,vh,xh}
@attribute data {vl,l,n,h,vh,xh}
…
@attribute equivphyskloc real
@attribute act_effort real
@data
1,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,25.9,117.6
2,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,24.6,117.6
3,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,7.7,31.2
4,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,8.2,36
5,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,9.7,25.2
6,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,2.2,8.4
….

DATA = MULTI-DIMENSIONAL VECTORS

Page 35

CAUTION: DATA MAY NOT DIVIDE NEATLY ON RAW DIMENSIONS

The best description for SE projects may be synthesized dimensions extracted from the raw dimensions


Page 36

FASTMAP


Fastmap: Faloutsos [1995], O(2N) generation of an axis of large variability:

•  Pick any point W
•  Find X, furthest from W
•  Find Y, furthest from X

c = dist(X,Y). All points have distances a, b to (X, Y):

•  x = (a^2 + c^2 - b^2) / (2c)
•  y = sqrt(a^2 - x^2)

Find median(x), median(y). Recurse on the four quadrants.
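A minimal sketch of that recipe (assumed Euclidean distance; toy points):

# FastMap projection onto the (x, y) plane defined by two far-apart pivots.
import math
import random

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fastmap_xy(rows):
    w = random.choice(rows)                          # pick any point W
    x_piv = max(rows, key=lambda r: dist(r, w))      # X: furthest from W
    y_piv = max(rows, key=lambda r: dist(r, x_piv))  # Y: furthest from X
    c = dist(x_piv, y_piv)
    points = []
    for r in rows:
        a, b = dist(r, x_piv), dist(r, y_piv)
        x = (a ** 2 + c ** 2 - b ** 2) / (2 * c)     # cosine-rule projection
        y = math.sqrt(max(0.0, a ** 2 - x ** 2))     # distance off the axis
        points.append((x, y))
    return points

random.seed(1)
rows = [tuple(random.random() for _ in range(5)) for _ in range(8)]
print(fastmap_xy(rows)[:3])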

Page 37

HIERARCHICAL PARTITIONING

Grow:
•  Find two orthogonal dimensions
•  Find median(x), median(y)
•  Recurse on four quadrants

Prune:
•  Combine quadtree leaves with similar densities
•  Score each cluster by the median score of the class variable
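A sketch of the Grow step (assumed representation: items are ((x, y), class_value) pairs after the FastMap projection), plus the cluster scoring used by Prune:

import random
import statistics

def grow(items, min_size=4):
    """Recursively median-split (x, y) points into quadtree leaves."""
    if len(items) <= min_size:
        return [items]
    xs = [x for (x, _), _ in items]
    ys = [y for (_, y), _ in items]
    mx, my = statistics.median(xs), statistics.median(ys)
    quads = [[], [], [], []]
    for item in items:
        (x, y), _ = item
        quads[(x > mx) + 2 * (y > my)].append(item)
    if max(len(q) for q in quads) == len(items):
        return [items]          # degenerate split: stop recursing
    return [leaf for q in quads if q for leaf in grow(q, min_size)]

def score(leaf):
    """Score a cluster by the median of its class variable."""
    return statistics.median(v for _, v in leaf)

random.seed(1)
items = [((random.random(), random.random()), random.random()) for _ in range(32)]
leaves = grow(items)
print(len(leaves), [round(score(leaf), 2) for leaf in leaves])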

Page 38


Learning via “envy”

Page 39

•  Seek the fence where the grass is greener on the other side.

•  Learn from there

•  Test on here

•  Cluster to find “here” and “there”


ENVY = THE WISDOM OF THE COWS

Page 40

HIERARCHICAL PARTITIONING

Grow:
•  Find two orthogonal dimensions
•  Find median(x), median(y)
•  Recurse on four quadrants

Prune:
•  Combine quadtree leaves with similar densities
•  Score each cluster by the median score of the class variable

Page 41

HIERARCHICAL PARTITIONING

Grow:
•  Find two orthogonal dimensions
•  Find median(x), median(y)
•  Recurse on four quadrants

Prune:
•  Combine quadtree leaves with similar densities
•  Score each cluster by the median score of the class variable

Where is the grass greenest? This cluster envies its neighbor with a better score and max abs(score(this) - score(neighbor)).
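A sketch of that envy rule (hypothetical cluster names; lower scores assumed better, as with effort or defects):

# Each cluster "envies" the neighboring cluster whose score is better
# by the largest margin; rules are then learned "there", tested "here".
def envied_neighbor(cluster, scores, neighbors):
    better = [n for n in neighbors[cluster] if scores[n] < scores[cluster]]
    if not better:
        return None                       # grass is greenest right here
    return max(better, key=lambda n: abs(scores[cluster] - scores[n]))

scores = {"c1": 10.0, "c2": 4.0, "c3": 7.0}
neighbors = {"c1": ["c2", "c3"], "c2": ["c1"], "c3": ["c1", "c2"]}
print(envied_neighbor("c1", scores, neighbors))   # -> 'c2'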

Page 42

Q: HOW TO LEARN RULES FROM NEIGHBORING CLUSTERS?

A: It doesn’t really matter:
•  Many competent rule learners

But to evaluate global vs. local rules:
•  Use the same rule learner for local and global rule learning

This study uses WHICH (Menzies [2010]):
•  Customizable scoring operator
•  Faster termination
•  Generates very small rules (good for explanation)
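Not the real WHICH; below is a much-simplified stand-in showing its key idea of a pluggable, customizable scoring operator driving the search for small rules:

import random
import statistics

def matches(rule, row):
    # a rule is a set of (attribute, value) tests; all must hold
    return all(row.get(attr) == val for attr, val in rule)

def learn_rule(rows, candidates, score, rounds=100):
    best, best_score = frozenset(), score(rows)
    for _ in range(rounds):
        # try a random small rule; keep it if the treated subset scores better
        rule = frozenset(random.sample(candidates, random.randint(1, 2)))
        treated = [r for r in rows if matches(rule, r)]
        if treated and score(treated) < best_score:
            best, best_score = rule, score(treated)
    return best, best_score

rows = [{"rely": "h", "data": "l", "effort": e} for e in (5, 7, 30)] + \
       [{"rely": "l", "data": "n", "effort": e} for e in (40, 60)]
candidates = [("rely", "h"), ("rely", "l"), ("data", "l"), ("data", "n")]
# the customizable bit: here, score a treated subset by its median effort
score = lambda rs: statistics.median(r["effort"] for r in rs)
print(learn_rule(rows, candidates, score))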


Page 43

DATA FROM HTTP://PROMISEDATA.ORG/DATA

Effort reduction = {NasaCoc, China}: COCOMO or function points

Defect reduction = {lucene, xalan, jedit, synapse, etc.}: CK metrics (OO)

Clusters have an untreated class distribution.

Rules select a subset of the examples:
•  generate a treated class distribution

[Chart: class-value distributions as percentiles (25th, 50th, 75th, 100th; x-axis 0-100), comparing untreated data, “global” (treated with rules learned from all data), and “local” (treated with rules learned from the neighboring cluster).]

Page 44

Lower median efforts/defects (50th percentile)

Greater stability (75th – 25th percentile)

Decreased worst case (100th percentile)

BY ANY MEASURE, LOCAL BETTER THAN GLOBAL


Page 45

RULES LEARNED IN EACH CLUSTER

What works best “here” does not work “there”

•  Misguided to try and tame conclusion instability
•  Inherent in the data

Can’t tame conclusion instability.

•  Instead, you can exploit it
•  Learn local lessons that do better than overly generalized global theories



Page 47

Do not focus on what we can see at first glance

Check the nuances of the structures within our data

•  Cluster, then envy


SO THERE IS HOPE

Page 48


Conclusion

Page 49

LACK OF TRANSFER = THE GREAT SCANDAL OF SE

•  Replication in Empirical SE is rare

•  Conclusion instability

•  “It all depends.” is not good enough

•  A funding crisis


Page 50

BUT THERE IS HOPE

•  Maybe we’ve been looking in the wrong direction

•  SE project data = surface features of an underlying effect
•  Go beneath the surface

•  Assuming locality(N), not locality(1)

•  No cross-, no within-
•  It’s all data we can learn from


Page 51

Do not focus on what we can see at first glance

Check the nuances of the structures within our data

•  Cluster, then envy


BUT THERE IS HOPE

Page 52

With new data mining technologies, the true picture emerges and we can see what is going on


BUT THERE IS HOPE

Page 53
