Machine Learning: Nearest Neighbor Classification

Source: svivek.com/teaching/machine-learning/fall2018/...


Page 1: Machine Learning

Nearest Neighbor Classification

Page 2: This lecture

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects

• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?

• The Curse of Dimensionality


Page 4: How would you color the blank circles?

(Figure: a scatter of colored points with three blank circles labeled A, B, and C.)

Page 5: How would you color the blank circles?

If we based it on the color of their nearest neighbors, we would get:
A: Blue
B: Red
C: Red

Page 6: Training data partitions the entire instance space (using labels of nearest neighbors)

Page 7: Nearest Neighbors: The basic version

• Training examples are vectors xi associated with a label yi
  – E.g. xi = a feature vector for an email, yi = SPAM

• Learning: Just store all the training examples

• Prediction for a new example x
  – Find the training example xi that is closest to x
  – Predict the label of x to be the label yi associated with xi


Page 10: K-Nearest Neighbors

• Training examples are vectors xi associated with a label yi
  – E.g. xi = a feature vector for an email, yi = SPAM

• Learning: Just store all the training examples

• Prediction for a new example x
  – Find the k closest training examples to x
  – Construct the label of x using these k points. How?
  – For classification: Every neighbor votes on the label. Predict the most frequent label among the neighbors.
  – For regression: Predict the mean value
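The prediction rules above can be sketched in code. This is a minimal illustration, not the lecture's own implementation; the function names are my own.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training examples."""
    # "Learning" already happened: X_train and y_train are simply stored.
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every stored example
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)  # every neighbor votes on the label
    return votes.most_common(1)[0][0]             # most frequent label wins

def knn_regress(X_train, y_train, x, k=3):
    """For regression: predict the mean value of the k nearest neighbors."""
    nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return float(np.mean([y_train[i] for i in nearest]))
```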

Page 11: Instance based learning

• A class of learning methods
  – Learning: Storing examples with labels
  – Prediction: When presented a new example, predict its label using similar stored examples

• The K-nearest neighbors algorithm is an example of this class of methods

• Also called lazy learning, because most of the computation (in the simplest case, all computation) is performed only at prediction time

Questions?

Page 12: Distance between instances

• In general, a good place to inject knowledge about the domain

• The behavior of this approach can depend strongly on the choice of distance

• How do we measure distances between instances?


Page 14: Distance between instances

Numeric features, represented as n-dimensional vectors:

– Euclidean distance: d(x, z) = sqrt( Σⱼ (xⱼ − zⱼ)² )

– Manhattan distance: d(x, z) = Σⱼ |xⱼ − zⱼ|

– Lp-norm: d(x, z) = ( Σⱼ |xⱼ − zⱼ|ᵖ )^(1/p)
  • Euclidean = L2
  • Manhattan = L1
  • Exercise: What is L∞?
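These definitions are easy to check numerically. A quick sketch (the helper name `lp_distance` is illustrative, not from the slides):

```python
import numpy as np

def lp_distance(x, z, p=2):
    """L_p distance: (sum_j |x_j - z_j|^p)^(1/p). p=2 is Euclidean, p=1 Manhattan."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(z)) ** p) ** (1.0 / p))

x, z = [0.0, 0.0], [3.0, 4.0]
print(lp_distance(x, z, p=2))  # Euclidean (L2) distance: 5.0
print(lp_distance(x, z, p=1))  # Manhattan (L1) distance: 7.0
# As p grows, L_p approaches the L-infinity distance, max_j |x_j - z_j| = 4.0 here.
```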


Page 17: Distance between instances

What about symbolic / categorical features?

Page 18: Distance between instances

Symbolic / categorical features

The most common distance is the Hamming distance:
– The number of bits that are different
– Or: the number of features that have a different value
– Also called the overlap
– Example:

X1: {Shape = Triangle, Color = Red, Location = Left, Orientation = Up}
X2: {Shape = Triangle, Color = Blue, Location = Left, Orientation = Down}

Hamming distance = 2
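The example above in code (a minimal sketch; the helper name is mine):

```python
def hamming_distance(x1, x2):
    """Number of features on which two examples disagree (the overlap)."""
    return sum(x1[k] != x2[k] for k in x1)

x1 = {"Shape": "Triangle", "Color": "Red", "Location": "Left", "Orientation": "Up"}
x2 = {"Shape": "Triangle", "Color": "Blue", "Location": "Left", "Orientation": "Down"}

print(hamming_distance(x1, x2))  # Color and Orientation differ -> 2
```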

Page 19: Advantages

• Training is very fast
  – Just adding labeled instances to a list
  – More complex indexing methods can be used, which slow down learning slightly to make prediction faster

• Can learn very complex functions

• We always have the training data
  – For other learning algorithms, after training, we don't store the data anymore. What if we want to do something with it later…

Page 20: Disadvantages

• Needs a lot of storage
  – Is this really a problem now?

• Prediction can be slow!
  – Naïvely: O(dN) for N training examples in d dimensions
  – More data will make it slower
  – Compare to other classifiers, where prediction is very fast

• Nearest neighbors are fooled by irrelevant attributes
  – Important and subtle

Questions?

Page 21: Summary: K-Nearest Neighbors

• Probably the first "machine learning" algorithm
  – Guarantee: If there are enough training examples, the error of the nearest neighbor classifier will converge to the error of the optimal (i.e. best possible) predictor

• In practice, use an odd K. Why?
  – To break ties

• How to choose K? Using a held-out set or by cross-validation

• Feature normalization could be important
  – Often a good idea to center the features to make them zero mean and unit standard deviation. Why?
  – Because different features could have different scales (weight, height, etc.); but the distance weights them equally

• Variants exist
  – Neighbors' labels could be weighted by their distance
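Choosing K on a held-out set, as suggested above, can be sketched like this (the names `knn` and `choose_k` are illustrative, not from the slides):

```python
from collections import Counter

import numpy as np

def knn(X, y, q, k):
    """Majority vote among the k nearest stored examples (Euclidean distance)."""
    idx = np.argsort(np.linalg.norm(X - q, axis=1))[:k]
    return Counter(y[i] for i in idx).most_common(1)[0][0]

def choose_k(X_tr, y_tr, X_val, y_val, candidates=(1, 3, 5, 7)):
    """Return the (odd) K with the highest accuracy on the held-out set."""
    def accuracy(k):
        return np.mean([knn(X_tr, y_tr, q, k) == t for q, t in zip(X_val, y_val)])
    return max(candidates, key=accuracy)
```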


Page 26: Where are we?

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects

• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?

• The Curse of Dimensionality



Page 29: The decision boundary for KNN

Is the K nearest neighbors algorithm explicitly building a function?
– No, it never forms an explicit hypothesis

But we can still ask: given a training set, what is the implicit function that is being computed?

Page 30: The Voronoi Diagram

For any point x in a training set S, the Voronoi cell of x is a polyhedron consisting of all points closer to x than to any other point in S.

The Voronoi diagram is the union of all Voronoi cells
• Covers the entire space


Page 35: Voronoi diagrams of training examples

Points in the Voronoi cell of a training example are closer to it than to any other training example.

(The picture uses Euclidean distance with 1-nearest neighbors.)

What about K-nearest neighbors? It also partitions the space, but with a much more complex decision boundary.

What about points on the boundary? What label will they get?


Page 37: Exercise

If you have only two training points, what will the decision boundary for 1-nearest neighbor be?
– The perpendicular bisector of the segment connecting the two points

Page 38: This lecture

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects

• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?

• The Curse of Dimensionality

Page 39: Why your classifier might go wrong

Two important considerations with learning algorithms:

• Overfitting: We have already seen this

• The curse of dimensionality
  – Methods that work with low dimensional spaces may fail in high dimensions
  – What might be intuitive for 2 or 3 dimensions does not always apply to high dimensional spaces

Check out the 1884 book Flatland: A Romance of Many Dimensions for a fun introduction to the fourth dimension


Page 41: Of course, irrelevant attributes will hurt

Suppose we have 1000 dimensional feature vectors
– But only 10 features are relevant
– Distances will be dominated by the large number of irrelevant features

But even with only relevant attributes, high dimensional spaces behave in odd ways
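A quick simulation of this effect (the numbers and variable names are illustrative): two points that agree on all 10 relevant features end up roughly as far apart as two points that genuinely differ, once 990 irrelevant noise features are added.

```python
import numpy as np

rng = np.random.default_rng(0)

relevant = rng.normal(size=10)
# a and b agree on all 10 relevant features; they differ only in 990 irrelevant ones.
a = np.concatenate([relevant, rng.normal(size=990)])
b = np.concatenate([relevant, rng.normal(size=990)])
# c genuinely differs from a on every relevant feature.
c = np.concatenate([relevant + 5.0, rng.normal(size=990)])

d_same = np.linalg.norm(a - b)  # large, driven entirely by the irrelevant features
d_diff = np.linalg.norm(a - c)  # not much larger, despite the real difference
print(d_same, d_diff)
```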

Page 42: The Curse of Dimensionality

Intuitions that are based on 2 or 3 dimensional spaces do not always carry over to high dimensional spaces.

Example 1: What fraction of the points in a cube lie outside the sphere inscribed in it?

In two dimensions: what fraction of the square (i.e. the cube) is outside the inscribed circle (i.e. the sphere)? (Figure: a circle of radius r inscribed in a square of side 2r. The answer is 1 − π/4 ≈ 0.21.)


Page 46: The Curse of Dimensionality

Example 1: What fraction of the points in a cube lie outside the sphere inscribed in it?

In three dimensions: what fraction of the cube is outside the inscribed sphere? (Answer: 1 − π/6 ≈ 0.48.)

But distances do not behave the same way in high dimensions. As the dimensionality increases, this fraction approaches 1!

In high dimensions, most of the volume of the cube is far away from the center!
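This fraction can be computed exactly from the standard volume formula for a d-dimensional ball (a sketch; the function name is mine):

```python
from math import gamma, pi

def fraction_outside_inscribed_sphere(d):
    """Fraction of a d-dimensional unit cube's volume outside its inscribed sphere.

    A ball of radius r has volume pi^(d/2) r^d / Gamma(d/2 + 1); the sphere
    inscribed in the unit cube has radius r = 1/2.
    """
    inside = pi ** (d / 2) * 0.5 ** d / gamma(d / 2 + 1)
    return 1.0 - inside

print(fraction_outside_inscribed_sphere(2))   # 1 - pi/4, about 0.215
print(fraction_outside_inscribed_sphere(3))   # 1 - pi/6, about 0.476
print(fraction_outside_inscribed_sphere(20))  # already greater than 0.99
```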


Page 52: The Curse of Dimensionality

Example 2: What fraction of the volume of a unit sphere lies between radius 1 − ε and radius 1?

In two dimensions: what fraction of the area of the circle is in the outer ring between radius 1 − ε and radius 1?

In d dimensions, the fraction is 1 − (1 − ε)^d. As d increases, this fraction goes to 1!

In high dimensions, most of the volume of the sphere is far away from the center!

Questions?
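The formula follows because a ball of radius r has volume proportional to r^d, so the inner ball of radius 1 − ε holds a (1 − ε)^d fraction of the volume. A quick check:

```python
# Fraction of a unit ball's volume in the thin outer shell between radius
# 1 - eps and radius 1: volume scales as r^d, so the fraction is 1 - (1 - eps)**d.
eps = 0.01
for d in (2, 3, 100, 1000):
    print(d, 1 - (1 - eps) ** d)  # grows toward 1 as d increases
```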

Page 53: The Curse of Dimensionality

• Most of the points in high dimensional spaces are far away from the origin!
  – In 2 or 3 dimensions, most points are near the center
  – Need more data to "fill up the space"

• Bad news for nearest neighbor classification in high dimensional spaces
  – Even if most/all features are relevant, in high dimensional spaces, most points are equally far from each other!
  – The "neighborhood" becomes very large
  – Presents computational problems too

Page 54: Dealing with the curse of dimensionality

• Most "real-world" data is not uniformly distributed in the high dimensional space
  – Different ways of capturing the underlying dimensionality of the space
  – E.g.: dimensionality reduction techniques, manifold learning

• Feature selection is an art
  – Different methods exist
  – Select features, maybe by information gain
  – Try out different feature sets of different sizes and pick a good set based on a validation set

• Prior knowledge or preferences about the hypotheses can also help

Questions?

Page 55: Summary: Nearest neighbors classification

• Probably the oldest and simplest learning algorithm
  – Prediction is expensive

• Efficient data structures help. k-d trees: the most popular; they work well in low dimensions

• Approximate nearest neighbors may be good enough sometimes. Hashing based algorithms exist

• Requires a distance measure between instances
  – Metric learning: Learn the "right" distance for your problem

• Partitions the space into a Voronoi Diagram

• Beware the curse of dimensionality

Questions?


Page 57: Exercises

1. What will happen when you choose K to be the number of training examples?
   – The label will always be the most common label in the training data

2. Suppose you want to build a nearest neighbors classifier to predict whether a beverage is a coffee or a tea using two features: the volume of the liquid (in milliliters) and the caffeine content (in grams). You collect the following data:

   Volume (ml)   Caffeine (g)   Label
   238           0.026          Tea
   100           0.011          Tea
   120           0.040          Coffee
   237           0.095          Coffee

   What is the label for a test point with Volume = 120, Caffeine = 0.013?
   – Coffee

   Why might this be incorrect?
   – Because Volume will dominate the distance

   How would you fix the problem?
   – Rescale the features, maybe to zero mean, unit variance
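The exercise can be checked numerically with the table's data (a minimal sketch; the helper name is illustrative):

```python
import numpy as np

# Training data from the exercise: [volume_ml, caffeine_g] and labels.
X = np.array([[238, 0.026], [100, 0.011], [120, 0.040], [237, 0.095]])
y = np.array(["Tea", "Tea", "Coffee", "Coffee"])
test = np.array([120, 0.013])

def nn_label(X, y, q):
    """Label of the single nearest training point under Euclidean distance."""
    dists = np.linalg.norm(X - q, axis=1)
    return y[np.argmin(dists)]

print(nn_label(X, y, test))  # raw features: volume dominates -> "Coffee"

# Rescale each feature to zero mean, unit variance, and retry.
mu, sigma = X.mean(axis=0), X.std(axis=0)
print(nn_label((X - mu) / sigma, y, (test - mu) / sigma))  # -> "Tea"
```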