INTRODUCTION TO
Machine Learning
ETHEM ALPAYDIN © The MIT Press, 2004
[email protected]
http://www.cmpe.boun.edu.tr/~ethem/i2ml
Lecture Slides for
CHAPTER 8:
Nonparametric Methods
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)
Nonparametric Estimation
Parametric (single global model), semiparametric (small number of local models)
Nonparametric: similar inputs have similar outputs
Functions (pdf, discriminant, regression) change smoothly
Keep the training data; "let the data speak for itself"
Given x, find a small number of closest training instances and interpolate from these
Aka lazy/memory-based/case-based/instance-based learning
Density Estimation
Given the training set X={xt}t drawn iid from p(x)
Divide data into bins of size h
Histogram:
$\hat{p}(x) = \frac{\#\{x^t \text{ in the same bin as } x\}}{Nh}$
Naive estimator:
$\hat{p}(x) = \frac{\#\{x - h/2 < x^t \le x + h/2\}}{Nh}$
or equivalently
$\hat{p}(x) = \frac{1}{Nh} \sum_{t=1}^{N} w\!\left(\frac{x - x^t}{h}\right)$ where $w(u) = 1$ if $|u| < 1/2$, $0$ otherwise
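As a sketch (not from the slides), the naive estimator can be implemented directly from its counting definition; the function name `naive_density` is hypothetical:

```python
import numpy as np

def naive_density(x, data, h):
    """Naive estimator: p_hat(x) = #{x - h/2 < x^t <= x + h/2} / (N h).

    Equivalent to summing the box kernel w(u) = 1 if |u| < 1/2.
    """
    data = np.asarray(data, dtype=float)
    # count training instances inside the width-h window centered at x
    count = np.sum((data > x - h / 2) & (data <= x + h / 2))
    return count / (len(data) * h)

# example: 3 of the 4 points fall in (0.2 - 0.25, 0.2 + 0.25]
p = naive_density(0.2, [0.0, 0.2, 0.4, 1.0], h=0.5)  # 3 / (4 * 0.5) = 1.5
```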
Kernel Estimator
Kernel function, e.g., Gaussian kernel:
$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left[-\frac{u^2}{2}\right]$
Kernel estimator (Parzen windows):
$\hat{p}(x) = \frac{1}{Nh} \sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right)$
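The Parzen-window estimator above is a weighted count; a minimal sketch with the Gaussian kernel (function names are illustrative, not from the slides):

```python
import numpy as np

def gaussian_kernel(u):
    # K(u) = exp(-u^2 / 2) / sqrt(2*pi)
    return np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)

def parzen_density(x, data, h):
    # p_hat(x) = (1 / (N h)) * sum_t K((x - x^t) / h)
    data = np.asarray(data, dtype=float)
    return gaussian_kernel((x - data) / h).sum() / (len(data) * h)

# a single training point at 0 gives the standard normal density at 0
p0 = parzen_density(0.0, [0.0], h=1.0)  # 1 / sqrt(2*pi)
```

Unlike the naive estimator, every instance contributes a smooth bump, so the estimate is continuous in x.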
k-Nearest Neighbor Estimator
Instead of fixing the bin width h and counting how many instances fall inside, fix the number of nearest instances (neighbors) k and let the data determine the bin width
dk(x), distance to kth closest instance to x
$\hat{p}(x) = \frac{k}{2N d_k(x)}$
Multivariate Data
Kernel density estimator:
$\hat{p}(\mathbf{x}) = \frac{1}{Nh^d} \sum_{t=1}^{N} K\!\left(\frac{\mathbf{x} - \mathbf{x}^t}{h}\right)$
Multivariate Gaussian kernel:
spheric: $K(\mathbf{u}) = \left(\frac{1}{\sqrt{2\pi}}\right)^{d} \exp\!\left[-\frac{\|\mathbf{u}\|^2}{2}\right]$
ellipsoid: $K(\mathbf{u}) = \frac{1}{(2\pi)^{d/2}\,|\mathbf{S}|^{1/2}} \exp\!\left[-\frac{1}{2}\mathbf{u}^T \mathbf{S}^{-1} \mathbf{u}\right]$
Nonparametric Classification
Estimate p(x|Ci) and use Bayes’ rule
Kernel estimator:
$\hat{p}(\mathbf{x}|C_i) = \frac{1}{N_i h^d} \sum_{t=1}^{N} K\!\left(\frac{\mathbf{x} - \mathbf{x}^t}{h}\right) r_i^t, \qquad \hat{P}(C_i) = \frac{N_i}{N}$
$g_i(\mathbf{x}) = \hat{p}(\mathbf{x}|C_i)\,\hat{P}(C_i) = \frac{1}{N h^d} \sum_{t=1}^{N} K\!\left(\frac{\mathbf{x} - \mathbf{x}^t}{h}\right) r_i^t$
k-NN estimator:
$\hat{p}(\mathbf{x}|C_i) = \frac{k_i}{N_i V^k(\mathbf{x})}, \qquad \hat{P}(C_i|\mathbf{x}) = \frac{\hat{p}(\mathbf{x}|C_i)\,\hat{P}(C_i)}{\hat{p}(\mathbf{x})} = \frac{k_i}{k}$
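Since the posterior reduces to k_i/k, k-NN classification is a majority vote among the k nearest instances. A one-dimensional sketch (names are illustrative):

```python
import numpy as np

def knn_classify(x, data, labels, k):
    """Assign x the class with the most votes (largest k_i) among k neighbors."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    idx = np.argsort(np.abs(data - x))[:k]          # indices of k nearest
    classes, counts = np.unique(labels[idx], return_counts=True)
    return classes[np.argmax(counts)]               # argmax of k_i / k

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = [0, 0, 0, 1, 1, 1]
knn_classify(0.05, data, labels, k=3)   # class 0
knn_classify(5.05, data, labels, k=3)   # class 1
```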
Condensed Nearest Neighbor
Time/space complexity of k-NN is O(N)
Find a subset Z of X that is small and is accurate in classifying X (Hart, 1968)
$E'(\mathbf{Z}|\mathbf{X}) = E(\mathbf{X}|\mathbf{Z}) + \lambda|\mathbf{Z}|$
Condensed Nearest Neighbor
Incremental algorithm: Add instance if needed
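A minimal sketch of the incremental idea, assuming 1-NN and a 1-D input (function and variable names are hypothetical): an instance is added to the subset Z only if the current Z misclassifies it, and passes repeat until no additions occur.

```python
import numpy as np

def condense(data, labels):
    """Hart-style condensing sketch: keep only instances that the
    current subset cannot already classify correctly with 1-NN."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    z = [0]                      # start with the first instance
    changed = True
    while changed:               # repeat passes until Z stops growing
        changed = False
        for i in range(len(data)):
            if i in z:
                continue
            # nearest neighbor of instance i within the current subset
            nn = z[int(np.argmin(np.abs(data[z] - data[i])))]
            if labels[nn] != labels[i]:
                z.append(i)      # misclassified: add it to Z
                changed = True
    return z

# two well-separated classes need only one prototype each
condense([0, 1, 2, 10, 11, 12], [0, 0, 0, 1, 1, 1])  # [0, 3]
```

The result depends on the order in which instances are visited, which is why the algorithm finds *a* small consistent subset, not a minimal one.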
Nonparametric Regression
Aka smoothing models
Regressogram
$\hat{g}(x) = \frac{\sum_{t=1}^{N} b(x, x^t)\, r^t}{\sum_{t=1}^{N} b(x, x^t)}$
where $b(x, x^t) = 1$ if $x^t$ is in the same bin with $x$, $0$ otherwise
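A sketch of the regressogram, assuming bins of width h anchored at an origin (the names `regressogram` and `origin` are illustrative):

```python
import numpy as np

def regressogram(x, xs, rs, h, origin=0.0):
    """g_hat(x) = average of the r^t whose x^t falls in the same width-h bin as x."""
    xs = np.asarray(xs, dtype=float)
    rs = np.asarray(rs, dtype=float)
    # b(x, x^t) = 1 iff x^t and x share a bin index
    same_bin = np.floor((xs - origin) / h) == np.floor((x - origin) / h)
    if not same_bin.any():
        return np.nan            # empty bin: estimate undefined
    return rs[same_bin].mean()

# bin [0, 0.5) holds x^t = 0.1 and 0.2, so g_hat(0.15) = (1 + 3) / 2 = 2
g = regressogram(0.15, [0.1, 0.2, 0.9], [1.0, 3.0, 10.0], h=0.5)
```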
Running Mean/Kernel Smoother
Running mean smoother:
$\hat{g}(x) = \frac{\sum_{t=1}^{N} w\!\left(\frac{x - x^t}{h}\right) r^t}{\sum_{t=1}^{N} w\!\left(\frac{x - x^t}{h}\right)}$ where $w(u) = 1$ if $|u| < 1$, $0$ otherwise
Kernel smoother, where $K(\cdot)$ is Gaussian:
$\hat{g}(x) = \frac{\sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right) r^t}{\sum_{t=1}^{N} K\!\left(\frac{x - x^t}{h}\right)}$
Running line smoother
Additive models (Hastie and Tibshirani, 1990)
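The kernel smoother is a weighted average of the outputs, with weights decaying with distance from x. A minimal sketch with a Gaussian kernel (the name `kernel_smoother` is illustrative):

```python
import numpy as np

def kernel_smoother(x, xs, rs, h):
    """g_hat(x) = sum_t K((x - x^t)/h) r^t / sum_t K((x - x^t)/h)."""
    xs = np.asarray(xs, dtype=float)
    rs = np.asarray(rs, dtype=float)
    w = np.exp(-((x - xs) / h) ** 2 / 2)   # unnormalized Gaussian weights
    return (w * rs).sum() / w.sum()        # normalization cancels the 1/sqrt(2*pi)

# symmetric data: x = 0 sits midway between r = 0 and r = 2, so g_hat = 1
g = kernel_smoother(0.0, [-1.0, 1.0], [0.0, 2.0], h=1.0)
```

Because the kernel's normalizing constant cancels between numerator and denominator, only the shape of K matters here.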
How to Choose k or h?
When k or h is small, single instances matter; bias is small, variance is large (undersmoothing): High complexity
As k or h increases, we average over more instances and variance decreases but bias increases (oversmoothing): Low complexity
Cross-validation is used to fine-tune k or h.
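One concrete way to tune h, sketched here with leave-one-out cross-validation of the kernel smoother (this particular procedure and the name `loo_cv_error` are illustrative, not prescribed by the slides):

```python
import numpy as np

def loo_cv_error(xs, rs, h):
    """Leave-one-out squared error of a Gaussian kernel smoother at bandwidth h."""
    xs = np.asarray(xs, dtype=float)
    rs = np.asarray(rs, dtype=float)
    err = 0.0
    for i in range(len(xs)):
        mask = np.arange(len(xs)) != i          # hold out instance i
        w = np.exp(-((xs[i] - xs[mask]) / h) ** 2 / 2)
        g = (w * rs[mask]).sum() / w.sum()      # predict x^i from the rest
        err += (rs[i] - g) ** 2
    return err / len(xs)

# pick h from a grid by minimizing the held-out error
xs = np.linspace(0, 1, 20)
rs = xs.copy()                                  # noiseless linear target
best_h = min([0.05, 0.1, 0.2, 0.5, 1.0, 5.0], key=lambda h: loo_cv_error(xs, rs, h))
```

An oversmoothing bandwidth (h much larger than the data range) averages everything toward the global mean and inflates the held-out error, which is exactly the bias the slide describes.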