28
Machine Learning for Data Science (CS4786) Lecture 5 Gaussian Mixture Models Course Webpage : http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Machine Learning for Data Science (CS4786)Lecture 5

Gaussian Mixture Models

Course Webpage :http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Page 2: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Clustering demo

Page 3: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Two elongated ellipses

Page 4: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Iris dataset: Flowers

Iris-Setosa Iris-versicolor Iris-virginica

Page 5: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Smiley dataset

Page 6: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Wisconsin Breast Cancer dataset

Page 7: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 8: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 9: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 10: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 11: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 12: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 13: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 14: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 15: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 16: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 17: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

Page 18: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

• Looks for spherical clusters

• Of same radius

• And with roughly equal number of points

Page 19: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

• Can we design algorithm that can address these shortcomings?

K-means: pitfalls

Page 20: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Ellipsoid

⌃ =1

|Cj |X

t2Cj

(xt � rj)(xt � rj)>

rjx

(x� rj)>⌃�1(x� rj)

Page 21: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

t

K-MEANS CLUSTERING

For all j ∈ [K], initialize cluster centroids r

0j randomly and set m = 1

Repeat until convergence (or until patience runs out)1 For each t ∈ {1, . . . ,n}, set cluster identity of the point

cm(xt) = argminj∈[K]

�xt − r

m−1j �

2 For each j ∈ [K], set new representative as

r

mj = 1�Cm

j � �xt∈Cmj

xt

3 m← m + 1

Page 22: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

ELLIPSOIDAL CLUSTERING

For all j ∈ [K], initialize cluster centroids r

0j and ellipsoids ⌃0

jrandomly and set m = 1Repeat until convergence (or until patience runs out)

1 For each t ∈ {1, . . . ,n}, set cluster identity of the point

cm(xt) = argminj∈[K]

(xt − r

m−1j )� �⌃m−1�−1 (xt − r

m−1j )

2 For each j ∈ [K], set new representative as

r

mj = 1�Cm

j � �xt∈Cmj

xt ⌃m = 1�Cj� �t∈Cj

(xt − r

mj )(xt − r

mj )�

3 m← m + 1

Page 23: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

K-means: pitfalls

• Looks for spherical clusters

• Of same radius

• And with roughly equal number of points

Page 24: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

HARD GAUSSIAN MIXTURE MODEL

For all j ∈ [K], initialize cluster centroids r

0j , ellipsoids ⌃0

j andinitial proportions ⇡0 randomly and set m = 1Repeat until convergence (or until patience runs out)

1 For each t ∈ {1, . . . ,n}, set cluster identity of the point

cm(xt) = argminj∈[K]

(xt − r

m−1j )� �⌃m−1�−1 (xt − r

m−1j ) − log(⇡m−1

j )2 For each j ∈ [K], set new representative as

r

mj = 1�Cm

j � �xt∈Cmj

xt ⌃m = 1�Cj� �t∈Cj

(xt − r

mj )(xt − r

mj )� ⇡m

j = �Cmj �

n

3 m← m + 1

Page 25: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Multivariate Gaussian• Two parameters:

• Mean

• Covariance matrix of size dxd

µ 2 Rd

p(x;µ,⌃) = (2⇡)

d/2det(⌃)

�1/2exp

✓�1

2

(x� µ)

>⌃

�1(x� µ)

Page 26: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

Multivariate Gaussian• Two parameters:

• Mean

• Covariance matrix of size dxd

µ 2 Rd

32

10

x

-1

exp(-(10 x2 +y2)/2)

-2-3-3

-2

-1

0

y

1

2

3

1

0.8

0.6

0.4

0.2

0

p(x;µ,⌃) = (2⇡)

d/2det(⌃)

�1/2exp

✓�1

2

(x� µ)

>⌃

�1(x� µ)

Page 27: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

HARD GAUSSIAN MIXTURE MODEL

For all j ∈ [K], initialize cluster centroids r

0j , ellipsoids ⌃0

j andinitial proportions ⇡0 randomly and set m = 1Repeat until convergence (or until patience runs out)

1 For each t ∈ {1, . . . ,n}, set cluster identity of the point

cm(xt) = argminj∈[K]

p(xt, rm−1j , ⌃m−1) × ⇡m(j)

2 For each j ∈ [K], set new representative as

r

mj = 1�Cm

j � �xt∈Cmj

xt ⌃m = 1�Cj� �t∈Cj

(xt − r

mj )(xt − r

mj )� ⇡m

j = �Cmj �

n

3 m← m + 1

Page 28: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of

(SOFT) GAUSSIAN MIXTURE MODEL

For all j ∈ [K], initialize cluster centroids r

0j and ellipsoids ⌃0

jrandomly and set m = 1Repeat until convergence (or until patience runs out)

1 For each t ∈ {1, . . . ,n}, set cluster identity of the point

Qmt (j) = p(xt, r

m−1j , ⌃m−1) × ⇡m(j)

2 For each j ∈ [K], set new representative as

r

mj = ∑

nt=1 Qt(j)xt∑n

t=1 Qt(j) ⌃m = ∑nt=1 Qt(j)(xt − r

mj )(xt − r

mj )�

∑nt=1 Qt(j)

⇡mj = ∑

nt=1 Qt(j)

n

3 m← m + 1