
A short tutorial on Gaussian Mixture Models

CRV 2010

By: Mohand Saïd Allili, Université du Québec en Outaouais


Plan

- Introduction

- What is a Gaussian mixture model?

- The Expectation-Maximization algorithm

- Some issues

- Applications of GMM in computer vision


Introduction

A few basic equalities that are often used:

• Conditional probabilities:

$p(A \cap B) = p(A \mid B)\, p(B)$

• Bayes rule:

$p(B \mid A) = \frac{p(A \mid B)\, p(B)}{p(A)}$

• If $\Omega = \bigcup_{i=1}^{K} B_i$ and $B_i \cap B_j = \emptyset$ for all $i \neq j$, then:

$p(A) = \sum_{i=1}^{K} p(A \cap B_i)$


Introduction

Carl Friedrich Gauss introduced the normal distribution in 1809 as a way to rationalize the method of least squares.


What is a Gaussian?

For d dimensions, the Gaussian distribution of a vector $x = (x_1, x_2, \ldots, x_d)^T$ is defined by:

$N(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)$

where $\mu$ is the mean and $\Sigma$ is the covariance matrix of the Gaussian.

Example: $\mu = (0, 0)^T, \quad \Sigma = \begin{pmatrix} 0.25 & 0.30 \\ 0.30 & 1.00 \end{pmatrix}$
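As a quick illustration, here is a minimal NumPy sketch of this density; the function name gaussian_pdf and the test point are ours, not from the tutorial:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate the d-dimensional Gaussian density N(x | mu, Sigma)."""
    d = mu.shape[0]
    diff = x - mu
    # Normalization: (2*pi)^(d/2) * |Sigma|^(1/2)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma))
    # Quadratic form: (x - mu)^T Sigma^{-1} (x - mu)
    quad = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * quad) / norm

# The example above: mu = (0, 0)^T with the 2x2 covariance matrix
mu = np.array([0.0, 0.0])
sigma = np.array([[0.25, 0.30],
                  [0.30, 1.00]])
print(gaussian_pdf(np.array([0.5, -0.5]), mu, sigma))
```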


What is a Gaussian mixture model?

Examples: (figures showing mixtures for d = 1 and d = 2)

The probability density of a mixture of K Gaussians is:

$p(x) = \sum_{j=1}^{K} w_j \cdot N(x \mid \mu_j, \Sigma_j)$

where $w_j$ is the prior probability (weight) of the j-th Gaussian, with

$\sum_{j=1}^{K} w_j = 1 \quad \text{and} \quad 0 \le w_j \le 1$
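To make this concrete, a small sketch evaluating a mixture density with SciPy; the component values are toy numbers we chose for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_pdf(x, weights, means, covs):
    """p(x) = sum_j w_j * N(x | mu_j, Sigma_j)."""
    return sum(w * multivariate_normal.pdf(x, mean=m, cov=c)
               for w, m, c in zip(weights, means, covs))

# Toy 1-D mixture with K = 2 components
weights = [0.3, 0.7]   # sum to 1, each in [0, 1]
means = [0.0, 4.0]
covs = [1.0, 2.0]
print(gmm_pdf(1.5, weights, means, covs))
```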


What is a Gaussian mixture model?

• Problem: given a set $X = \{x_1, x_2, \ldots, x_N\}$ of data drawn from an unknown distribution (probably a GMM), estimate the parameters $\theta$ of the GMM model that fits the data.

• Solution: maximize the likelihood $p(X \mid \theta)$ of the data with regard to the model parameters:

$\theta^* = \arg\max_\theta p(X \mid \theta) = \arg\max_\theta \prod_{i=1}^{N} p(x_i \mid \theta)$
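In practice one maximizes the log-likelihood, which turns the product over points into a sum; a minimal sketch (names are ours):

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(X, weights, means, covs):
    """log p(X | theta) = sum_i log sum_j w_j N(x_i | mu_j, Sigma_j)."""
    return sum(
        np.log(sum(w * multivariate_normal.pdf(x, mean=m, cov=c)
                   for w, m, c in zip(weights, means, covs)))
        for x in X)
```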


The Expectation-Maximization algorithm

• One of the most popular approaches to maximize the likelihood is to use the Expectation-Maximization (EM) algorithm.

• Basic ideas of the EM algorithm:

- Introduce a hidden variable such that its knowledge would simplify the maximization of the likelihood.

- At each iteration:

• E-Step: estimate the distribution of the hidden variable given the data and the current value of the parameters.

• M-Step: maximize the joint distribution of the data and the hidden variable.


The EM for the GMM (graphical view 1)

Hidden variable: for each point, which Gaussian generated it?


The EM for the GMM (graphical view 2)

E-Step: for each point, estimate the probability that each Gaussian generated it.


The EM for the GMM (graphical view 3)

M-Step: modify the parameters according to the hidden variable to maximize the likelihood of the data (and the hidden variable).


General formulation of the EM

(In what follows, the hidden variable is named Z.)

Consider the following "auxiliary" function:

$Q(\theta, \theta^t) = E_Z\left[ \log p(X, Z \mid \theta) \mid X, \theta^t \right]$

It can be shown that the iteration

$\theta^{t+1} = \arg\max_\theta Q(\theta, \theta^t)$

never decreases the likelihood of the data.


Proof of EM convergence

$\begin{aligned}
Q(\theta, \theta^t) &= E_Z\left[ \log p(X, Z \mid \theta) \mid X, \theta^t \right] \\
&= \sum_z p(z \mid X, \theta^t) \log p(X, z \mid \theta) \\
&= \sum_z p(z \mid X, \theta^t) \log\left[ p(z \mid X, \theta)\, p(X \mid \theta) \right] \\
&= \sum_z p(z \mid X, \theta^t) \log p(z \mid X, \theta) + \log p(X \mid \theta) \qquad (1)
\end{aligned}$


Proof of EM convergence

If $\theta = \theta^t$, we have:

$Q(\theta^t, \theta^t) = \sum_z p(z \mid X, \theta^t) \log p(z \mid X, \theta^t) + \log p(X \mid \theta^t) \qquad (2)$

From (1) and (2), we have:

$\log p(X \mid \theta) - \log p(X \mid \theta^t) = Q(\theta, \theta^t) - Q(\theta^t, \theta^t) + \underbrace{\sum_z p(z \mid X, \theta^t) \log\left[ \frac{p(z \mid X, \theta^t)}{p(z \mid X, \theta)} \right]}_{\ge\, 0}$

The last term is a Kullback-Leibler divergence, hence non-negative.

If $Q(\theta, \theta^t)$ increases, then the likelihood $p(X \mid \theta)$ increases.

Conclusion: the maximum of $Q(\theta, \theta^t)$ is the same as the maximum of the likelihood.


EM for the GMM

- Note that if Z is observed, then the parameters can be estimated Gaussian by Gaussian. Unfortunately, Z is not always known.

- Let us write the mixture of Gaussians for a data point $x_i$, $i = 1, \ldots, N$:

$p(x_i) = \sum_{j=1}^{K} w_j \cdot N(x_i \mid \mu_j, \Sigma_j)$

- We introduce the indicator variable:

$z_{ij} = \begin{cases} 1 & \text{if Gaussian } j \text{ emitted } x_i, \\ 0 & \text{otherwise.} \end{cases}$


EM for the GMM

We can now write the joint likelihood of X and Z as follows:

$L(X, Z, \theta) = p(X, Z \mid \theta) = \prod_{i=1}^{N} \prod_{j=1}^{K} \left[ p(x_i, j \mid \theta) \right]^{z_{ij}} = \prod_{i=1}^{N} \prod_{j=1}^{K} \left[ p(x_i \mid j, \theta) \right]^{z_{ij}} \left[ p(j \mid \theta) \right]^{z_{ij}}$

Using the "log" function gives:

$\log p(X, Z \mid \theta) = \sum_{i=1}^{N} \sum_{j=1}^{K} z_{ij} \log\left[ p(x_i \mid j, \theta) \right] + z_{ij} \log\left[ p(j \mid \theta) \right]$


EM for the GMM

The auxiliary function is given by:

$\begin{aligned}
Q(\theta, \theta^t) &= E_Z\left[ \log p(X, Z \mid \theta) \mid \theta^t \right] \\
&= E_Z\left[ \sum_{i=1}^{N} \sum_{j=1}^{K} z_{ij} \log p(x_i \mid j, \theta) + z_{ij} \log p(j \mid \theta) \,\Big|\, \theta^t \right] \\
&= \sum_{i=1}^{N} \sum_{j=1}^{K} E_Z\left[ z_{ij} \mid \theta^t \right] \log p(x_i \mid j, \theta) + E_Z\left[ z_{ij} \mid \theta^t \right] \log p(j \mid \theta)
\end{aligned}$

We then have the E-Step given by:

$E_Z\left[ z_{ij} \mid \theta^t \right] = 1 \times p(z_{ij} = 1 \mid \theta^t) + 0 \times p(z_{ij} = 0 \mid \theta^t) = p(j \mid x_i, \theta^t) = \frac{p(x_i \mid j, \theta^t)\, p(j \mid \theta^t)}{p(x_i \mid \theta^t)}$

(Posterior distribution)
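A minimal NumPy/SciPy sketch of this E-Step, computing the matrix of posteriors (often called responsibilities) $p(j \mid x_i, \theta^t)$; all names are ours:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, weights, means, covs):
    """Return an (N, K) matrix R with R[i, j] = p(j | x_i, theta^t)."""
    N, K = X.shape[0], len(weights)
    R = np.empty((N, K))
    for j in range(K):
        # Bayes-rule numerator: p(x_i | j, theta^t) * p(j | theta^t)
        R[:, j] = weights[j] * multivariate_normal.pdf(X, mean=means[j], cov=covs[j])
    # Divide by p(x_i | theta^t): each row then sums to 1
    R /= R.sum(axis=1, keepdims=True)
    return R
```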


EM for the GMM

And the M-Step is given by:

$\frac{\partial Q(\theta, \theta^t)}{\partial \theta} = 0$

where $\theta = \{ (w_j, \mu_j, \Sigma_j),\ j = 1, \ldots, K \}$ and $\sum_{j=1}^{K} w_j = 1$.

After straightforward manipulations, we obtain:

$\mu_j = \frac{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)\, x_i}{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)}$

$\Sigma_j = \frac{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)\, (x_i - \mu_j)(x_i - \mu_j)^T}{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)}$
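These two updates translate directly into code; a sketch assuming the responsibility matrix R from the E-Step sketch above:

```python
import numpy as np

def m_step_mean_cov(X, R):
    """Update mu_j and Sigma_j from responsibilities R[i, j] = p(j | x_i, theta^t)."""
    N, d = X.shape
    K = R.shape[1]
    Nk = R.sum(axis=0)                      # sum_i p(j | x_i, theta^t), shape (K,)
    means = (R.T @ X) / Nk[:, None]         # responsibility-weighted means, (K, d)
    covs = np.empty((K, d, d))
    for j in range(K):
        diff = X - means[j]
        # sum_i p(j | x_i)(x_i - mu_j)(x_i - mu_j)^T, normalized by Nk[j]
        covs[j] = (R[:, j, None] * diff).T @ diff / Nk[j]
    return means, covs
```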


EM for the GMM

Incorporating the constraint $\sum_{j=1}^{K} w_j = 1$ using a Lagrange multiplier gives:

$J(\theta, \theta^t) = Q(\theta, \theta^t) - \lambda \left( \sum_{j=1}^{K} w_j - 1 \right)$

Then:

$\frac{\partial J(\theta, \theta^t)}{\partial w_j} = \frac{\partial Q(\theta, \theta^t)}{\partial w_j} - \lambda = \frac{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)}{w_j} - \lambda = 0$

which gives:

$w_j = \frac{\sum_{i=1}^{N} p(j \mid x_i, \theta^t)}{\lambda} \qquad (3)$


EM for the GMM

Also, we have:

$\frac{\partial J(\theta, \theta^t)}{\partial \lambda} = \sum_{j=1}^{K} w_j - 1 = 0 \qquad (4)$

From (3) and (4), we obtain:

$\frac{1}{\lambda} \sum_{j=1}^{K} \sum_{i=1}^{N} p(j \mid x_i, \theta^t) = 1$

It follows that $\lambda = N$, and finally:

$w_j = \frac{1}{N} \sum_{i=1}^{N} p(j \mid x_i, \theta^t)$
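Putting the weight, mean, and covariance updates together gives one full EM iteration; a sketch that assumes the e_step and m_step_mean_cov helpers sketched earlier (a fixed iteration count stands in for a proper convergence test):

```python
def em_gmm(X, weights, means, covs, n_iter=100):
    """Alternate the E-Step and M-Step updates derived above."""
    N = X.shape[0]
    for _ in range(n_iter):
        R = e_step(X, weights, means, covs)      # E-Step: responsibilities
        means, covs = m_step_mean_cov(X, R)      # M-Step: mu_j and Sigma_j
        weights = R.sum(axis=0) / N              # M-Step: w_j = (1/N) sum_i p(j | x_i)
    return weights, means, covs
```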


Some issues

1- Initialization:

- EM is an iterative algorithm that is very sensitive to initial conditions: start from trash, end up with trash.

- Usually, we use K-Means to get a good initialization.

2- Number of Gaussians:

- Use information-theoretic criteria to obtain the optimal K. Ex:

$MDL = -\log\left( L(X, Z, \theta) \right) + \frac{1}{2}\left[ (K - 1) + K\left( D + \frac{D(D+1)}{2} \right) \right] \log(N)$

- Other criteria: AIC, BIC, MML, etc.
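As an illustration of criterion-based selection of K, a sketch using BIC as implemented by scikit-learn's GaussianMixture (the library is an assumption of this sketch; MDL could be computed analogously from the fitted log-likelihood):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def pick_k(X, k_max=10):
    """Fit GMMs for K = 1..k_max and keep the K with the lowest BIC."""
    best_k, best_bic = 1, np.inf
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, n_init=5).fit(X)
        bic = gmm.bic(X)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k
```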


Some issues

3- Simplification of the covariance matrices:

Case 1: Spherical covariance matrix

$\Sigma_j = \mathrm{diag}(\sigma_j^2, \sigma_j^2, \ldots, \sigma_j^2) = \sigma_j^2 I$

- Less precise. Very efficient to compute.

Case 2: Diagonal covariance matrix

$\Sigma_j = \mathrm{diag}(\sigma_{j1}^2, \sigma_{j2}^2, \ldots, \sigma_{jd}^2)$

- More precise. Efficient to compute.


Some issues

3- Simplification of the covariance matrices:

Case 3: Full covariance matrix

$\Sigma_j = \begin{pmatrix}
\sigma_{j1}^2 & \mathrm{cov}(x_1, x_2) & \cdots & \mathrm{cov}(x_1, x_d) \\
\mathrm{cov}(x_2, x_1) & \sigma_{j2}^2 & \cdots & \mathrm{cov}(x_2, x_d) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(x_d, x_1) & \mathrm{cov}(x_d, x_2) & \cdots & \sigma_{jd}^2
\end{pmatrix}$

- Very precise. Less efficient to compute.
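For reference, scikit-learn's GaussianMixture exposes exactly this precision/speed trade-off through its covariance_type parameter; a short illustrative sketch on placeholder data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.randn(500, 2)  # placeholder data

# 'spherical' -> sigma_j^2 I, 'diag' -> diagonal Sigma_j, 'full' -> full Sigma_j
for cov_type in ("spherical", "diag", "full"):
    gmm = GaussianMixture(n_components=3, covariance_type=cov_type).fit(X)
    print(cov_type, gmm.covariances_.shape)
```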


Applications of GMM in computer vision

1- Image segmentation:

Each pixel is represented by its color vector $X = (R, G, B)^T$.
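A minimal sketch of this idea: fit a GMM to the pixel colors and label every pixel with its most likely Gaussian (scikit-learn is assumed; the number of segments is an arbitrary choice):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def segment_image(image, n_segments=3):
    """image: (H, W, 3) RGB array -> (H, W) array of segment labels."""
    H, W, _ = image.shape
    X = image.reshape(-1, 3).astype(float)   # one X = (R, G, B)^T per pixel
    gmm = GaussianMixture(n_components=n_segments).fit(X)
    labels = gmm.predict(X)                  # most likely Gaussian for each pixel
    return labels.reshape(H, W)
```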


Applications of GMM in computer vision

2- Object tracking: knowing the moving object's distribution in the first frame, we can localize the object in the next frames by tracking its distribution.


Some references

1. Christopher M. Bishop. Pattern Recognition and Machine Learning, Springer, 2006.

2. Geoffrey McLachlan and David Peel. Finite Mixture Models. Wiley Series in Probability and Statistics, 2000.

3. Samy Bengio. Statistical Machine Learning from Data: Gaussian Mixture Models. http://bengio.abracadoudou.com/lectures/gmm.pdf

4. T. Hastie et al. The Elements of Statistical Learning. Springer, 2009.


Questions?
