Gaussian Mixture Models and Acoustic Modeling
Lecture 9, Spoken Language Processing
Prof. Andrew Rosenberg


Page 1: Gaussian Mixture Models and Acoustic Modeling

Gaussian Mixture Models and Acoustic Modeling

Lecture 9
Spoken Language Processing

Prof. Andrew Rosenberg

Page 2: Gaussian Mixture Models and Acoustic Modeling

Acoustic Modeling

• The goal of the Acoustic Model is to hypothesize a phone label based on acoustic observations.
  – The phone label will be defined by the phone inventory (e.g., IPA, ARPAbet, etc.).
  – The acoustic observations will be MFCCs.
    • There are other options.

Page 3: Gaussian Mixture Models and Acoustic Modeling


Gaussian Mixture Model

Page 4: Gaussian Mixture Models and Acoustic Modeling

Mixture Models

• A Mixture Model is a weighted sum of a number of pdfs, where the weights are determined by a multinomial distribution, π.
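The slide's equation is an image in the original; the standard form of a mixture density, consistent with the description above, is:

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, p_k(x),
\qquad \sum_{k=1}^{K} \pi_k = 1, \quad \pi_k \ge 0
```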

Page 5: Gaussian Mixture Models and Acoustic Modeling

Gaussian Mixture Model

• GMM: a weighted sum of a number of Gaussians, where the weights are determined by a multinomial, π.
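The slide's formula is not reproduced in this transcription; in the Gaussian case the standard form is:

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)
```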

Page 6: Gaussian Mixture Models and Acoustic Modeling

Visualizing a GMM

Page 7: Gaussian Mixture Models and Acoustic Modeling

Latent Variable Representation

• The mixture coefficients can be viewed as a latent or unobserved variable.
• Training a GMM involves learning both the parameters of the individual Gaussians and the mixture coefficients.
• For a fixed set of data points x, the GMM parameters may not have a single optimal setting.

Page 8: Gaussian Mixture Models and Acoustic Modeling

Maximum Likelihood Optimization

• Likelihood Function
• Log likelihood
• A log transform makes the optimization much simpler.
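The two equations on this slide are images in the original; for N data points the standard forms are:

```latex
p(X \mid \pi, \mu, \Sigma) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)

\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)
```

The log turns the product over data points into a sum, which is far easier to differentiate.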

Page 9: Gaussian Mixture Models and Acoustic Modeling

Optimizing GMM parameters

• Identifying the optimal parameters involves setting the partial derivatives of the likelihood function to zero.
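Setting the partial derivative with respect to μ_k to zero yields the standard result (the slide's derivation is an image here), where γ(z_nk) is the responsibility component k takes for point x_n:

```latex
\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n,
\qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})
```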

Page 10: Gaussian Mixture Models and Acoustic Modeling

Optimizing GMM parameters

• Covariance Optimization
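The covariance update that results (standard form; the slide's equation is an image in the original) is:

```latex
\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, (x_n - \mu_k)(x_n - \mu_k)^{\mathsf{T}}
```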

Page 11: Gaussian Mixture Models and Acoustic Modeling

Optimizing GMM parameters

• Mixture Term
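Optimizing the mixing coefficients under the constraint that they sum to one (via a Lagrange multiplier) gives the standard update:

```latex
\pi_k = \frac{N_k}{N}, \qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})
```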

Page 12: Gaussian Mixture Models and Acoustic Modeling


Maximum Likelihood Estimate

Page 13: Gaussian Mixture Models and Acoustic Modeling

What’s the problem?

• Circularity: the responsibilities are assigned by the GMM parameters, and are then used in identifying their optimal settings.
  – The Maximum Likelihood function of the GMM does not have a closed form optimization for all three variables.
• Expectation Maximization:
  – Keep one variable fixed, optimize the other.
  – Here,
    • fix the responsibility terms, optimize the GMM parameters,
    • then fix the GMM parameters, and optimize the responsibilities.

Page 14: Gaussian Mixture Models and Acoustic Modeling

Expectation Maximization for GMMs

• Initialize the parameters
  – Evaluate the log likelihood
• Expectation step: Evaluate the responsibilities
• Maximization step: Re-estimate parameters
  – Evaluate the log likelihood
  – Check for convergence
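The steps above can be sketched in the one-dimensional case. This is a minimal pure-Python sketch, not the lecture's implementation; real acoustic models use multivariate Gaussians over MFCC vectors, and the initialization here (means spread over the data range) is one simple choice among many:

```python
import math

def gauss(x, mu, var):
    # 1-D Gaussian density N(x | mu, var)
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, k, iters=50):
    # Initialize: means spread over the data range, shared variance, uniform weights
    mn, mx = min(data), max(data)
    mean = sum(data) / len(data)
    mu = [mn + (mx - mn) * j / (k - 1) for j in range(k)] if k > 1 else [mean]
    gv = sum((x - mean) ** 2 for x in data) / len(data)
    var = [gv] * k
    pi = [1.0 / k] * k
    for _ in range(iters):
        # E-step: evaluate the responsibilities gamma[n][j]
        gamma = []
        for x in data:
            p = [pi[j] * gauss(x, mu[j], var[j]) for j in range(k)]
            s = sum(p)
            gamma.append([pj / s for pj in p])
        # M-step: re-estimate parameters from the responsibilities
        for j in range(k):
            nj = sum(g[j] for g in gamma)
            mu[j] = sum(g[j] * x for g, x in zip(gamma, data)) / nj
            var[j] = sum(g[j] * (x - mu[j]) ** 2 for g, x in zip(gamma, data)) / nj
            var[j] = max(var[j], 1e-6)  # variance floor guards against singularities
            pi[j] = nj / len(data)
    return pi, mu, var
```

In practice the log likelihood is also evaluated after each M-step, and the loop stops when its improvement falls below a threshold, as the slide describes.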

Page 15: Gaussian Mixture Models and Acoustic Modeling

E-M for Gaussian Mixture Models

• Initialize the parameters
  – Evaluate the log likelihood
• Expectation step: Evaluate the responsibilities
• Maximization step: Re-estimate parameters
  – Evaluate the log likelihood
  – Check for convergence

Page 16: Gaussian Mixture Models and Acoustic Modeling

EM for GMMs

• E-step: Evaluate the Responsibilities
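The responsibility evaluated here (the slide's equation is an image in the original) has the standard form:

```latex
\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
                     {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}
```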

Page 17: Gaussian Mixture Models and Acoustic Modeling

EM for GMMs

• M-Step: Re-estimate Parameters

Page 18: Gaussian Mixture Models and Acoustic Modeling

Visual example of EM

Page 19: Gaussian Mixture Models and Acoustic Modeling

Potential Problems

• Incorrect number of Mixture Components
• Singularities

Page 20: Gaussian Mixture Models and Acoustic Modeling

Incorrect Number of Gaussians

Page 21: Gaussian Mixture Models and Acoustic Modeling

Incorrect Number of Gaussians

Page 22: Gaussian Mixture Models and Acoustic Modeling

Singularities

• A minority of the data can have a disproportionate effect on the model likelihood.
• For example…

Page 23: Gaussian Mixture Models and Acoustic Modeling

GMM example

Page 24: Gaussian Mixture Models and Acoustic Modeling

Singularities

• When a mixture component collapses onto a single point, the mean becomes that point and the variance goes to zero.
• Consider the likelihood function as the covariance goes to zero.
• The likelihood approaches infinity.
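To see why, consider a component whose mean sits exactly on a data point x_n with an isotropic covariance σ_k²I in D dimensions; its contribution to the likelihood grows without bound:

```latex
\mathcal{N}(x_n \mid \mu_k = x_n,\; \sigma_k^2 I) = \frac{1}{(2\pi)^{D/2} \, \sigma_k^{D}} \;\longrightarrow\; \infty \quad \text{as } \sigma_k \to 0
```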

Page 25: Gaussian Mixture Models and Acoustic Modeling

Training acoustic models

• TIMIT
  – close, manual phonetic transcription
  – 2342 sentences
• Extract MFCC vectors from each frame within each phone.
• For each phone, train a GMM using Expectation Maximization.
• These GMMs are the Acoustic Model.
  – It is common to use 8 or 16 Gaussian mixture components.
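Once per-phone GMMs are trained, the acoustic model hypothesizes a phone by scoring a frame's features under each phone's mixture and taking the argmax. A minimal sketch, with hypothetical one-dimensional parameters standing in for 39-dimensional MFCC models (`models`, `classify`, and all parameter values here are illustrative, not from the lecture):

```python
import math

def gauss(x, mu, var):
    # 1-D Gaussian density N(x | mu, var)
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_loglik(x, gmm):
    # log p(x | phone) under a 1-D mixture given as [(weight, mean, var), ...]
    return math.log(sum(w * gauss(x, m, v) for w, m, v in gmm))

# Hypothetical per-phone mixtures; a real system would hold 8 or 16
# multivariate components per phone, trained with EM on TIMIT frames.
models = {
    "aa": [(0.5, 1.0, 0.4), (0.5, 1.8, 0.3)],
    "iy": [(0.6, -1.2, 0.5), (0.4, -0.4, 0.2)],
}

def classify(x):
    # hypothesize the phone label whose GMM gives the highest likelihood
    return max(models, key=lambda p: gmm_loglik(x, models[p]))
```

This per-frame argmax is exactly what the next slide's sequential models refine: it ignores continuity between neighboring frames.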

Page 26: Gaussian Mixture Models and Acoustic Modeling

Sequential Models

• Make a prediction every frame.
• How often can phones change?
• Encourage continuity in predictions.
• Model phone transitions.

Page 27: Gaussian Mixture Models and Acoustic Modeling

Next Class

• Hidden Markov Models
• Reading: J&M 5.5, 9.2