37
S. Maarschalkerweerd & A. Tjhang 1 Parameter estimation for Parameter estimation for HMMs, Baum-Welch HMMs, Baum-Welch algorithm, Model topology, algorithm, Model topology, Numerical stability Numerical stability Chapter 3.3-3.7

Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

  • Upload
    ayala

  • View
    29

  • Download
    0

Embed Size (px)

DESCRIPTION

Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability. Chapter 3.3-3.7. Overview last lecture. Hidden Markov Models Different algorithms: Viterbi Forward Backward. Overview today. Parameter estimation for HMMs Baum-Welch algorithm HMM model structure - PowerPoint PPT Presentation

Citation preview

Page 1: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

1

Parameter estimation for HMMs, Parameter estimation for HMMs, Baum-Welch algorithm, Model Baum-Welch algorithm, Model topology, Numerical stability topology, Numerical stability

Chapter 3.3-3.7

Page 2: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

2

Overview last lectureOverview last lecture

Hidden Markov Models Different algorithms:

– Viterbi– Forward– Backward

Page 3: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

3

Overview todayOverview today

Parameter estimation for HMMs– Baum-Welch algorithm

HMM model structureMore complex Markov chainsNumerical stability of HMM algorithms

Page 4: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

4

Specifying a HMM modelSpecifying a HMM model

Most difficult problem using HMMs is specifying the model– Design of the structure– Assignment of parameter values

Page 5: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

5

Specifying a HMM modelSpecifying a HMM model

Most difficult problem using HMMs is specifying the model– Design of the structure– Assignment of parameter values

Page 6: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

6

Parameter estimation for HMMsParameter estimation for HMMs

Estimate transition and emission probabilities akl and ek(b)

Two ways of learning:– Estimation when state sequence is known– Estimation when paths are unknown

Assume that we have a set of example sequences (training sequences x1, …xn)

Page 7: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

7

Parameter estimation for HMMsParameter estimation for HMMs

Assume that x1…xn independent.So P(x1,…,xn | ) = P(xj | )

Since log ab = log a + logb

n

j=1

Page 8: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

8

Estimation when state Estimation when state sequence is knownsequence is known

Easier than estimation when paths unknown

Akl = number of transitions k to l in trainingdata + rkl

Ek(b) = number of emissions of b from k in training data + rk(b)

Page 9: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

9

Estimation when paths are unknownEstimation when paths are unknown

More complex than when paths are knownWe can’t use maximum likelihood

estimators Instead, an iterative algorithm is used

– Baum-Welch

Page 10: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

10

The Baum-Welch algorithmThe Baum-Welch algorithm

We don’t know real values of Akl and Ek(b)

1. Estimate Akl and Ek(b)

2. Update akl and ek(b)

3. Repeat with new model parameters akl and ek(b)

Page 11: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

11

Baum-Welch algorithmBaum-Welch algorithmForward value Backward value

Page 12: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

12

Baum-Welch algorithmBaum-Welch algorithm

Now that we have estimated Akl and Ek(b), use maximum likelihood estimators to compute akl and ek(b)

We use these values to estimate Akl and Ek(b) in the next iteration

Continue doing this iteration until change is very small or max number of iterations is exceeded

Page 13: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

13

Baum-Welch algorithmBaum-Welch algorithm

Page 14: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

14

ExampleExample

Page 15: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

15

Page 16: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

16

DrawbacksDrawbacks

ML estimators– Vulnerable to overfitting if not enough data– Estimations can be undefined if never used in

training set (use pseudocounts)Baum-Welch

– Local maximum instead of global maximum can be found, depending on starting values of parameters

– This problem will be worse for large HMMs

Page 17: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

17

Modelling of labelled sequencesModelling of labelled sequences

Only -- and ++ are calculatedBetter than using ML estimators, when

many different classes are present

Page 18: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

18

Specifying a HMM modelSpecifying a HMM model

Most difficult problem using HMMs is specifying the model– Design of the structure– Assignment of parameter values

Page 19: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

19

Design of the structureDesign of the structure

Design: how to connect states by transitions A good HMM is based on the knowledge about the

problem under investigation Local maxima are biggest disadvantage in models

that are fully connected After deleting a transition from model Baum-Welch

will still work: set transition probability to zero

Page 20: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

20

Example 1Example 1

Geometric distribution

p

1-p

Page 21: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

21

Example 2Example 2

Model distribution of length between 2 and 10

Page 22: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

22

Example 3Example 3

Page 23: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

23

Silent statesSilent states

States that do not emit symbols

Also in other places in HMM

B

Page 24: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

24

ExampleExample

Silent states

Page 25: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

25

Silent statesSilent states

Advantage:– Less estimations of transition probabilities

needed

Drawback:– Limits the possibilities of defining a model

Page 26: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

26

More complex Markov chainsMore complex Markov chains

So far, we assumed that probability of a symbol in a sequence depends only on the probability of the previous symbol

More complex – High order Markov chains– Inhomogeneous Markov chains

Page 27: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

27

High order Markov chainsHigh order Markov chains

An nth order Markov process

Probability of a symbol in a sequence depends on the probability of the previous n symbols

An nth order Markov chain over some alphabet A is equivalent to a first order Markov chain over the alphabet An of n-tuples, because

P(AB|B) = P(A|B)

Page 28: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

28

ExampleExample

A second order Markov chain with two different symbols {A,B}

This can be translated into a first order Markov chain of 2-tuples {AA, AB, BA, BB}

Sometimes the framework of high order model is convenient

Page 29: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

29

Gene candidates in DNA:

-sequence of triplets of nucleotides:

startcodon nr. of non-stopcodons stopcodon

-open reading frame (ORF)An ORF can be either a gene or a non-

coding ORF (NORF)

Finding prokaryotic genesFinding prokaryotic genes

Page 30: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

30

Finding prokaryotic genesFinding prokaryotic genes

Experiment: – DNA from bacterium E.coli– Dataset contains 1100 genes (900 used for

training, 200 for testing)

Two models:– Normal model with first order Markov chains– Also first order Markov chains, but codons

instead of nucleotides are used as symbol

Page 31: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

31

Finding prokaryotic genesFinding prokaryotic genes

Outcomes:

Page 32: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

32

Inhomogeneous Markov chainsInhomogeneous Markov chains

Using the position information in the codon– Three models for position 1, 2 and 3

CAT GCA

P(C)aCA aAT aTG aGC aCA P(C)a2CA a3

AT a1TG a2

GC a3CA

Homogeneous Inhomogeneous

1 2 3 1 2 3

Page 33: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

33

Numerical Stability of HMM Numerical Stability of HMM algorithmsalgorithms

Multiplying many probabilities can cause numerical problems:– Underflow errors– Wrong numbers are calculated

Solutions:– Log transformation– Scaling of probabilities

Page 34: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

34

The log transformationThe log transformation

Compute log probabilities– Log 10-100000 = -100000– Underflow problem is essentially solved

Sum operation is often faster than product operation

In the Viterbi algorithm:

Page 35: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

35

Scaling of probabilitiesScaling of probabilities

Scale f and b variablesForward variable:

– For each i a scaling variable si is defined

– New f variables are defined:

– New forward recursion:

Page 36: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

36

Scaling of probabilitiesScaling of probabilities

Backward variable– Scaling has to be with same numbers as forward

variable– New backward recursion:

This normally works well, however underflow errors can still occur in models with many silent states (chapter 5)

Page 37: Parameter estimation for HMMs, Baum-Welch algorithm, Model topology, Numerical stability

S. Maarschalkerweerd & A. Tjhang

37

SummarySummary

Hidden Markov ModelsParameter estimation

– State sequence known – State sequence unknown

Model structure– Silent states

More complex Markov chainsNumerical stability