Fundamentals of Hidden Markov Model
Mehmet Yunus Dönmez
Markov Random Processes
A random sequence has the Markov property if its distribution is determined solely by its current state. Any random process having this property is called a Markov random process.
For observable state sequences (the state is known from data), this leads to a Markov chain model.
For non-observable states, this leads to a Hidden Markov Model (HMM).
HMM Elements
An HMM for discrete symbol observations is specified by:
- N : the number of states in the model; the states are labeled $\{1, 2, \ldots, N\}$, and $q_t$ denotes the state at time t
- M : the number of distinct observation symbols per state; the symbol alphabet is $V = \{v_1, v_2, \ldots, v_M\}$
HMM Elements (2)
- A : the state-transition probability distribution, $a_{ij} = p[q_{t+1} = j \mid q_t = i], \quad 1 \le i, j \le N$
- B : the observation symbol probability distribution, $b_j(k) = p[o_t = v_k \mid q_t = j], \quad 1 \le k \le M$
HMM Elements (3)
- $\pi$ : the initial state distribution, $\pi_i = p[q_1 = i], \quad 1 \le i \le N$
Compact notation of an HMM model: $\lambda = (A, B, \pi)$
For a fixed state sequence, the probability of the observations factorizes as
$p(o_1, o_2, \ldots, o_T \mid s_{q_1}, s_{q_2}, \ldots, s_{q_T}, \lambda) = b_{q_1}(o_1)\, a_{q_1 q_2}\, b_{q_2}(o_2)\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}\, b_{q_T}(o_T)$
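As an illustration (not part of the original slides), the compact notation maps directly onto arrays. A minimal sketch in Python/NumPy, with a made-up 2-state, 3-symbol model:

```python
# A minimal sketch of lambda = (A, B, pi) as NumPy arrays.
# The 2-state, 3-symbol numbers are invented for illustration.
import numpy as np

N, M = 2, 3                      # number of states, number of symbols
A = np.array([[0.7, 0.3],        # A[i, j] = p[q_{t+1} = j | q_t = i]
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],   # B[j, k] = b_j(k) = p[o_t = v_k | q_t = j]
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # pi[i] = p[q_1 = i]

# Every row of A and B is a probability distribution, so rows sum to 1.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)
```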
A General Case HMM
HMM Generator
1. Choose an initial state $q_1 = i$ according to the initial state distribution $\pi$
2. Set $t = 1$
3. Choose $o_t = v_k$ according to the symbol probability distribution in state $i$, $b_i(k)$
4. Transit to a new state $q_{t+1} = j$ according to the state-transition probability distribution $a_{ij}$
5. Set $t = t + 1$; return to step 3 if $t < T$; otherwise, terminate the procedure
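A runnable sketch of this generator, assuming the array representation of $\lambda$ introduced above (states and observations are returned as integer indices):

```python
import numpy as np

def generate(A, B, pi, T, seed=0):
    """Sample a state path and an observation sequence of length T."""
    rng = np.random.default_rng(seed)
    N, M = B.shape
    q = rng.choice(N, p=pi)                  # step 1: q_1 from pi
    states, obs = [], []
    for _ in range(T):                       # steps 2-5: t = 1 .. T
        states.append(q)
        obs.append(rng.choice(M, p=B[q]))    # step 3: o_t = v_k from b_q(.)
        q = rng.choice(N, p=A[q])            # step 4: q_{t+1} = j from a_q.
    return states, obs
```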
HMM Properties
Often simplified: $\pi(s_1) = 1$ and $\pi(s_i) = 0$ for $i \neq 1$ (the model always starts in state 1)
Obviously, $\sum_{j=1}^{N} a_{ij} = 1$ for all $i$
Discrete HMMs : $V = \{v_1, v_2, \ldots, v_M\}$
Continuous HMMs : $V \subseteq \mathbb{R}^d$
HMM Properties (2)
The term hidden:
- we only have access to the visible symbols (observations)
- we draw conclusions without knowing the hidden sequence of states
Causal: probabilities depend only on previous states
Ergodic: every state can be visited in a transition sequence, for any given initial state
Final or absorbing state: a state which, once entered, is never left
3 Basic Problems
The Evaluation Problem
- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \ldots, o_T$
- compute the probability of the observation: $p(o_1, o_2, \ldots, o_T \mid \lambda)$
3 Basic Problems (2)
The Decoding Problem
- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \ldots, o_T$
- compute the most likely state sequence $s_{q_1}, s_{q_2}, \ldots, s_{q_T}$, i.e.
$\arg\max_{q_1, \ldots, q_T} p(o_1, o_2, \ldots, o_T \mid q_1, \ldots, q_T, \lambda)$
3 Basic Problems (3)
The Learning / Optimization Problem
- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \ldots, o_T$
- find an HMM $\lambda_1$ such that $p\{o_1, o_2, \ldots, o_T \mid \lambda_1\} \ge p\{o_1, o_2, \ldots, o_T \mid \lambda\}$
The Evaluation Problem
We know (multiplying the state-path prior by the observation probabilities along the path):
$p(o_1, o_2, \ldots, o_T, s_{q_1}, \ldots, s_{q_T} \mid \lambda) = \pi(s_{q_1})\, b_{q_1}(o_1) \prod_{k=1}^{T-1} a_{q_k q_{k+1}}\, b_{q_{k+1}}(o_{k+1})$
From this, summing over all possible state sequences:
$p(o_1, o_2, \ldots, o_T \mid \lambda) = \sum_{q_1 = 1}^{N} \cdots \sum_{q_T = 1}^{N} \pi(s_{q_1})\, b_{q_1}(o_1) \prod_{k=1}^{T-1} a_{q_k q_{k+1}}\, b_{q_{k+1}}(o_{k+1})$
The Evaluation Problem(2)
Obvious: for sufficiently large values of T, it is infeasible to compute the above term for all possible state sequences (there are $N^T$ of them) → we need another solution
The Forward Algorithm
At time t and state i, the probability of the partial observation sequence $o_1, o_2, \ldots, o_t$ : $\alpha_t(i)$, stored as an array $\alpha[\text{time}][\text{state}]$
Initialization: $\alpha_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$
Induction: $\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1})$
The Forward Algorithm (2)
As a result, at the last time T:
$p(o_1, o_2, \ldots, o_T \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$
(in array form: the sum of $\alpha[T][\text{state}]$ over all states)
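A sketch of the forward recursion under the same array conventions (a real implementation would rescale $\alpha$ per time step to avoid numerical underflow on long sequences; that is omitted here):

```python
import numpy as np

def forward(A, B, pi, obs):
    """alpha[t, i]: probability of o_1..o_{t+1} ending in state i (t 0-indexed)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                     # initialization
    for t in range(1, T):                            # induction
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

# p(o_1, ..., o_T | lambda) = alpha[-1].sum()
```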
[Figure: illustration of the forward recursion]
The Backward Algorithm
$\beta_t(i)$ : the probability of the partial observation sequence $o_{t+1}, o_{t+2}, \ldots, o_T$, given state i at time t
Initialization: $\beta_T(i) = 1$
Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \ldots, 1$
Termination: $p(o_1, o_2, \ldots, o_T \mid \lambda) = \sum_{j=1}^{N} \pi_j\, b_j(o_1)\, \beta_1(j)$
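The mirror-image sketch of the backward recursion; the termination line doubles as a consistency check against the forward result:

```python
import numpy as np

def backward(A, B, obs):
    """beta[t, i]: probability of o_{t+2}..o_T given state i at time t+1."""
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                     # beta_T(i) = 1
    for t in range(T - 2, -1, -1):                     # t = T-1, ..., 1
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

# Consistency check: (pi * B[:, obs[0]] * backward(A, B, obs)[0]).sum()
# should equal forward(A, B, pi, obs)[-1].sum()
```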
[Figure: illustration of the backward recursion]
The Decoding Problem
Finding the optimal state sequence associated with the given observation sequence
Forward-Backward
Optimality criterion: choose the states $q_t$ that are individually most likely at each time t
The probability of being in state i at time t:
$\gamma_t(i) = p(q_t = i \mid O, \lambda) = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i)}$
$\alpha_t(i)$ : accounts for the partial observation sequence $o_1, o_2, \ldots, o_t$
$\beta_t(i)$ : accounts for the remainder $o_{t+1}, o_{t+2}, \ldots, o_T$
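Given the forward and backward sketches above, $\gamma$ is just the elementwise product of $\alpha$ and $\beta$, normalized per time step; the individually most likely states are then a row-wise argmax:

```python
def posterior(alpha, beta):
    """gamma[t, i] = p(q_t = i | O, lambda) from forward/backward arrays."""
    g = alpha * beta
    return g / g.sum(axis=1, keepdims=True)   # normalize over states i

# Individually most likely state at each time t:
# posterior(alpha, beta).argmax(axis=1)
```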
The Viterbi Algorithm
$\delta_t(i)$ : the best score along a single path, at time t, which accounts for the first t observations and ends in state i
$\delta_{t+1}(j) = \left[ \max_i \delta_t(i)\, a_{ij} \right] b_j(o_{t+1})$
Keep track of the argument that maximizes the above equation: $\psi_{t+1}(j)$
The Viterbi algorithm is similar in implementation to the forward calculation; the major difference is the maximization over previous states
The Complete Procedure (for finding the best state sequence)
Initialization:
$\delta_1(i) = \pi_i\, b_i(o_1), \quad \psi_1(i) = 0, \quad 1 \le i \le N$
Recursion:
$\delta_t(j) = \max_{1 \le i \le N} [\delta_{t-1}(i)\, a_{ij}]\, b_j(o_t)$
$\psi_t(j) = \arg\max_{1 \le i \le N} [\delta_{t-1}(i)\, a_{ij}]$
for $2 \le t \le T$, $1 \le j \le N$
The Complete Procedure (2) (for finding the best state sequence)
Termination:
$P^* = \max_{1 \le i \le N} [\delta_T(i)]$
$q_T^* = \arg\max_{1 \le i \le N} [\delta_T(i)]$
Path (state sequence) backtracking:
$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1$
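A sketch of the full Viterbi procedure (initialization, recursion, termination, backtracking) under the same array conventions; $\psi$ is kept as an integer array for the backtracking step:

```python
import numpy as np

def viterbi(A, B, pi, obs):
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))            # best single-path score per state
    psi = np.zeros((T, N), dtype=int)   # argmax bookkeeping
    delta[0] = pi * B[:, obs[0]]        # initialization (psi_1 = 0)
    for t in range(1, T):               # recursion
        scores = delta[t - 1][:, None] * A        # delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]    # termination: q*_T
    for t in range(T - 1, 0, -1):       # backtracking: q*_t = psi_{t+1}(q*_{t+1})
        path.append(int(psi[t][path[-1]]))
    return list(reversed(path)), delta[-1].max()   # state sequence, P*
```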
The Learning / Optimization Problem
How do we adjust the model parameters $\lambda$ to maximize $P(O \mid \lambda)$?
- Parameter estimation
- Baum-Welch algorithm (EM: Expectation-Maximization)
- Iterative procedure
Parameter Estimation
Probability of being in state i at time t, and state j at time t+1:
$\xi_t(i, j) = p(q_t = i, q_{t+1} = j \mid O, \lambda) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}$
[Figure: illustration of the computation of $\xi_t(i, j)$]
Parameter Estimation (2)
Probability of being in state i at time t, given the entire observation sequence and the model: $\gamma_t(i)$
We can relate these by summing over j:
$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$
Parameter Estimation (4)
Update $\lambda = (A, B, \pi)$ using $\xi_t(i, j)$ and $\gamma_t(i)$:
$\bar{\pi}_i = \gamma_1(i)$ : the expected frequency (number of times) in state i at time t = 1
Parameter Estimation (5)
New transition probability:
$\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)} = \frac{\text{expected number of transitions from state } i \text{ to } j}{\text{expected number of transitions from state } i}$
Parameter Estimation (6)
New observation probability:
$\bar{b}_j(k) = \frac{\sum_{t=1,\ \text{s.t.}\ o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)} = \frac{\text{expected number of times in state } j \text{ observing symbol } v_k}{\text{expected number of times in state } j}$
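Putting $\xi$, $\gamma$, and the three update formulas together, one re-estimation step might look like the following sketch, reusing the forward and backward functions from above (again without the numerical scaling a production implementation would need):

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One Baum-Welch re-estimation step for a single observation sequence."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # xi[t, i, j] = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j), normalized
    xi = (alpha[:-1, :, None] * A[None] *
          (B[:, obs[1:]].T * beta[1:])[:, None, :])
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    new_pi = gamma[0]                                   # expected freq at t=1
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):                         # only t with o_t = v_k
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_A, new_B, new_pi
```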
Parameter Estimation (7)
From $\lambda = (A, B, \pi)$, if we define a new model $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$:
- the new model is more likely than the old model, in the sense that $P(O \mid \bar{\lambda}) \ge P(O \mid \lambda)$
- the observation sequence is more likely to be produced by the new model
- this has been proved by Baum and his colleagues
- iteratively using the new model in place of the old model and repeating the re-estimation calculation yields an ML estimate
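Iterating the step above until the likelihood stops improving gives the ML training loop the slide describes; a sketch, assuming the baum_welch_step and forward functions defined earlier:

```python
import numpy as np

def train(A, B, pi, obs, max_iters=50, tol=1e-6):
    """Repeat re-estimation until p(O | lambda) stops improving."""
    prev_ll = -np.inf
    for _ in range(max_iters):
        A, B, pi = baum_welch_step(A, B, pi, obs)
        ll = np.log(forward(A, B, pi, obs)[-1].sum())   # log p(O | lambda)
        if ll - prev_ll < tol:     # monotone improvement per Baum's theorem
            break
        prev_ll = ll
    return A, B, pi
```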
Questions??