Fundamentals of Hidden Markov Model


Mehmet Yunus Dönmez


    Markov Random Processes

A random sequence has the Markov property if its distribution is determined solely by its current state. Any random process having this property is called a Markov random process.

For observable state sequences (the state is known from the data), this leads to a Markov chain model.

For non-observable states, this leads to a Hidden Markov Model (HMM).


    HMM Elements

An HMM for discrete symbol observations:

- $N$: the number of states in the model; the states are labeled $\{1, 2, \dots, N\}$, and $q_t$ denotes the state at time $t$.

- $M$: the number of distinct observation symbols per state, with symbol alphabet $V = \{v_1, v_2, \dots, v_M\}$.


    HMM Elements (2)

- $A = \{a_{ij}\}$: the state-transition probability distribution, where

  $a_{ij} = p[q_{t+1} = j \mid q_t = i], \quad 1 \le i, j \le N$

- $B = \{b_j(k)\}$: the observation symbol probability distribution, where

  $b_j(k) = p[o_t = v_k \mid q_t = j], \quad 1 \le k \le M$


    HMM Elements (3)

- $\pi = \{\pi_i\}$: the initial state distribution, where

  $\pi_i = p[q_1 = i], \quad 1 \le i \le N$

Compact notation for an HMM model:

$\lambda = (A, B, \pi)$

For a state sequence $q_1, q_2, \dots, q_T$, the joint probability of the observations and the state sequence factorizes as

$p(o_1, o_2, \dots, o_T, q_1, q_2, \dots, q_T \mid \lambda) = \pi_{q_1}\, b_{q_1}(o_1)\, a_{q_1 q_2}\, b_{q_2}(o_2)\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}\, b_{q_T}(o_T)$
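To make the compact notation concrete, here is a minimal sketch in Python/NumPy of a hypothetical two-state, three-symbol HMM; the numbers are invented purely for illustration and are not from the slides.

```python
import numpy as np

# Hypothetical toy HMM with N = 2 states and M = 3 observation symbols
# (values made up for illustration only).
N, M = 2, 3

# pi[i] = P(q_1 = i): initial state distribution
pi = np.array([0.6, 0.4])

# A[i, j] = a_ij = P(q_{t+1} = j | q_t = i): state-transition probabilities
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# B[j, k] = b_j(k) = P(o_t = v_k | q_t = j): observation symbol probabilities
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])

# The compact notation lambda = (A, B, pi) is just this triple of arrays.
# Each distribution must sum to 1:
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```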


    A General Case HMM


    HMM Generator

An observation sequence $o_1, o_2, \dots, o_T$ can be generated as follows (a minimal sampling sketch follows below):

1. Choose an initial state $q_1 = i$ according to the initial state distribution $\pi$.
2. Set $t = 1$.
3. Choose $o_t = v_k$ according to the symbol probability distribution in state $i$, i.e. $b_i(k)$.
4. Transit to a new state $q_{t+1} = j$ according to the state-transition probability distribution for state $i$, i.e. $a_{ij}$.
5. Set $t = t + 1$; return to step 3 if $t \le T$; otherwise, terminate the procedure.
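The generation procedure can be sketched directly with NumPy. This assumes the toy pi, A, B arrays from the earlier example; the function name generate_sequence is ours, not from the slides.

```python
import numpy as np

def generate_sequence(pi, A, B, T, rng=None):
    """Sample T hidden states and T observation symbols from an HMM (a sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    N, M = B.shape
    states, observations = [], []
    # Steps 1-2: choose the initial state from pi, set t = 1.
    state = rng.choice(N, p=pi)
    for _ in range(T):
        states.append(state)
        # Step 3: emit a symbol according to b_state(.).
        observations.append(rng.choice(M, p=B[state]))
        # Step 4: move to the next state according to row `state` of A.
        state = rng.choice(N, p=A[state])
    # Step 5 stops the loop after T observations.
    return np.array(states), np.array(observations)
```

For example, generate_sequence(pi, A, B, T=10) returns ten hidden states together with the ten symbols they emitted; only the symbols would be visible to an observer.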


    HMM Properties

The initial distribution is often simplified to $\pi(s_1) = 1$ and $\pi(s_i) = 0$ for $i \ne 1$, i.e. the model always starts in state $s_1$.

Obviously $\sum_{j} a_{ij} = 1$ for all $i$.

Discrete HMMs: $V = \{v_1, v_2, \dots, v_M\}$

Continuous HMMs: $V = \mathbb{R}^d$


    HMM Properties (2)

The term "hidden":

- we only have access to the visible symbols (observations)
- we must draw conclusions without knowing the hidden sequence of states

Causal: probabilities depend only on previous states.

Ergodic: every state can be visited in a transition sequence for any given initial state.

Final or absorbing state: a state which, once entered, is never left.


    3 Basic Problems

    The Evaluation Problem

- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \dots, o_T$
- compute the probability of the observation sequence, $p(o_1, o_2, \dots, o_T \mid \lambda)$


    3 Basic Problems (2)

    The Decoding Problem

- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \dots, o_T$
- compute the most likely state sequence $s_{q_1}, s_{q_2}, \dots, s_{q_T}$, i.e.

$\arg\max_{q_1, \dots, q_T} p(o_1, o_2, \dots, o_T, q_1, \dots, q_T \mid \lambda)$


    3 Basic Problems (3)

    The learning / optimization problem

- given an HMM $\lambda$
- given an observation sequence $o_1, o_2, \dots, o_T$
- find a new HMM $\lambda_1$ such that

$p(o_1, o_2, \dots, o_T \mid \lambda_1) \ge p(o_1, o_2, \dots, o_T \mid \lambda)$


    The Evaluation Problem

We know that, for a given state sequence $q_1, q_2, \dots, q_T$,

$p(o_1, o_2, \dots, o_T, q_1, \dots, q_T \mid \lambda) = \pi_{q_1}\, b_{q_1}(o_1) \prod_{k=2}^{T} a_{q_{k-1} q_k}\, b_{q_k}(o_k)$

From this, summing over all possible state sequences:

$p(o_1, o_2, \dots, o_T \mid \lambda) = \sum_{q_1 = 1}^{N} \sum_{q_2 = 1}^{N} \cdots \sum_{q_T = 1}^{N} \pi_{q_1}\, b_{q_1}(o_1) \prod_{k=2}^{T} a_{q_{k-1} q_k}\, b_{q_k}(o_k)$


The Evaluation Problem (2)

Obvious: for sufficiently large values of $T$, it is infeasible to compute the term above over all $N^T$ possible state sequences, so a more efficient solution is needed.
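To see the cost concretely, the sum over all $N^T$ state sequences can be written out literally. This brute-force sketch (our own helper, usable only for tiny $T$) assumes the toy arrays from the earlier examples and 0-based symbol indices.

```python
import itertools
import numpy as np

def evaluate_brute_force(obs, pi, A, B):
    """p(o_1..o_T | lambda) by enumerating every state sequence: O(N^T) work."""
    N, T = len(pi), len(obs)
    total = 0.0
    for q in itertools.product(range(N), repeat=T):
        # pi_{q_1} b_{q_1}(o_1) a_{q_1 q_2} b_{q_2}(o_2) ... a_{q_{T-1} q_T} b_{q_T}(o_T)
        p = pi[q[0]] * B[q[0], obs[0]]
        for t in range(1, T):
            p *= A[q[t - 1], q[t]] * B[q[t], obs[t]]
        total += p
    return total
```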


    The Forward Algorithm

$\alpha_t(i)$: the probability of the partial observation sequence $o_1, o_2, \dots, o_t$ and state $i$ at time $t$, stored as an array $\alpha[\text{time}][\text{state}]$.

Initialization:

$\alpha_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$

Induction:

$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1})$


    The Forward Algorithm (2)

As a result, at the last time $T$:

$p(o_1, o_2, \dots, o_T \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$

i.e. the sum of the last row $\alpha[T][\cdot]$ of the array.
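A minimal NumPy sketch of the forward recursion, storing alpha as a [time][state] array as the slides suggest (the function name forward is ours; time indices are 0-based):

```python
import numpy as np

def forward(obs, pi, A, B):
    """alpha[t, i] = p(o_1..o_{t+1}, state i at time t+1 | lambda), 0-based t."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha[0] = pi * B[:, obs[0]]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

# Termination: p(O | lambda) = sum_i alpha_T(i)
# prob = forward(obs, pi, A, B)[-1].sum()
```

On short sequences this agrees with the brute-force enumeration above, but its cost is on the order of $N^2 T$ rather than $N^T$.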


    The Backward Algorithm

$\beta_t(i)$: the probability of the partial observation sequence $o_{t+1}, o_{t+2}, \dots, o_T$, given state $i$ at time $t$.

Initialization:

$\beta_T(i) = 1$

Induction:

$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \dots, 1$

Termination:

$p(o_1, o_2, \dots, o_T \mid \lambda) = \sum_{j=1}^{N} \pi_j\, b_j(o_1)\, \beta_1(j)$
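A matching sketch of the backward recursion (again 0-based in time); the commented termination line reproduces $p(O \mid \lambda)$ and can be checked against the forward result:

```python
import numpy as np

def backward(obs, A, B):
    """beta[t, i] = p(o_{t+2}..o_T | state i at time t+1, lambda), 0-based t."""
    N, T = A.shape[0], len(obs)
    beta = np.zeros((T, N))
    # Initialization: beta_T(i) = 1
    beta[T - 1] = 1.0
    # Induction: beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

# Termination check: p(O | lambda) = sum_j pi_j * b_j(o_1) * beta_1(j)
# prob = (pi * B[:, obs[0]] * backward(obs, A, B)[0]).sum()
```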


    The Decoding Problem

Finding the optimal state sequence associated with the given observation sequence.


    Forward-Backward

Optimality criterion: choose the states $q_t$ that are individually most likely at each time $t$.

The probability of being in state $i$ at time $t$:

$\gamma_t(i) = p(q_t = i \mid O, \lambda) = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i)}$

$\alpha_t(i)$ accounts for the partial observation sequence $o_1, o_2, \dots, o_t$; $\beta_t(i)$ accounts for the remainder $o_{t+1}, o_{t+2}, \dots, o_T$.
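Gamma follows from alpha and beta with one normalization; a sketch assuming the forward and backward functions from the earlier examples (the name state_posteriors is ours):

```python
import numpy as np

def state_posteriors(alpha, beta):
    """gamma[t, i] = p(state i at time t | O, lambda) from precomputed alpha, beta."""
    # Numerator alpha_t(i) * beta_t(i); each row sum equals p(O | lambda) and normalizes it.
    unnormalized = alpha * beta
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)

# Individually most likely state at each time: state_posteriors(alpha, beta).argmax(axis=1)
```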


    The Viterbi Algorithm

$\delta_t(i)$: the best score along a single path, at time $t$, which accounts for the first $t$ observations and ends in state $i$:

$\delta_{t+1}(j) = \left[\max_i \delta_t(i)\, a_{ij}\right] b_j(o_{t+1})$

Keep track of the argument that maximizes the expression above in an array $\psi_t(j)$.

The Viterbi algorithm is similar in implementation to the forward calculation; the major difference is the maximization over previous states in place of the summation.


The Complete Procedure (for finding the best state sequence)

Initialization:

$\delta_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$

$\psi_1(i) = 0$

Recursion:

$\delta_t(j) = \max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right] b_j(o_t)$

$\psi_t(j) = \arg\max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right]$

for $2 \le t \le T$, $1 \le j \le N$.


The Complete Procedure (2) (for finding the best state sequence)

Termination:

$P^* = \max_{1 \le i \le N} \left[\delta_T(i)\right]$

$q_T^* = \arg\max_{1 \le i \le N} \left[\delta_T(i)\right]$

Path (state sequence) backtracking:

$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \dots, 1$
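The whole procedure (initialization, recursion, termination, backtracking) fits in one short sketch; delta and psi follow the definitions above, the function name viterbi is ours, and time indices are 0-based:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Return (best_state_path, best_path_score) for an observation sequence."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))            # best score ending in state j at time t
    psi = np.zeros((T, N), dtype=int)   # argmax bookkeeping for backtracking
    # Initialization: delta_1(i) = pi_i * b_i(o_1), psi_1(i) = 0
    delta[0] = pi * B[:, obs[0]]
    # Recursion: delta_t(j) = max_i[delta_{t-1}(i) * a_ij] * b_j(o_t)
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A      # scores[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Termination: P* = max_i delta_T(i), q_T* = argmax_i delta_T(i)
    path = np.zeros(T, dtype=int)
    path[T - 1] = delta[T - 1].argmax()
    # Backtracking: q_t* = psi_{t+1}(q_{t+1}*)
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[T - 1].max()
```

In practice the products are usually replaced by sums of log probabilities to avoid numerical underflow on long sequences.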


The Learning / Optimization Problem

How do we adjust the model parameters $\lambda$ to maximize $P(O \mid \lambda)$?

- Parameter estimation
- Baum-Welch algorithm (EM: Expectation-Maximization)
- Iterative procedure


    Parameter Estimation

The probability of being in state $i$ at time $t$ and state $j$ at time $t+1$:

$\xi_t(i, j) = P(q_t = i, q_{t+1} = j \mid O, \lambda) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}$
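Xi can be computed from the alpha and beta arrays returned by the forward and backward sketches above; the function name pair_posteriors is ours.

```python
import numpy as np

def pair_posteriors(obs, A, B, alpha, beta):
    """xi[t, i, j] = p(state i at time t, state j at time t+1 | O, lambda)."""
    N, T = A.shape[0], len(obs)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        # Numerator: alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        num = alpha[t][:, None] * A * B[:, obs[t + 1]][None, :] * beta[t + 1][None, :]
        # Denominator: the same quantity summed over all (i, j) pairs
        xi[t] = num / num.sum()
    return xi
```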


    Parameter Estimation (2)

$\gamma_t(i)$: the probability of being in state $i$ at time $t$, given the entire observation sequence and the model.

We can relate $\gamma$ and $\xi$ by summing over $j$:

$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$


    Parameter Estimation (4)

Update $\lambda = (A, B, \pi)$ using $\xi_t(i, j)$ and $\gamma_t(i)$:

$\bar{\pi}_i = \gamma_1(i)$: the expected frequency (number of times) in state $i$ at time $t = 1$.


    Parameter Estimation (5)

New transition probability:

$\bar{a}_{ij} = \frac{\text{expected number of transitions from state } i \text{ to state } j}{\text{expected number of transitions from state } i} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$


    Parameter Estimation (6)

New observation probability:

$\bar{b}_j(k) = \frac{\text{expected number of times in state } j \text{ observing symbol } v_k}{\text{expected number of times in state } j} = \frac{\sum_{t=1,\ o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$
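The three reestimation formulas translate directly into array operations. This sketch assumes gamma and xi as produced by the state_posteriors and pair_posteriors examples above (all names are ours) and 0-based symbol indices.

```python
import numpy as np

def reestimate(obs, gamma, xi, M):
    """One Baum-Welch update of (pi, A, B) from gamma[t, i] and xi[t, i, j]."""
    obs = np.asarray(obs)
    # pi_bar_i = gamma_1(i): expected frequency in state i at time t = 1
    pi_new = gamma[0]
    # a_bar_ij = sum_{t=1..T-1} xi_t(i, j) / sum_{t=1..T-1} gamma_t(i)
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    # b_bar_j(k) = sum_{t : o_t = v_k} gamma_t(j) / sum_t gamma_t(j)
    B_new = np.stack([gamma[obs == k].sum(axis=0) for k in range(M)], axis=1)
    B_new /= gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new
```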


    Parameter Estimation (7)

From $\lambda = (A, B, \pi)$, if we define a new model $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$ using the reestimation formulas above:

- The new model is more likely than the old model, in the sense that $P(O \mid \bar{\lambda}) \ge P(O \mid \lambda)$.
- The observation sequence is more likely to be produced by the new model.
- This has been proved by Baum and his colleagues.
- Iteratively using the new model in place of the old model and repeating the reestimation calculation converges to a maximum-likelihood (ML) estimate; a sketch of this iteration follows below.
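Putting the pieces together, the iterative procedure can be sketched as a loop over the helpers defined in the earlier examples (forward, backward, state_posteriors, pair_posteriors, reestimate; all of these names, and the stopping tolerance, are our own choices rather than anything specified in the slides):

```python
import numpy as np

def baum_welch(obs, pi, A, B, max_iter=100, tol=1e-6):
    """Repeat the reestimation until p(O | lambda) stops improving (a sketch)."""
    # Relies on forward, backward, state_posteriors, pair_posteriors and
    # reestimate as defined in the sketches above.
    prev_likelihood = -np.inf
    for _ in range(max_iter):
        alpha = forward(obs, pi, A, B)
        beta = backward(obs, A, B)
        gamma = state_posteriors(alpha, beta)
        xi = pair_posteriors(obs, A, B, alpha, beta)
        likelihood = alpha[-1].sum()          # p(O | current lambda)
        # Baum's result guarantees p(O | new lambda) >= p(O | old lambda).
        if likelihood - prev_likelihood < tol:
            break
        pi, A, B = reestimate(obs, gamma, xi, B.shape[1])
        prev_likelihood = likelihood
    return pi, A, B
```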


    Questions??