Copyright by Ahmad Hidayatno ST, MT. Teknik Elektro, Undip, Semarang.
Estimation (Part II)
ω^T = {ω(1), ω(2), ω(3), ..., ω(T)}: a sequence of T hidden states.
For example, we might have ω^6 = {ω1, ω4, ω2, ω2, ω1, ω4}.
The system can revisit a state at different steps, and not every state need be visited.
The transition probability from hidden state ω_i to hidden state ω_j is
P(ω_j(t + 1) | ω_i(t)) = a_ij
and the model is denoted θ = (a_ij, T).
For the example sequence above:
P(ω^T | θ) = a_14 · a_42 · a_22 · a_21 · a_14 · P(ω(1) = ω_1)
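As a sketch, the probability of a particular hidden-state sequence can be computed directly from the initial distribution and the transition matrix. The 4-state matrix and initial probabilities below are invented for illustration; only the sequence {ω1, ω4, ω2, ω2, ω1, ω4} comes from the text.

```python
import numpy as np

# Invented 4-state transition matrix: a_ij = P(omega_j at t+1 | omega_i at t).
A = np.array([[0.20, 0.30, 0.10, 0.40],
              [0.10, 0.40, 0.40, 0.10],
              [0.30, 0.20, 0.30, 0.20],
              [0.25, 0.25, 0.25, 0.25]])
pi = np.array([0.5, 0.2, 0.2, 0.1])  # invented P(omega(1) = omega_i)

def sequence_probability(states, A, pi):
    """P(omega^T | theta) = P(omega(1)) * product over t of a_{omega(t), omega(t+1)}."""
    p = pi[states[0]]
    for i, j in zip(states[:-1], states[1:]):
        p *= A[i, j]
    return p

# The text's example omega^6 = {omega1, omega4, omega2, omega2, omega1, omega4},
# written with 0-based indices:
print(sequence_probability([0, 3, 1, 1, 0, 3], A, pi))
```

With these invented numbers the product is a_14 · a_42 · a_22 · a_21 · a_14 · π_1 = 0.4 · 0.25 · 0.4 · 0.1 · 0.4 · 0.5 ≈ 0.0008.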
Example: speech recognition (the production of spoken words).
Production of the word "pattern", represented by its phonemes:
/p/ /a/ /tt/ /er/ /n/, followed by a silent state.
Transitions go from /p/ to /a/, /a/ to /tt/, /tt/ to /er/, /er/ to /n/, and from /n/ to the silent state.
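One common way to realize such a word model is a left-to-right transition structure, sketched below. The self-loops (a phoneme repeating across time steps) and the 0.6 / 0.4 probabilities are invented for illustration; only the phoneme chain itself comes from the example.

```python
import numpy as np

states = ["/p/", "/a/", "/tt/", "/er/", "/n/", "silent"]
n = len(states)

# Left-to-right structure: each phoneme either repeats (self-loop) or
# advances to the next state; the final silent state is absorbing.
# The 0.6 / 0.4 values are invented for illustration.
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i] = 0.6
    A[i, i + 1] = 0.4
A[n - 1, n - 1] = 1.0

for name, row in zip(states, A):
    print(name, row)
```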
The emission probabilities satisfy Σ_k b_jk = 1 for all j, where b_jk = P(v_k(t) | ω_j(t)) is the probability that hidden state ω_j emits visible symbol v_k.
The evaluation problem asks for the probability that the model produces a given sequence V^T of visible states. It is:

P(V^T) = Σ_{r=1..r_max} P(V^T | ω_r^T) P(ω_r^T)

where each r indexes a particular sequence ω_r^T = {ω(1), ω(2), ..., ω(T)} of T hidden states. The two factors decompose as:

P(ω_r^T) = Π_{t=1..T} P(ω(t) | ω(t-1))     (1)
P(V^T | ω_r^T) = Π_{t=1..T} P(v(t) | ω(t))     (2)

Using equations (1) and (2), we can write:

P(V^T) = Σ_{r=1..r_max} Π_{t=1..T} P(v(t) | ω(t)) P(ω(t) | ω(t-1))

Interpretation: the probability that we observe the particular sequence of T visible states V^T is equal to the sum, over all r_max possible sequences of hidden states, of the conditional probability that the system has made a particular transition, multiplied by the probability that it then emitted the visible symbol in our target sequence.
Example: let ω1, ω2, ω3 be the hidden states and v1, v2, v3 the visible states, and let V^3 = {v1, v2, v3} be the observed sequence of visible states.

First possibility:
P({v1, v2, v3}) = P(ω1) · P(v1 | ω1) · P(ω2 | ω1) · P(v2 | ω2) · P(ω3 | ω2) · P(v3 | ω3) + ...

Second possibility:
P({v1, v2, v3}) = P(ω2) · P(v1 | ω2) · P(ω3 | ω2) · P(v2 | ω3) · P(ω1 | ω3) · P(v3 | ω1) + ...

The sum runs over all possible hidden-state sequences: 3^3 = 27 terms!

Therefore:
P({v1, v2, v3}) = Σ_{27 possible sequences of hidden states} Π_{t=1..3} P(v(t) | ω(t)) P(ω(t) | ω(t-1))
[Figure: two of the 27 possible hidden-state paths. First possibility: ω1 (t = 1) → ω2 (t = 2) → ω3 (t = 3), emitting v1, v2, v3. Second possibility: ω2 (t = 1) → ω3 (t = 2) → ω1 (t = 3), emitting v1, v2, v3.]
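A minimal brute-force sketch of this evaluation, assuming an invented three-state model (the π, A, B values below are illustrative numbers, not from the text): it literally sums all 3^3 = 27 transition-times-emission products, exactly as the example describes.

```python
import itertools
import numpy as np

# Invented three-state model (illustration only).
pi = np.array([0.5, 0.3, 0.2])                # initial probabilities
A = np.array([[0.6, 0.3, 0.1],                # transitions a_ij
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
B = np.array([[0.7, 0.2, 0.1],                # emissions b_jk = P(v_k | omega_j)
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

def evaluate(visible, pi, A, B):
    """P(V^T): sum over all n**T hidden paths of transition * emission products."""
    n, T = len(pi), len(visible)
    total = 0.0
    for path in itertools.product(range(n), repeat=T):   # all hidden sequences
        p = pi[path[0]] * B[path[0], visible[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], visible[t]]
        total += p
    return total

print(evaluate([0, 1, 2], pi, A, B))  # V^3 = {v1, v2, v3}: a sum of 27 terms
```

Summing `evaluate` over every possible visible sequence of length 3 returns 1, a useful sanity check. In practice the Forward algorithm computes the same quantity in O(n²T) instead of O(n^T).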
Given a sequence of visible states V^T, the decoding problem is to find the most probable sequence of hidden states. This problem can be expressed mathematically as: find the single best sequence of hidden states such that

{ω(1), ω(2), ..., ω(T)} = arg max P(ω(1), ω(2), ..., ω(T), v(1), v(2), ..., v(T) | λ)

Note that the summation has disappeared, since we want only the single best case.

Where λ = [π, A, B] with:
π_i = P(ω(1) = ω_i)   (initial state probabilities)
A = {a_ij}, a_ij = P(ω(t+1) = ω_j | ω(t) = ω_i)   (transition probabilities)
B = {b_jk}, b_jk = P(v(t) = v_k | ω(t) = ω_j)   (emission probabilities)
In the preceding example, this computation corresponds to the selection of the best path amongst:
{ω1(t=1), ω2(t=2), ω3(t=3)}, {ω2(t=1), ω3(t=2), ω1(t=3)},
{ω3(t=1), ω1(t=2), ω2(t=3)}, {ω3(t=1), ω2(t=2), ω1(t=3)},
{ω2(t=1), ω1(t=2), ω3(t=3)}, ...
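A brute-force sketch of decoding, again assuming an invented three-state model (all numbers below are illustrative): enumerate every hidden path and keep the one with the largest joint probability. The Viterbi algorithm finds the same path in O(n²T); exhaustive enumeration is only viable for tiny n and T.

```python
import itertools
import numpy as np

# Invented three-state model (illustration only).
pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

def decode(visible, pi, A, B):
    """Return the hidden path maximizing the joint P(path, visible | lambda)."""
    n, T = len(pi), len(visible)
    best_p, best_path = -1.0, None
    for path in itertools.product(range(n), repeat=T):   # all n**T candidates
        p = pi[path[0]] * B[path[0], visible[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], visible[t]]
        if p > best_p:
            best_p, best_path = p, path
    return best_path, best_p

path, p = decode([0, 1, 2], pi, A, B)
print(path, p)
```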
This third problem, the learning problem, consists of determining a method to adjust the model parameters λ = [π, A, B] to satisfy a certain optimization criterion: we need to find the best model, i.e. the λ that maximizes the probability of the observation sequence,

max_λ P(V^T | λ)

We use an iterative procedure such as Baum-Welch or gradient ascent to find this local optimum.
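A minimal sketch of one Baum-Welch re-estimation step for a discrete HMM, assuming a tiny invented two-state, three-symbol model (all numbers below are illustrative). The re-estimation uses the standard forward-backward quantities, and the EM property guarantees that P(V^T | λ) does not decrease after the update.

```python
import numpy as np

def forward(V, pi, A, B):
    """alpha[t, j] = P(v(1..t), omega(t) = j)."""
    T, n = len(V), len(pi)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, V[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, V[t]]
    return alpha

def backward(V, A, B):
    """beta[t, i] = P(v(t+1..T) | omega(t) = i)."""
    T, n = len(V), A.shape[0]
    beta = np.ones((T, n))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, V[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(V, pi, A, B):
    """One EM re-estimation of (pi, A, B) from a single observation sequence."""
    T, n = len(V), len(pi)
    alpha, beta = forward(V, pi, A, B), backward(V, A, B)
    likelihood = alpha[-1].sum()
    gamma = alpha * beta / likelihood            # P(omega(t) = j | V)
    xi = np.zeros((T - 1, n, n))                 # P(i at t and j at t+1 | V)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, V[t + 1]] * beta[t + 1])[None, :] / likelihood
    V = np.asarray(V)
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[V == k].sum(axis=0) for k in range(B.shape[1])], axis=1)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B

# Invented two-state, three-symbol model and observation sequence.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
V = [0, 1, 2, 2, 1, 0]

before = forward(V, pi, A, B)[-1].sum()
pi, A, B = baum_welch_step(V, pi, A, B)
after = forward(V, pi, A, B)[-1].sum()
print(before, "->", after)   # the likelihood never decreases
```

Repeating the step until the likelihood stops improving yields the local optimum mentioned above; a gradient-based update would use the same forward-backward quantities.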