CS 440 / ECE 448: Introduction to Artificial Intelligence, Spring 2010, Lecture #23
Instructor: Eyal Amir
Grad TAs: Wen Pu, Yonatan Bisk
Undergrad TAs: Sam Johnson, Nikhil Johri


Page 1:

Instructor: Eyal Amir

Grad TAs: Wen Pu, Yonatan Bisk
Undergrad TAs: Sam Johnson, Nikhil Johri

CS 440 / ECE 448: Introduction to Artificial Intelligence
Spring 2010, Lecture #23

Page 2:

Today & Thursday

• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning

Page 3:

Time and Uncertainty

• Standard Bayes net model:
  – Static situation
  – Fixed (finite) set of random variables
  – Graphical structure and conditional independence

• In many systems, data arrives sequentially
• Dynamic Bayes nets (DBNs) and HMMs model:
  – Processes that evolve over time

Page 4:

Example (Robot Position)

[Figure: DBN over three time slices with position nodes Pos1, Pos2, Pos3, velocity nodes Vel1, Vel2, Vel3, and sensor nodes Sensor1, Sensor2, Sensor3.]

Page 5:

Robot Position(With Observations)

[Figure: the same DBN with observation nodes: Sens.A1, Sens.A2, Sens.A3 observe Pos1, Pos2, Pos3, and Sens.B1, Sens.B2, Sens.B3 observe Vel1, Vel2, Vel3.]

Page 6:

Inference Problem

• State of the system at time t:

  X_t = (Pos_t, Vel_t, Sens.A_t, Sens.B_t)

• Probability distribution over state sequences:

  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_1, ..., X_{t-1})

• A lot of parameters

Page 7:

Solution (Part 1)

• Problem: specifying P(X_t | X_1, ..., X_{t-1}) requires more and more parameters as t grows

• Solution: Markov Assumption
  – Assume X_t is independent of X_1, ..., X_{t-2} given X_{t-1}
  – State variables are expressive enough to summarize all relevant information about the past

• Therefore:

  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_{t-1})
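To make the factorization concrete, here is a minimal sketch that evaluates P(x_1, ..., x_t) under the first-order Markov assumption; the two-state `prior` and `transition` tables are hypothetical example numbers, not from the lecture:

```python
# Minimal sketch: joint probability of a state sequence under the first-order
# Markov assumption P(x_1, ..., x_t) = P(x_1) * prod_k P(x_k | x_{k-1}).
# `prior` and `transition` are hypothetical example parameters.

prior = {"rain": 0.5, "sun": 0.5}                      # P(X_1)
transition = {                                          # P(X_k | X_{k-1})
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun":  {"rain": 0.3, "sun": 0.7},
}

def sequence_probability(states):
    """P(x_1, ..., x_t) as a product of the prior and the transition factors."""
    p = prior[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p

print(sequence_probability(["rain", "rain", "sun"]))   # 0.5 * 0.7 * 0.3 = 0.105
```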

Page 8:

Solution (Part 2)

• Problem:
  – If every transition model P(X_t | X_{t-1}) is different, there are still too many parameters to specify

• Solution:
  – Assume all P(X_t | X_{t-1}) are the same
  – The process is time-invariant, or stationary

Page 9:

Inference in Robot Position DBN

• Compute the distribution over true position and velocity
  – Given a sequence of sensor values

• Belief state: P(X_t | O_{1:t})
  – Probability distribution over the different states at each time step

• Update the belief state when a new set of sensor readings arrives, using the one-step model P(X_t | X_{t-1}, O_t)

Page 10:

Example

• First-order Markov assumption not exactly true in the real world

Page 11:

Example

• Possible fixes:
  – Increase the order of the Markov process
  – Augment the state, e.g., add Temp, Pressure, or Battery to Position and Velocity

Page 12:

Today

• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Page 13:

Inference Tasks

• Filtering: P(X_t | e_{1:t})
  – Belief state: probability of the state given the evidence so far

• Prediction: P(X_{t+k} | e_{1:t}), k > 0
  – Like filtering, without the new evidence

• Smoothing: P(X_{t-k} | e_{1:t}), k > 0
  – Better estimate of past states

• Most likely explanation: argmax_{x_{1:t}} P(x_{1:t} | e_{1:t})
  – The scenario that best explains the evidence

Page 14:

Filtering (forward algorithm)

Predict:

  P(X_{t+1} | e_{1:t}) = Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})

Update:

  P(X_{t+1} | e_{1:t+1}) = P(X_{t+1} | e_{1:t}, e_{t+1}) ∝ P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

Recursive step.

[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence nodes E_{t-1}, E_t, E_{t+1}.]
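Here is a minimal Python sketch of one predict/update filtering step for a discrete state space; the `belief`, `transition`, and `emission` tables are hypothetical illustrations, not from the lecture:

```python
# Forward (filtering) step for a discrete-state model:
#   predict: P(X_{t+1} | e_{1:t})   = sum_x P(X_{t+1} | x) P(x | e_{1:t})
#   update:  P(X_{t+1} | e_{1:t+1}) ∝ P(e_{t+1} | X_{t+1}) * predicted
# `transition[x][x2]` = P(X_{t+1}=x2 | X_t=x); `emission[x][e]` = P(e | X=x).
# Both tables are hypothetical illustrations, not the lecture's parameters.

def forward_step(belief, evidence, transition, emission):
    states = belief.keys()
    # Predict: push the current belief through the transition model.
    predicted = {
        x2: sum(belief[x] * transition[x][x2] for x in states)
        for x2 in states
    }
    # Update: weight by the likelihood of the new evidence, then normalize.
    unnorm = {x2: emission[x2][evidence] * predicted[x2] for x2 in states}
    z = sum(unnorm.values())
    return {x2: p / z for x2, p in unnorm.items()}
```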

Page 15:

Example

P(R_1 | u_1) ∝ P(u_1 | R_1) Σ_{r_0} P(R_1 | r_0) P(r_0)

(R_1 is the hidden state at time 1, u_1 the first observation.)
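Assuming this is the standard rain/umbrella example with parameters P(r_0) = 0.5, P(R_1 = true | r_0 = true) = 0.7, P(R_1 = true | r_0 = false) = 0.3, P(u_1 | R_1 = true) = 0.9, P(u_1 | R_1 = false) = 0.2 (an assumption; these numbers are not on the slide), the update works out as:

  Predict:   P(R_1) = 0.7 · 0.5 + 0.3 · 0.5 = 0.5
  Update:    P(R_1 | u_1) ∝ 0.9 · 0.5 = 0.45,   P(¬R_1 | u_1) ∝ 0.2 · 0.5 = 0.10
  Normalize: P(R_1 | u_1) = 0.45 / (0.45 + 0.10) ≈ 0.818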

Page 16:

Smoothing

P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t}) ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)

Forward-backward algorithm.

Page 17:

Smoothing: Backward Step

  P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
                     = Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
                     = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)

Combined with the forward message as before:

  P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t}) ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
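A minimal sketch of the backward recursion and its combination with the forward message, again with hypothetical discrete-state `transition`/`emission` tables (the same convention as in the filtering sketch above):

```python
# Backward message: beta_k(x) = P(e_{k+1:t} | X_k = x), computed by the recursion
#   beta_k(x) = sum_{x'} P(e_{k+1} | x') * beta_{k+1}(x') * P(x' | x).
# Smoothing then combines it with the forward message:
#   P(X_k | e_{1:t}) ∝ alpha_k(x) * beta_k(x).

def backward_messages(evidence, states, transition, emission):
    betas = [None] * len(evidence)
    betas[-1] = {x: 1.0 for x in states}          # base case: beta_t(x) = 1
    for k in range(len(evidence) - 2, -1, -1):
        e_next = evidence[k + 1]
        betas[k] = {
            x: sum(emission[x2][e_next] * betas[k + 1][x2] * transition[x][x2]
                   for x2 in states)
            for x in states
        }
    return betas

def smooth(alpha_k, beta_k):
    """Normalized smoothed distribution for one time step."""
    unnorm = {x: alpha_k[x] * beta_k[x] for x in alpha_k}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}
```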

Page 18:

Most Likely Explanation
• Finding the most likely path

[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence E_{t-1}, E_t, E_{t+1}, highlighting the most likely path to x_t plus one more update step.]

Page 19:

Most Likely Explanation
• Finding the most likely path

[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence E_{t-1}, E_t, E_{t+1}.]

  max_{x_1..x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
    ∝ P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) max_{x_1..x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) ]

This recursion is called the Viterbi algorithm.
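A minimal Viterbi sketch for a discrete HMM, reusing the hypothetical `prior`/`transition`/`emission` table convention from the earlier sketches (an illustration, not the lecture's own code):

```python
# Viterbi: for each time step keep, per state, the probability of the best path
# ending in that state, plus a back-pointer; then trace the best path backwards.

def viterbi(evidence, states, prior, transition, emission):
    # delta[x] = max over partial paths of P(x_1..x_{k-1}, X_k = x, e_{1:k})
    delta = {x: prior[x] * emission[x][evidence[0]] for x in states}
    backpointers = []
    for e in evidence[1:]:
        bp, new_delta = {}, {}
        for x2 in states:
            best_prev = max(states, key=lambda x: delta[x] * transition[x][x2])
            bp[x2] = best_prev
            new_delta[x2] = delta[best_prev] * transition[best_prev][x2] * emission[x2][e]
        backpointers.append(bp)
        delta = new_delta
    # Trace back from the best final state.
    last = max(states, key=lambda x: delta[x])
    path = [last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return list(reversed(path))
```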

Page 20:

Viterbi (Example)

Page 21:

Viterbi (Example)

Page 22:

Viterbi (Example)

Page 23:

Viterbi (Example)

Page 24:

Viterbi (Example)

Page 25:

Today

• Time and uncertainty
• Inference: filtering, prediction, smoothing, MLE
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Page 26:

Hidden Markov model (HMM)

[Figure: HMM with hidden chain X1 → X2 → X3 (the "true" state, e.g., phones/words) and noisy observations Y1, Y2, Y3 (e.g., the acoustic signal).]

Transition matrix: P(X_{t+1} = j | X_t = i) = A(i, j)
Observation likelihoods: P(y_t | X_t = i) = B_i (often collected into a diagonal matrix)
A sparse transition matrix gives a sparse graph.

Page 27:

Forwards algorithm for HMMs

Predict:
Update:

(The predict and update steps are the same recursions as on Page 14, now written for an HMM with transition matrix A and observation likelihoods B.)
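A hedged matrix-form sketch of one forward step, assuming the A and B definitions from the HMM slide above; the column-vector convention and the example numbers are assumptions of mine, not the lecture's:

```python
import numpy as np

# One forward (predict + update) step in matrix form.
# A[i, j] = P(X_{t+1} = j | X_t = i); b_next[i] = P(y_{t+1} | X_{t+1} = i).
# alpha[i] = P(X_t = i | y_{1:t}) is the current (normalized) belief.

def forward_step(alpha, A, b_next):
    predicted = A.T @ alpha          # predict: sum_i P(X_{t+1}=j | X_t=i) alpha[i]
    unnorm = b_next * predicted      # update: weight by the observation likelihoods
    return unnorm / unnorm.sum()     # normalize

# Tiny hypothetical example with two states.
A = np.array([[0.7, 0.3],
              [0.3, 0.7]])
alpha = np.array([0.5, 0.5])
b_next = np.array([0.9, 0.2])
print(forward_step(alpha, A, b_next))  # -> [0.8181..., 0.1818...]
```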

Page 28:

Message passing view of forwards algorithm

[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with observations Y_{t-1}, Y_t, Y_{t+1}; the forward message a_{t|t-1} flows along the chain and is combined with the local evidence messages b_t, b_{t+1}.]

Page 29:

Forwards-backwards algorithm

[Figure: the same chain for the forwards-backwards algorithm, combining the forward message a_{t|t-1} with backward messages b_t at each node.]

Page 30:

If Have Time…

• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Page 31:

Dynamic Bayesian Network
• A DBN is like a two-time-slice BN (2-TBN)
  – Using the first-order Markov assumption

[Figure: a standard BN over the Time 0 variables defines P(X_0); a second standard BN spanning Time 0 → Time 1 defines the transition model P(X_t | X_{t-1}).]

Page 32:

Dynamic Bayesian Network

• Basic idea:
  – Copy the state and evidence variables for each time step
  – X_t: set of unobservable (hidden) variables (e.g., Pos, Vel)
  – E_t: set of observable (evidence) variables (e.g., Sens.A, Sens.B)

• Notice: time is discrete

Page 33:

Example

Page 34:

Inference in DBN

Unroll:

Inference in the unrolled BN: not efficient (the cost depends on the sequence length).
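As a rough illustration of what unrolling means, here is a toy sketch that copies the variables of a two-slice network once per time step; the dict structure and the variable names Pos/Vel are illustrative assumptions, not the lecture's notation:

```python
# A toy 2-TBN: a prior network over the time-0 variables and a transition
# network whose parents may live in the previous slice (suffix "_prev").
two_tbn = {
    "prior":      {"Pos": ["Vel"], "Vel": []},            # parents within slice 0
    "transition": {"Pos": ["Pos_prev", "Vel_prev"],       # parents for slices t > 0
                   "Vel": ["Vel_prev"]},
}

def unroll(two_tbn, T):
    """Edge list of the static BN obtained by copying each variable T times."""
    edges = []
    for var, parents in two_tbn["prior"].items():
        edges += [(f"{p}_0", f"{var}_0") for p in parents]
    for t in range(1, T):
        for var, parents in two_tbn["transition"].items():
            for p in parents:
                if p.endswith("_prev"):
                    edges.append((f"{p[:-5]}_{t-1}", f"{var}_{t}"))
                else:
                    edges.append((f"{p}_{t}", f"{var}_{t}"))
    return edges

print(unroll(two_tbn, 3))
# The unrolled network grows linearly with T, which is why naive inference in it
# becomes inefficient for long sequences.
```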

Page 35:

DBN Representation: DelC

[Figure: two-slice DBN for the DelC domain, with nodes T, L, CR, RHC, RHM, M at time t and their copies at time t+1, and factors fT(Tt, Tt+1), fCR(Lt, CRt, RHCt, CRt+1), fRHM(RHMt, RHMt+1).]

fCR(Lt, CRt, RHCt, CRt+1):

  L  CR  RHC | CR(t+1)=T  CR(t+1)=F
  O  T   T   |   0.2        0.8
  E  T   T   |   1.0        0.0
  O  F   T   |   0.0        1.0
  E  F   T   |   0.0        1.0
  O  T   F   |   1.0        0.1
  E  T   F   |   1.0        0.0
  O  F   F   |   0.0        1.0
  E  F   F   |   0.0        1.0

fT(Tt, Tt+1):

  T | T(t+1)=T  T(t+1)=F
  T |   0.91      0.09
  F |   0.0       1.0

fRHM(RHMt, RHMt+1):

  RHM | RHM(t+1)=T  RHM(t+1)=F
  T   |   1.0         0.0
  F   |   0.0         1.0
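As a small illustration of how such a factored CPT might be stored, here is a hypothetical Python encoding of the first few rows of fCR above (the representation is mine; only the numbers come from the table):

```python
# f_CR(L_t, CR_t, RHC_t, CR_{t+1}): P(CR_{t+1} = True | L_t, CR_t, RHC_t),
# keyed by the parent assignment (L, CR, RHC). Values copied from the table
# above (first four rows shown, for illustration only).
f_CR = {
    ("O", True, True):  0.2,
    ("E", True, True):  1.0,
    ("O", False, True): 0.0,
    ("E", False, True): 0.0,
}

def p_cr_next(l, cr, rhc, cr_next):
    """P(CR_{t+1} = cr_next | L_t = l, CR_t = cr, RHC_t = rhc)."""
    p_true = f_CR[(l, cr, rhc)]
    return p_true if cr_next else 1.0 - p_true
```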

Page 36:

Benefits of DBN Representation

  Pr(Rm_{t+1}, M_{t+1}, T_{t+1}, L_{t+1}, C_{t+1}, Rc_{t+1} | Rm_t, M_t, T_t, L_t, C_t, Rc_t)
    = fRm(Rm_t, Rm_{t+1}) · fM(M_t, M_{t+1}) · fT(T_t, T_{t+1}) · fL(L_t, L_{t+1}) · fCr(L_t, Cr_t, Rc_t, Cr_{t+1}) · fRc(Rc_t, Rc_{t+1})

- Only a few parameters, vs. 25,440 for the explicit transition matrix (160 joint states × 159 free parameters per row)
- Removes the global exponential dependence

Explicit transition matrix over the 160 joint states:

        s1    s2    ...  s160
  s1    0.9   0.05  ...  0.0
  s2    0.0   0.20  ...  0.1
  ...
  s160  0.1   0.0   ...  0.0

[Figure: the same two-slice DelC DBN as on Page 35 (nodes T, L, CR, RHC, RHM, M at times t and t+1), shown alongside the explicit matrix for comparison.]