# Expectation Propagation in Dynamical Systems

• Expectation Propagation in Dynamical Systems

Marc Peter Deisenroth

Joint Work with Shakir Mohamed (UBC)

August 10, 2012

Marc Deisenroth (TU Darmstadt) EP in Dynamical Systems 1

• Motivation

Figure: Complex time series: motion capture, GDP, climate

Time series in economics, robotics, motion capture, etc. have unknown dynamical structure and are high-dimensional and noisy.

Flexible and accurate models: nonlinear (Gaussian process) dynamical systems (GPDS).

Accurate inference in (GP)DS is important for better knowledge about latent structures and for parameter learning.

• Outline

1. Inference in Time Series Models
   - Filtering and Smoothing
   - Expectation Propagation
   - Approximating the Partition Function
   - Relation to Smoothing

2. EP in Gaussian Process Dynamical Systems
   - Gaussian Processes
   - Filtering/Smoothing in GPDS
   - Expectation Propagation in GPDS

3. Results

• Inference in Time Series Models Filtering and Smoothing

Time Series Models

Figure: Graphical model with latent states x_{t−1}, x_t, x_{t+1} and observations z_{t−1}, z_t, z_{t+1}.

x_t = f(x_{t−1}) + w, w ∼ N(0, Q)

z_t = g(x_t) + v, v ∼ N(0, R)

Latent state x ∈ R^D, measurement/observation z ∈ R^E, transition function f, measurement function g.
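The model above can be simulated directly. A minimal sketch; the specific f, g, noise variances Q, R, and the one-dimensional state are illustrative assumptions, not from the slides:

```python
import numpy as np

# Simulate a 1D nonlinear state-space model
#   x_t = f(x_{t-1}) + w,  w ~ N(0, Q)
#   z_t = g(x_t) + v,      v ~ N(0, R)
# f, g, Q, R are assumed here purely for illustration.

rng = np.random.default_rng(0)

def f(x):                 # transition function (assumed)
    return np.sin(x)

def g(x):                 # measurement function (assumed)
    return 0.5 * x**2

Q, R, T = 0.1, 0.05, 50
x = np.zeros(T)
z = np.zeros(T)
x[0] = rng.normal(0.0, 1.0)                      # initial state
z[0] = g(x[0]) + rng.normal(0.0, np.sqrt(R))
for t in range(1, T):
    x[t] = f(x[t - 1]) + rng.normal(0.0, np.sqrt(Q))
    z[t] = g(x[t]) + rng.normal(0.0, np.sqrt(R))
```

The latent trajectory `x` is what filtering and smoothing try to recover from the observations `z` alone.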

• Inference in Time Series Models Filtering and Smoothing

Inference in Time Series Models

Objective: posterior distribution over the latent variables x_t.

Filtering (forward inference): compute p(x_t | z_{1:t}) for t = 1, …, T.

Smoothing (forward-backward inference): compute p(x_t | z_{1:t}) for t = 1, …, T (forward sweep), then p(x_t | z_{1:T}) for t = T, …, 1 (backward sweep).

Examples:

Linear systems: Kalman filter/smoother (Kalman, 1960)

Nonlinear systems: approximate inference, e.g.
- Extended Kalman filter/smoother (Kalman, 1959–1961)
- Unscented Kalman filter/smoother (Julier & Uhlmann, 1997)
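For the linear special case, the Kalman filter is a few lines of code. A sketch for scalar states, where the model matrices reduce to scalars A, C; all numeric values in the example are assumed for illustration:

```python
import numpy as np

# Kalman filter for the linear-Gaussian special case
#   x_t = A x_{t-1} + w,  w ~ N(0, Q)
#   z_t = C x_t + v,      v ~ N(0, R)

def kalman_filter(z, A, C, Q, R, mu0, S0):
    """Return filtered means/variances of p(x_t | z_{1:t}) for scalar states."""
    mus, Ss = [], []
    mu, S = mu0, S0
    for zt in z:
        # predict: p(x_t | z_{1:t-1})
        mu_p = A * mu
        S_p = A * S * A + Q
        # update with measurement z_t
        K = S_p * C / (C * S_p * C + R)      # Kalman gain
        mu = mu_p + K * (zt - C * mu_p)
        S = (1 - K * C) * S_p
        mus.append(mu)
        Ss.append(S)
    return np.array(mus), np.array(Ss)

# Illustrative run (data and parameters assumed)
mus, Ss = kalman_filter(np.array([1.0, 1.2, 0.9]),
                        A=1.0, C=1.0, Q=0.1, R=0.5, mu0=0.0, S0=1.0)
```

Each iteration alternates a prediction through the dynamics with a Bayesian update on the new measurement; the posterior variance shrinks as evidence accumulates.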

• Inference in Time Series Models Filtering and Smoothing

Machine Learning Perspective

Treat filtering/smoothing as an inference problem in graphical models with hidden variables

Allows for efficient local message passing (distributed computation)

Messages are unnormalized probability distributions

Iterative refinement of the posterior marginals p(x_t), t = 1, …, T: multiple forward-backward sweeps until global consistency (convergence).

Here: Expectation Propagation (Minka 2001)

• Inference in Time Series Models Expectation Propagation

Expectation Propagation

Figure: Dynamical system as a factor graph, with transition factors p(x_{t+1} | x_t) and measurement factors p(z_t | x_t).

Inference in factor graphs

p(x_t) = ∏_{i=1}^n t_i(x_t)

q(x_t) = ∏_{i=1}^n t̃_i(x_t)

Approximate factors t̃i are members of the Exponential Family (e.g., Multinomial, Gamma, Gaussian)

Find a good approximation such that q ≈ p

• Inference in Time Series Models Expectation Propagation

Expectation Propagation

Figure: Moment matching vs. mode matching (borrowed from Bishop, 2006).

EP locally minimizes KL(p||q), where p is the true distribution and q is an approximation to it from the Exponential Family.

EP = moment matching, unlike variational Bayes ("mode matching"), which minimizes KL(q||p).

EP exploits properties of the Exponential Family: moments of distributions can be computed via derivatives of the log-partition function.
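Moment matching itself is elementary: the KL(p||q)-optimal Gaussian q simply matches the mean and variance of p. A sketch with an illustrative two-component Gaussian mixture playing the role of p (all weights and component moments are assumed values):

```python
import numpy as np

# Moment matching: project a mixture p onto a single Gaussian q
# by matching its first two moments. The mixture is illustrative.

w = np.array([0.3, 0.7])      # mixture weights (assumed)
mu = np.array([-2.0, 1.0])    # component means (assumed)
var = np.array([0.5, 1.5])    # component variances (assumed)

# Moments of the mixture (law of total expectation / total variance)
m = np.sum(w * mu)
v = np.sum(w * (var + mu**2)) - m**2

# q = N(m, v) is the KL(p||q)-optimal Gaussian approximation to p
```

Unlike mode matching, the resulting q covers all mixture components rather than locking onto a single mode.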

• Inference in Time Series Models Expectation Propagation

Expectation Propagation

Figure: Factor graph (left) and fully factored factor graph (right), with measurement messages q_M, forward messages q_B, and backward messages q_C.

Write down the (fully factored) factor graph

p(x_t) = ∏_{i=1}^n t_i(x_t)

q(x_t) = ∏_{i=1}^n t̃_i(x_t)

Find approximate factors t̃_i such that KL(p||q) is minimized. Multiple sweeps through the graph are made until global consistency of the messages is assured.

• Inference in Time Series Models Expectation Propagation

Messages in a Dynamical System

Approximate (factored) marginal: q(x_t) = ∏_i t̃_i(x_t)

Here, our messages t̃_i have names:
- Measurement message q_M
- Forward message q_B
- Backward message q_C

Define the cavity distribution: q^{\i}(x_t) = q(x_t) / t̃_i(x_t) = ∏_{k≠i} t̃_k(x_t)
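For Gaussian messages, the division that defines the cavity is a subtraction of natural parameters. A scalar sketch; the numbers in the example are illustrative:

```python
# Gaussian division, used to form the EP cavity distribution
#   q^{\i}(x) = q(x) / t~_i(x).
# For scalar Gaussians this subtracts natural parameters
# (precision, and precision times mean). Values are illustrative.

def gaussian_divide(mu_q, var_q, mu_t, var_t):
    """Divide N(mu_q, var_q) by N(mu_t, var_t); return (mean, variance)."""
    prec = 1.0 / var_q - 1.0 / var_t       # cavity precision
    eta = mu_q / var_q - mu_t / var_t      # cavity precision * mean
    var = 1.0 / prec
    return eta * var, var

# Example: removing a factor broadens the remaining distribution
mu_c, var_c = gaussian_divide(0.5, 1.0, 1.0, 4.0)
```

Note that the result is only a valid Gaussian when the cavity precision stays positive; robust EP implementations guard against this.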

• Inference in Time Series Models Expectation Propagation

Gaussian EP in More Detail

1. Write down the factor graph.
2. Initialize all messages t̃_i, i ∈ {M, B, C}.
3. Until convergence: for all latent variables x_t and corresponding messages t̃_i(x_t) do
   1. Compute the cavity distribution q^{\i}(x_t) = N(x_t | µ_t^{\i}, Σ_t^{\i}) by Gaussian division.
   2. Compute the moments of t_i(x_t) q^{\i}(x_t) → updated moments of q(x_t).
   3. Compute the updated message t̃_i(x_t) = q(x_t) / q^{\i}(x_t).
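Steps 3.1–3.3 above can be sketched for a scalar state. Here the true factor t_i is taken to be a Gaussian likelihood N(z | x, var_lik) so that the tilted moments in step 3.2 are available exactly; this choice, and all numeric values in the example, are illustrative assumptions:

```python
# One Gaussian EP message update (steps 3.1-3.3) for a scalar state.
# t_i(x) = N(z | x, var_lik) is an assumed Gaussian factor, so the
# tilted-distribution moments are exact.

def ep_update(mu_q, var_q, mu_t, var_t, z, var_lik):
    # 3.1 cavity q^{\i} = q / t~_i  (natural-parameter subtraction)
    prec_c = 1.0 / var_q - 1.0 / var_t
    eta_c = mu_q / var_q - mu_t / var_t
    var_c = 1.0 / prec_c
    mu_c = eta_c * var_c
    # 3.2 moments of t_i(x) q^{\i}(x) -> updated moments of q
    var_new = 1.0 / (1.0 / var_c + 1.0 / var_lik)
    mu_new = var_new * (mu_c / var_c + z / var_lik)
    # 3.3 updated message t~_i = q_new / q^{\i}
    prec_m = 1.0 / var_new - 1.0 / var_c
    eta_m = mu_new / var_new - mu_c / var_c
    return (mu_new, var_new), (eta_m / prec_m, 1.0 / prec_m)

# Illustrative run: weak old message, observation z = 2
posterior, message = ep_update(0.0, 1.0, 0.0, 10.0, z=2.0, var_lik=1.0)
```

Because the factor here is itself Gaussian, the updated message recovers it exactly; the interesting (approximate) case is when t_i is non-Gaussian and step 3.2 requires moment matching.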
