146
Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June 2012 Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 1 / 69

Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Multioutput, Multitask and Mechanistic

Neil D. Lawrence and Raquel Urtasun

CVPR16th June 2012

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 1 / 69

Page 2: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Outline

1 Kalman Filter

2 Convolution Processes

3 Motion Capture Example

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 2 / 69

Page 3: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Outline

1 Kalman Filter

2 Convolution Processes

3 Motion Capture Example

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 3 / 69

Page 4: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Simple Markov Chain

Assume 1-d latent state, a vector over time, x = [x1 . . . xT ].

Markov property,

xi =xi−1 + εi ,

εi ∼N (0, α)

=⇒ xi ∼N (xi−1, α)

Initial state,x0 ∼ N (0, α0)

If x0 ∼ N (0, α) we have a Markov chain for the latent states.

Markov chain it is specified by an initial distribution (Gaussian) and atransition distribution (Gaussian).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 4 / 69

Page 5: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x0 = 0.000, ε1 = −2.24

x1 = 0.000− 2.24 = −2.24

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 6: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x1 = −2.24, ε2 = 0.457

x2 = −2.24 + 0.457 = −1.78

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 7: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x2 = −1.78, ε3 = 0.178

x3 = −1.78 + 0.178 = −1.6

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 8: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x3 = −1.6, ε4 = −0.292

x4 = −1.6− 0.292 = −1.89

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 9: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x4 = −1.89, ε5 = −0.501

x5 = −1.89− 0.501 = −2.39

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 10: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x5 = −2.39, ε6 = 1.32

x6 = −2.39 + 1.32 = −1.08

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 11: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x6 = −1.08, ε7 = 0.989

x7 = −1.08 + 0.989 = −0.0881

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 12: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x7 = −0.0881, ε8 = −0.842

x8 = −0.0881− 0.842 = −0.93

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 13: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gauss Markov Chain

-4

-2

0

2

4

0 1 2 3 4 5 6 7 8 9

x

t

x0 = 0, εi ∼ N (0, 1)

x8 = −0.93, ε9 = −0.41

x9 = −0.93− 0.410 = −1.34

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 5 / 69

Page 14: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Multivariate Gaussian Properties: Reminder

Ifz ∼ N (µ,C)

andx = Wz + b

thenx ∼ N

(Wµ + b,WCW>

)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 6 / 69

Page 15: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Multivariate Gaussian Properties: Reminder

Simplified: Ifz ∼ N

(0, σ2I

)and

x = Wz

thenx ∼ N

(0, σ2WW>

)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 7 / 69

Page 16: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x1

x2

x3

x4

x5

ε1

ε2

ε3

ε4

ε5

1 0 0 0 0

1 1 0 0 0

1 1 1 0 0

1 1 1 1 0

1 1 1 1 1

×=

x1 = ε1

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 17: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x1

x2

x3

x4

x5

ε1

ε2

ε3

ε4

ε5

1 0 0 0 0

1 1 0 0 0

1 1 1 0 0

1 1 1 1 0

1 1 1 1 1

×=

x2 = ε1 + ε2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 18: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x1

x2

x3

x4

x5

ε1

ε2

ε3

ε4

ε5

1 0 0 0 0

1 1 0 0 0

1 1 1 0 0

1 1 1 1 0

1 1 1 1 1

×=

x3 = ε1 + ε2 + ε3

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 19: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x1

x2

x3

x4

x5

ε1

ε2

ε3

ε4

ε5

1 0 0 0 0

1 1 0 0 0

1 1 1 0 0

1 1 1 1 0

1 1 1 1 1

×=

x4 = ε1 + ε2 + ε3 + ε4

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 20: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x1

x2

x3

x4

x5

ε1

ε2

ε3

ε4

ε5

1 0 0 0 0

1 1 0 0 0

1 1 1 0 0

1 1 1 1 0

1 1 1 1 1

×=

x5 = ε1 + ε2 + ε3 + ε4 + ε5

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 21: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Matrix Representation of Latent Variables

x εL1 ×=

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 8 / 69

Page 22: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Multivariate Process

Since x is linearly related to ε we know x is a Gaussian process.

Trick: we only need to compute the mean and covariance of x todetermine that Gaussian.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 9 / 69

Page 23: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

x = L1ε

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 24: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

〈x〉 = 〈L1ε〉

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 25: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

〈x〉 = L1 〈ε〉

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 26: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

〈x〉 = L1 〈ε〉

ε ∼ N (0, αI)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 27: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

〈x〉 = L10

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 28: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Mean

〈x〉 = 0

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 10 / 69

Page 29: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Covariance

xx> = L1εε>L>1

x> = ε>L>

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 11 / 69

Page 30: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Covariance

⟨xx>⟩=⟨L1εε

>L>1⟩

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 11 / 69

Page 31: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Covariance

⟨xx>⟩= L1

⟨εε>⟩

L>1

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 11 / 69

Page 32: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Covariance

⟨xx>⟩= L1

⟨εε>⟩

L>1

ε ∼ N (0, αI)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 11 / 69

Page 33: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process Covariance

⟨xx>⟩= αL1L>1

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 11 / 69

Page 34: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process

x = L1ε

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 12 / 69

Page 35: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process

x = L1ε

ε ∼ N (0, αI)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 12 / 69

Page 36: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process

x = L1ε

ε ∼ N (0, αI)

=⇒

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 12 / 69

Page 37: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Latent Process

x = L1ε

ε ∼ N (0, αI)

=⇒x ∼ N

(0, αL1L>1

)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 12 / 69

Page 38: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for Latent Process II

Givenε ∼ N (0, αI) =⇒ ε ∼ N

(0, αL1L>1

).

Thenε ∼ N (0,∆tαI) =⇒ ε ∼ N

(0,∆tαL1L>1

).

where ∆t is the time interval between observations.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 13 / 69

Page 39: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for Latent Process II

ε ∼ N (0, α∆tI) , x ∼ N(

0, α∆tL1L>1

)K = α∆tL1L>1

ki ,j = α∆tl>:,i l:,j

where l:,k is a vector from the kth row of L1: the first k elements are one,the next T − k are zero.

ki ,j = α∆t min(i , j)

define ∆ti = ti so

ki ,j = αmin(ti , tj ) = k(ti , tj )

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 14 / 69

Page 40: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for Latent Process II

ε ∼ N (0, α∆tI) , x ∼ N(

0, α∆tL1L>1

)K = α∆tL1L>1

ki ,j = α∆tl>:,i l:,j

where l:,k is a vector from the kth row of L1: the first k elements are one,the next T − k are zero.

ki ,j = α∆t min(i , j)

define ∆ti = ti so

ki ,j = αmin(ti , tj ) = k(ti , tj )

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 14 / 69

Page 41: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for Latent Process II

ε ∼ N (0, α∆tI) , x ∼ N(

0, α∆tL1L>1

)K = α∆tL1L>1

ki ,j = α∆tl>:,i l:,j

where l:,k is a vector from the kth row of L1: the first k elements are one,the next T − k are zero.

ki ,j = α∆t min(i , j)

define ∆ti = ti so

ki ,j = αmin(ti , tj ) = k(ti , tj )

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 14 / 69

Page 42: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for Latent Process II

ε ∼ N (0, α∆tI) , x ∼ N(

0, α∆tL1L>1

)K = α∆tL1L>1

ki ,j = α∆tl>:,i l:,j

where l:,k is a vector from the kth row of L1: the first k elements are one,the next T − k are zero.

ki ,j = α∆t min(i , j)

define ∆ti = ti so

ki ,j = αmin(ti , tj ) = k(ti , tj )

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 14 / 69

Page 43: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance FunctionsWhere did this covariance matrix come from?

Markov Process

k(t, t ′)

= αmin(t, t ′)

Covariance matrix is builtusing the inputs to thefunction t.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 15 / 69

Page 44: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance FunctionsWhere did this covariance matrix come from?

Markov Process

k(t, t ′)

= αmin(t, t ′)

Covariance matrix is builtusing the inputs to thefunction t.

-3

-2

-1

0

1

2

3

0 0.5 1 1.5 2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 15 / 69

Page 45: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Simple Kalman Filter I

We have state vector X = [x1 . . . xq] ∈ RT×q and if each state evolvesindependently we have

p(X) =∏q

i=1 p(x:,i ) p(x:,i ) = N (x:,i |0,K).

We want to obtain outputs through:

yi ,: = Wxi ,:

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 16 / 69

Page 46: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Stacking and Kronecker Products I

Represent with a ‘stacked’ system:

p(x) = N (x|0, I⊗K)

where the stacking is placing each column of X one on top of anotheras

x =

x:,1

x:,2...

x:,q

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 17 / 69

Page 47: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Kronecker Product

aK bK

cK dKK

a b

c d⊗ =

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 18 / 69

Page 48: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Kronecker Product

⊗ =

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 18 / 69

Page 49: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Stacking and Kronecker Products I

Represent with a ‘stacked’ system:

p(x) = N (x|0, I⊗K)

where the stacking is placing each column of X one on top of anotheras

x =

x:,1

x:,2...

x:,q

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 19 / 69

Page 50: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Column Stacking

⊗ =

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 20 / 69

Page 51: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Two Ways of Stacking

Can also stack as follows:

x =

x1,:

x2,:...

xT ,:

p(x) = N (x|0,K⊗ I)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 21 / 69

Page 52: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Row Stacking

⊗ =

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 22 / 69

Page 53: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

For this stacking the marginal distribution over the latent variables is givenby the block diagonals.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 23 / 69

Page 54: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

For this stacking the marginal distribution over the latent variables is givenby the block diagonals.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 23 / 69

Page 55: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

For this stacking the marginal distribution over the latent variables is givenby the block diagonals.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 23 / 69

Page 56: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

For this stacking the marginal distribution over the latent variables is givenby the block diagonals.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 23 / 69

Page 57: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

For this stacking the marginal distribution over the latent variables is givenby the block diagonals.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 23 / 69

Page 58: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Observed Process

If we relate the observations to the data as follows:

yi ,: = Wxi ,: + εi ,:

ε ∼ N(0, σ2I

)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 24 / 69

Page 59: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Output Covariance

This leads to a covariance of the form

(I⊗W)(K⊗ I)(I⊗W>) + Iσ2

Using (A⊗ B)(C⊗D) = AC⊗ BD This leads to

K⊗WW> + Iσ2

ory ∼ N

(0,WW> ⊗K + Iσ2

)

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 25 / 69

Page 60: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Kronecker Structure GPs

This Kronecker structure leads to several published models.

(K(x, x′))d ,d ′ = k(x, x′)kT (d , d ′),

where k has x and kT has n as inputs.

Can think of multiple output covariance functions as covariances withaugmented input.

Alongside x we also input the d associated with the output of interest.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 26 / 69

Page 61: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Separable Covariance Functions

Taking B = WW> we have a matrix expression across outputs.

K(x, x′) = k(x, x′)B,

where B is a p × p symmetric and positive semi-definite matrix.

B is called the coregionalization matrix.

We call this class of covariance functions separable due to theirproduct structure.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 27 / 69

Page 62: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Sum of Separable Covariance Functions

In the same spirit a more general class of kernels is given by

K(x, x′) =

q∑j=1

kj (x, x′)Bj .

This can also be written as

K(X,X) =

q∑j=1

Bj ⊗ kj (X,X),

This is like several Kalman filter-type models added together, buteach one with a different set of latent functions.

We call this class of kernels sum of separable kernels (SoS kernels).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 28 / 69

Page 63: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Geostatistics

Use of GPs in Geostatistics is called kriging.

These multi-output GPs pioneered in geostatistics: prediction overvector-valued output data is known as cokriging.

The model in geostatistics is known as the linear model ofcoregionalization (LMC, ??).

Most machine learning multitask models can be placed in the contextof the LMC model.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 29 / 69

Page 64: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Weighted sum of Latent Functions

In the linear model of coregionalization (LMC) outputs are expressedas linear combinations of independent random functions.

In the LMC, each component fd is expressed as a linear sum

fd (x) =

q∑j=1

wd ,j uj (x).

where the latent functions are independent and have covariancefunctions kj (x, x′).

The processes fj (x)qj=1 are independent for q 6= j ′.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 30 / 69

Page 65: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Kalman Filter Special Case

The Kalman filter is an example of the LMC where ui (x)→ xi (t).

I.e. we’ve moved form time input to a more general input space.

In matrix notation:1 Kalman filter

F = WX

2 LMCF = WU

where the rows of these matrices F, X, U each contain q samplesfrom their corresponding functions at a different time (Kalman filter)or spatial location (LMC).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 31 / 69

Page 66: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

If one covariance used for latent functions (like in Kalman filter).

This is called the intrinsic coregionalization model (ICM, ?).

The kernel matrix corresponding to a dataset X takes the form

K(X,X) = B⊗ k(X,X).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 32 / 69

Page 67: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Autokrigeability

If outputs are noise-free, maximum likelihood is equivalent toindependent fits of B and k(x, x′) (?).

In geostatistics this is known as autokrigeability (?).

In multitask learning its the cancellation of intertask transfer (?).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 33 / 69

Page 68: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = ww> ⊗ k(X,X).

w =

[15

]B =

[1 55 25

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 34 / 69

Page 69: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = ww> ⊗ k(X,X).

w =

[15

]B =

[1 55 25

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 34 / 69

Page 70: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = ww> ⊗ k(X,X).

w =

[15

]B =

[1 55 25

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 34 / 69

Page 71: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = ww> ⊗ k(X,X).

w =

[15

]B =

[1 55 25

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 34 / 69

Page 72: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = ww> ⊗ k(X,X).

w =

[15

]B =

[1 55 25

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 34 / 69

Page 73: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = B⊗ k(X,X).

B =

[1 0.5

0.5 1.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 35 / 69

Page 74: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = B⊗ k(X,X).

B =

[1 0.5

0.5 1.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 35 / 69

Page 75: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = B⊗ k(X,X).

B =

[1 0.5

0.5 1.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 35 / 69

Page 76: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = B⊗ k(X,X).

B =

[1 0.5

0.5 1.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 35 / 69

Page 77: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Intrinsic Coregionalization Model

K(X,X) = B⊗ k(X,X).

B =

[1 0.5

0.5 1.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 35 / 69

Page 78: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC Samples

K(X,X) = B1 ⊗ k1(X,X) + B2 ⊗ k2(X,X)

B1 =

[1.4 0.50.5 1.2

]`1 = 1

B2 =

[1 0.5

0.5 1.3

]`2 = 0.2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 36 / 69

Page 79: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC Samples

K(X,X) = B1 ⊗ k1(X,X) + B2 ⊗ k2(X,X)

B1 =

[1.4 0.50.5 1.2

]`1 = 1

B2 =

[1 0.5

0.5 1.3

]`2 = 0.2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 36 / 69

Page 80: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC Samples

K(X,X) = B1 ⊗ k1(X,X) + B2 ⊗ k2(X,X)

B1 =

[1.4 0.50.5 1.2

]`1 = 1

B2 =

[1 0.5

0.5 1.3

]`2 = 0.2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 36 / 69

Page 81: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC Samples

K(X,X) = B1 ⊗ k1(X,X) + B2 ⊗ k2(X,X)

B1 =

[1.4 0.50.5 1.2

]`1 = 1

B2 =

[1 0.5

0.5 1.3

]`2 = 0.2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 36 / 69

Page 82: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC Samples

K(X,X) = B1 ⊗ k1(X,X) + B2 ⊗ k2(X,X)

B1 =

[1.4 0.50.5 1.2

]`1 = 1

B2 =

[1 0.5

0.5 1.3

]`2 = 0.2

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 36 / 69

Page 83: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

LMC in Machine Learning and Statistics

Used in machine learning for GPs for multivariate regression and instatistics for computer emulation of expensive multivariate computercodes.

Imposes the correlation of the outputs explicitly through the set ofcoregionalization matrices.

Setting B = Ip assumes outputs are conditionally independent giventhe parameters θ. (???).

More recent approaches for multiple output modeling are differentversions of the linear model of coregionalization.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 37 / 69

Page 84: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model

Coregionalization matrices are rank 1 ?. rewrite equation (??) as

K(X,X) =

q∑j=1

w:,j w>:,j ⊗ kj (X,X).

Like the Kalman filter, but each latent function has a differentcovariance.

Authors suggest using an exponentiated quadratic characteristiclength-scale for each input dimension.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 38 / 69

Page 85: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model Samples

K(X,X) = w:,1w>:,1 ⊗ k1(X,X) + w:,2w>:,2 ⊗ k2(X,X)

w1 =

[0.51

]w2 =

[1

0.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 39 / 69

Page 86: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model Samples

K(X,X) = w:,1w>:,1 ⊗ k1(X,X) + w:,2w>:,2 ⊗ k2(X,X)

w1 =

[0.51

]w2 =

[1

0.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 39 / 69

Page 87: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model Samples

K(X,X) = w:,1w>:,1 ⊗ k1(X,X) + w:,2w>:,2 ⊗ k2(X,X)

w1 =

[0.51

]w2 =

[1

0.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 39 / 69

Page 88: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model Samples

K(X,X) = w:,1w>:,1 ⊗ k1(X,X) + w:,2w>:,2 ⊗ k2(X,X)

w1 =

[0.51

]w2 =

[1

0.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 39 / 69

Page 89: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Semiparametric Latent Factor Model Samples

K(X,X) = w:,1w>:,1 ⊗ k1(X,X) + w:,2w>:,2 ⊗ k2(X,X)

w1 =

[0.51

]w2 =

[1

0.5

]

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 39 / 69

Page 90: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gaussian processes for Multi-task, Multi-output andMulti-class

? suggest ICM for multitask learning.

Use a PPCA form for B: similar to our Kalman filter example.

Refer to the autokrigeability effect as the cancellation of inter-tasktransfer.

Also discuss the similarities between the multi-task GP and the ICM,and its relationship to the SLFM and the LMC.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 40 / 69

Page 91: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Multitask Classification

Mostly restricted to the case where the outputs are conditionallyindependent given the hyperparameters φ (??????).

Intrinsic coregionalization model has been used in the multiclassscenario. ? use the intrinsic coregionalization model for classification,by introducing a probit noise model as the likelihood.

Posterior distribution is no longer analytically tractable: approximateinference is required.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 41 / 69

Page 92: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Computer Emulation

A statistical model used as a surrogate for a computationallyexpensive computer model.

? use the linear model of coregionalization to model imagesrepresenting the evolution of the implosion of steel cylinders.

In ? use the ICM to model a vegetation model: called the SheffieldDynamic Global Vegetation Model (?).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 42 / 69

Page 93: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Outline

1 Kalman Filter

2 Convolution Processes

3 Motion Capture Example

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 43 / 69

Page 94: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Convolution Process

A convolution process is a moving-average construction thatguarantees a valid covariance function.

Consider a set of functions fj (x)pj=1.

Each function can be expressed as

fj (x) =

∫X

Gj (x− z)u(z)dz = Gj (x) ∗ u(x).

Influence of more than one latent function, ui (z)qi=1 and inclusion

of an independent process wj (x)

yj (x) = fj (x) + wj (x) =

q∑i=1

∫X

Gj ,i (x− z)ui (z)dz + wj (x).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 44 / 69

Page 95: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

A pictorial representation

u(x)

u(x): latent function.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 45 / 69

Page 96: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

A pictorial representation

u(x)

G (x)1

G (x)2

u(x): latent function.

G(x): smoothing kernel.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 45 / 69

Page 97: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

A pictorial representation

u(x)

G (x)1

G (x)2

(f x)2

(f x)1

u(x): latent function.

G(x): smoothing kernel.

f(x): output function.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 45 / 69

Page 98: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

A pictorial representation

u(x)

G (x)1

G (x)2

(f x)2

(f x)1(w x)1

(w x)2

u(x): latent function.

G(x): smoothing kernel.

f(x): output function.

w(x): independent process.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 45 / 69

Page 99: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

A pictorial representation

u(x)

G (x)1

G (x)2

(f x)2

(f x)1(w x)1

(w x)2

(y x)1

(y x)2

u(x): latent function.

G(x): smoothing kernel.

f(x): output function.

w(x): independent process.

y(x): noisy output function.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 45 / 69

Page 100: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance of the output functions.

The covariance between yj (x) and yj ′(x′) is given as

cov[yj (x), yj ′(x′)

]=cov

[fj (x), fj ′(x′)

]+ cov

[wj (x),wj ′(x′)

]δj ,j ′

where

cov[fj (x), fj ′(x′)

]=

∫X

Gj (x− z)

∫X

Gj ′(x′ − z′)cov[u(z), u(z′)

]dz′dz

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 46 / 69

Page 101: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Different forms of covariance for the output functions.

Input Gaussian process

cov[fj , fj ′

]=

∫X

Gj (x− z)

∫X

Gj ′(x′ − z′)ku,u(z, z′)dz′dz

Input white noise process

cov[fj , fj ′

]=

∫X

Gj (x− z)Gj ′(x′ − z)dz

Covariance between output functions and latent functions

cov [fj , u] =

∫X

Gj (x− z′)ku,u(z′, z)dz′

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 47 / 69

Page 102: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Styles of Machine LearningBackground: interpolation is easy, extrapolation is hard

Urs Holzle keynote talk at NIPS 2005.I Emphasis on massive data sets.I Let the data do the work—more data, less extrapolation.

Alternative paradigm:I Very scarce data: computational biology, human motion.I How to generalize from scarce data?I Need to include more assumptions about the data (e.g. invariances).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 48 / 69

Page 103: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 104: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak”

impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 105: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 106: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven

knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 107: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 108: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models

differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 109: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 110: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition

climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 111: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather models

Wea

kly

Mec

hanist

ic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 112: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 113: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

General ApproachBroadly Speaking: Two approaches to modeling

data modeling mechanistic modeling

let the data “speak” impose physical laws

data driven knowledge driven

adaptive models differential equations

digit recognition climate, weather modelsW

eakl

yM

echan

istic

Strongly

Mec

hanist

ic

Figure: Main modeling activity.Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 49 / 69

Page 114: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Weakly Mechanistic vs Strongly Mechanistic

Underlying data modeling techniques there are weakly mechanisticprinciples (e.g. smoothness).

In physics the models are typically strongly mechanistic.

In principle we expect a range of models which vary in the strength oftheir mechanistic assumptions.

This work is one part of that spectrum: add further mechanistic ideasto weakly mechanistic models.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 50 / 69

Page 115: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Dimensionality Reduction

Linear relationship between the data, X ∈ <n×p, and a reduceddimensional representation, F ∈ <n×q, where q p.

X = FW + ε,

ε ∼ N (0,Σ)

Integrate out F, optimize with respect to W.

For Gaussian prior, F ∼ N (0, I)I and Σ = σ2I we have probabilistic PCA (??).I and Σ constrained to be diagonal, we have factor analysis.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 51 / 69

Page 116: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Dimensionality Reduction: Temporal Data

Deal with temporal data with a temporal latent prior.

Independent Gauss-Markov priors over each fi (t) leads to :Rauch-Tung-Striebel (RTS) smoother (Kalman filter).

More generally consider a Gaussian process (GP) prior,

p (F|t) =

q∏i=1

N(f:,i |0,Kf:,i ,f:,i

).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 52 / 69

Page 117: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Joint Gaussian Process

Given the covariance functions for fi (t) we have an impliedcovariance function across all xi (t)—(ML: semi-parametric latentfactor model (?), Geostatistics: linear model of coregionalization).

Rauch-Tung-Striebel smoother has been preferredI linear computational complexity in n.I Advances in sparse approximations have made the general GP

framework practical. (???).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 53 / 69

Page 118: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mechanical Analogy

Back to Mechanistic Models!

These models rely on the latent variables to provide the dynamicinformation.

We now introduce a further dynamical system with a mechanisticinspiration.

Physical Interpretation:

I the latent functions, fi (t) are q forces.I We observe the displacement of p springs to the forces.,I Interpret system as the force balance equation, XD = FS + ε.I Forces act, e.g. through levers — a matrix of sensitivities, S ∈ <q×p.I Diagonal matrix of spring constants, D ∈ <p×p.I Original System: W = SD−1.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 54 / 69

Page 119: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Extend Model

Add a damper and give the system mass.

FS = XM + XC + XD + ε.

Now have a second order mechanical system.

It will exhibit inertia and resonance.

There are many systems that can also be represented by differentialequations.

I When being forced by latent function(s), fi (t)qi=1, we call this a

latent force model.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 55 / 69

Page 120: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Marionette

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 56 / 69

Page 121: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mass Spring Damper Analogy

d1

c1 m1s1f (t)

x1(t)

s1

d2

c2 m2s2f (t)

x2(t)

s2

f (t)

mass

spring

damper

observations

latent input

pulleys

Figure: Mass spring damper analogy, an unobserved force drives multipleoscillators.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 57 / 69

Page 122: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mass Spring Damper Analogy

d1

c1 m1s1f (t)

x1(t)

s1

d2

c2 m2s2f (t)

x2(t)

s2

f (t)

mass

spring

damper

observations

latent input

pulleys

Figure: Mass spring damper analogy, an unobserved force drives multipleoscillators.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 57 / 69

Page 123: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mass Spring Damper Analogy

d1

c1 m1s1f (t)

x1(t)

s1

d2

c2 m2s2f (t)

x2(t)

s2

f (t)

mass

spring

damper

observations

latent input

pulleys

Figure: Mass spring damper analogy, an unobserved force drives multipleoscillators.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 57 / 69

Page 124: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mass Spring Damper Analogy

d1

c1 m1s1f (t)

x1(t)

s1

d2

c2 m2s2f (t)

x2(t)

s2

f (t)

mass

spring

damper

observations

latent input

pulleys

Figure: Mass spring damper analogy, an unobserved force drives multipleoscillators.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 57 / 69

Page 125: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mass Spring Damper Analogy

d1

c1 m1s1f (t)

x1(t)

s1

d2

c2 m2s2f (t)

x2(t)

s2

f (t)

mass

spring

damper

observations

latent input

pulleys

Figure: Mass spring damper analogy, an unobserved force drives multipleoscillators.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 57 / 69

Page 126: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Gaussian Process priors and Latent Force ModelsDriven Harmonic Oscillator

For Gaussian process we can compute the covariance matrices for theoutput displacements.

For one displacement the model is

mk xk(t) + ck xk(t) + dkxk(t) = bk +

q∑i=0

sik fi (t), (1)

where, mk is the kth diagonal element from M and similarly for ck

and dk . sik is the i , kth element of S.

Model the latent forces as q independent, GPs with exponentiatedquadratic covariances

kfi fl(t, t ′) = exp

(−(t − t ′)2

2`2i

)δil .

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 58 / 69

Page 127: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for ODE Model

Exponentiated Quadratic Covariance function for f (t)

xj (t) =1

mjωj

q∑i=1

sji exp(−αj t)

∫ t

0fi (τ) exp(αjτ) sin(ωj (t − τ))dτ

Joint distributionfor x1 (t), x2 (t),x3 (t) and f (t).Damping ratios:

ζ1 ζ2 ζ3

0.125 2 1f(t) y

1(t) y

2(t) y

3(t)

f(t)

y 1(t)

y 2(t)

y 3(t)

−0.4

−0.2

0

0.2

0.4

0.6

0.8

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 59 / 69

Page 128: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for ODE Model

Analogy

x =∑

i

e>i fi fi ∼ N (0,Σi )→ x ∼ N

(0,∑

i

e>i Σi ei

)

Joint distributionfor x1 (t), x2 (t),x3 (t) and f (t).Damping ratios:

ζ1 ζ2 ζ3

0.125 2 1f(t) y

1(t) y

2(t) y

3(t)

f(t)

y 1(t)

y 2(t)

y 3(t)

−0.4

−0.2

0

0.2

0.4

0.6

0.8

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 59 / 69

Page 129: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for ODE Model

Exponentiated Quadratic Covariance function for f (t)

xj (t) =1

mjωj

q∑i=1

sji exp(−αj t)

∫ t

0fi (τ) exp(αjτ) sin(ωj (t − τ))dτ

Joint distributionfor x1 (t), x2 (t),x3 (t) and f (t).Damping ratios:

ζ1 ζ2 ζ3

0.125 2 1f(t) y

1(t) y

2(t) y

3(t)

f(t)

y 1(t)

y 2(t)

y 3(t)

−0.4

−0.2

0

0.2

0.4

0.6

0.8

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 59 / 69

Page 130: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Joint Sampling of x (t) and f (t)

lfmSample

50 55 60 65 70−2

−1.5

−1

−0.5

0

0.5

1

1.5

50 55 60 65 70−1

−0.5

0

0.5

1

1.5

2

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

Figure: Joint samples from the ODE covariance, black: f (t), red: x1 (t)(underdamped), green: x2 (t) (overdamped), and blue: x3 (t) (criticallydamped).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 60 / 69

Page 131: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Joint Sampling of x (t) and f (t)

lfmSample

50 55 60 65 70−2

−1.5

−1

−0.5

0

0.5

1

1.5

50 55 60 65 70−1

−0.5

0

0.5

1

1.5

2

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

Figure: Joint samples from the ODE covariance, black: f (t), red: x1 (t)(underdamped), green: x2 (t) (overdamped), and blue: x3 (t) (criticallydamped).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 60 / 69

Page 132: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Joint Sampling of x (t) and f (t)

lfmSample

50 55 60 65 70−2

−1.5

−1

−0.5

0

0.5

1

1.5

50 55 60 65 70−1

−0.5

0

0.5

1

1.5

2

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

Figure: Joint samples from the ODE covariance, black: f (t), red: x1 (t)(underdamped), green: x2 (t) (overdamped), and blue: x3 (t) (criticallydamped).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 60 / 69

Page 133: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Joint Sampling of x (t) and f (t)

lfmSample

50 55 60 65 70−2

−1.5

−1

−0.5

0

0.5

1

1.5

50 55 60 65 70−1

−0.5

0

0.5

1

1.5

2

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

50 55 60 65 70−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

Figure: Joint samples from the ODE covariance, black: f (t), red: x1 (t)(underdamped), green: x2 (t) (overdamped), and blue: x3 (t) (criticallydamped).

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 60 / 69

Page 134: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Covariance for ODE

Exponentiated Quadratic Covariance function for f (t)

xj (t) =1

mjωj

q∑i=1

sji exp(−αj t)

∫ t

0fi (τ) exp(αjτ) sin(ωj (t − τ))dτ

Joint distributionfor x1 (t), x2 (t),x3 (t) and f (t).

Damping ratios:ζ1 ζ2 ζ3

0.125 2 1f(t) y

1(t) y

2(t) y

3(t)

f(t)

y 1(t)

y 2(t)

y 3(t)

−0.4

−0.2

0

0.2

0.4

0.6

0.8

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 61 / 69

Page 135: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Outline

1 Kalman Filter

2 Convolution Processes

3 Motion Capture Example

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 62 / 69

Page 136: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Example: Motion Capture

Mauricio Alvarez and David Luengo (??)

Motion capture data: used for animating human motion.

Multivariate time series of angles representing joint positions.

Objective: generalize from training data to realistic motions.

Use 2nd Order Latent Force Model with mass/spring/damper (resistorinductor capacitor) at each joint.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 63 / 69

Page 137: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Example: Motion Capture

Mauricio Alvarez and David Luengo (??)

Motion capture data: used for animating human motion.

Multivariate time series of angles representing joint positions.

Objective: generalize from training data to realistic motions.

Use 2nd Order Latent Force Model with mass/spring/damper (resistorinductor capacitor) at each joint.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 63 / 69

Page 138: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Example: Motion Capture

Mauricio Alvarez and David Luengo (??)

Motion capture data: used for animating human motion.

Multivariate time series of angles representing joint positions.

Objective: generalize from training data to realistic motions.

Use 2nd Order Latent Force Model with mass/spring/damper (resistorinductor capacitor) at each joint.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 63 / 69

Page 139: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Example: Motion Capture

Mauricio Alvarez and David Luengo (??)

Motion capture data: used for animating human motion.

Multivariate time series of angles representing joint positions.

Objective: generalize from training data to realistic motions.

Use 2nd Order Latent Force Model with mass/spring/damper (resistorinductor capacitor) at each joint.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 63 / 69

Page 140: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Prediction of Test Motion

Model left arm only.

3 balancing motions (18, 19, 20) from subject 49.

18 and 19 are similar, 20 contains more dramatic movements.

Train on 18 and 19 and testing on 20

Data was down-sampled by 32 (from 120 fps).

Reconstruct motion of left arm for 20 given other movements.

Compare with GP that predicts left arm angles given other bodyangles.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 64 / 69

Page 141: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mocap Results

Table: Root mean squared (RMS) angle error for prediction of the left arm’sconfiguration in the motion capture data. Prediction with the latent force modeloutperforms the prediction with regression for all apart from the radius’s angle.

Latent Force RegressionAngle Error ErrorRadius 4.11 4.02Wrist 6.55 6.65

Hand X rotation 1.82 3.21Hand Z rotation 2.76 6.14

Thumb X rotation 1.77 3.10Thumb Z rotation 2.73 6.09

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 65 / 69

Page 142: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Mocap Results II

0 1 2 3 4 5 6 7 8 9−300

−250

−200

−150

−100

−50

0

50

100

150

(a) Inferred Latent Force

0 1 2 3 4 5 6 7 8 9−5

0

5

10

15

20

25

30

35

40

45

(b) Wrist

0 1 2 3 4 5 6 7 8 9−30

−25

−20

−15

−10

−5

0

(c) Hand X Rotation

0 1 2 3 4 5 6 7 8 9−45

−40

−35

−30

−25

−20

−15

−10

−5

0

(d) Hand Z Rotation

0 1 2 3 4 5 6 7 8 9−2

0

2

4

6

8

10

12

(e) Thumb X Rotation

0 1 2 3 4 5 6 7 8 9−15

−10

−5

0

5

10

15

20

(f) Thumb Z Rotation

Figure: Predictions from LFM (solid line, grey error bars) and direct regression(crosses with stick error bars).Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 66 / 69

Page 143: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Motion Capture Experiments

Data set is from the CMU motion capture data base1.

Two different types of movements: golf-swing and walking.

Train on a subset of motions for each movement and test on adifferent subset.

This assesses the model’s ability to extrapolate.

For testing: condition on three angles associated to the root nodesand first five and last five frames of the motion.

Golf-swing use leave one out cross validation on four motions.

For the walking train on 4 motions and validate on 8 motions.

1The CMU Graphics Lab Motion Capture Database was created withfunding from NSF EIA-0196217 and is available at http://mocap.cs.cmu.edu.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 67 / 69

Page 144: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

Motion Capture Results

Table: RMSE and R2 (explained variance) for golf swing and walking

Movement Method RMSE R2 (%)

Golf swing

IND GP 21.55± 2.35 30.99± 9.67MTGP 21.19± 2.18 45.59± 7.86SLFM 21.52± 1.93 49.32± 3.03LFM 18.09± 1.30 72.25± 3.08

Walking

IND GP 8.03± 2.55 30.55± 10.64MTGP 7.75± 2.05 37.77± 4.53SLFM 7.81± 2.00 36.84± 4.26LFM 7.23± 2.18 48.15± 5.66

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 68 / 69

Page 145: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

References IM. A. Alvarez, D. Luengo, and N. D. Lawrence. Latent force models. In ?, pages 9–16. [PDF].

M. A. Alvarez, D. Luengo, and N. D. Lawrence. Linear latent force models using Gaussian processes. Technical report,University of Sheffield, [PDF].

E. V. Bonilla, K. M. Chai, and C. K. I. Williams. Multi-task Gaussian process prediction. In J. C. Platt, D. Koller, Y. Singer, andS. Roweis, editors, Advances in Neural Information Processing Systems, volume 20, Cambridge, MA, 2008. MIT Press.

S. Conti and A. O’Hagan. Bayesian emulation of complex multi-output and dynamic computer models. Journal of StatisticalPlanning and Inference, 140(3):640–651, 2009. [DOI].

P. Goovaerts. Geostatistics For Natural Resources Evaluation. Oxford University Press, 1997. [Google Books] .

J. D. Helterbrand and N. A. C. Cressie. Universal cokriging under intrinsic coregionalization. Mathematical Geology, 26(2):205–226, 1994.

D. M. Higdon, J. Gattiker, B. Williams, and M. Rightley. Computer model calibration using high dimensional output. Journal ofthe American Statistical Association, 103(482):570–583, 2008.

A. G. Journel and C. J. Huijbregts. Mining Geostatistics. Academic Press, London, 1978. [Google Books] .

N. D. Lawrence and J. C. Platt. Learning to learn with the informative vector machine. In R. Greiner and D. Schuurmans,editors, Proceedings of the International Conference in Machine Learning, volume 21, pages 512–519. Omnipress, 2004.[PDF].

T. P. Minka and R. W. Picard. Learning how to learn is learning with point sets. Available on-line., 1997. [URL]. Revised 1999,available at http://www.stat.cmu.edu/~minka/.

J. Quinonero Candela and C. E. Rasmussen. A unifying view of sparse approximate Gaussian process regression. Journal ofMachine Learning Research, 6:1939–1959, 2005.

C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006. [GoogleBooks] .

S. T. Roweis. EM algorithms for PCA and SPCA. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in NeuralInformation Processing Systems, volume 10, pages 626–632, Cambridge, MA, 1998. MIT Press.

M. Seeger and M. I. Jordan. Sparse Gaussian Process Classification With Multiple Classes. Technical Report 661, Department ofStatistics, University of California at Berkeley,

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 69 / 69

Page 146: Multioutput, Multitask and Mechanisticurtasun/tutorials/gp_cvpr12... · 2012. 7. 10. · Multioutput, Multitask and Mechanistic Neil D. Lawrence and Raquel Urtasun CVPR 16th June

References II

G. Skolidis and G. Sanguinetti. Bayesian multitask classification with Gaussian process priors. IEEE Transactions on NeuralNetworks, 22(12):2011 – 2021, 2011.

E. Snelson and Z. Ghahramani. Sparse Gaussian processes using pseudo-inputs. In Y. Weiss, B. Scholkopf, and J. C. Platt,editors, Advances in Neural Information Processing Systems, volume 18, Cambridge, MA, 2006. MIT Press.

Y. W. Teh, M. Seeger, and M. I. Jordan. Semiparametric latent factor models. In R. G. Cowell and Z. Ghahramani, editors,Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, pages 333–340, Barbados, 6-8January 2005. Society for Artificial Intelligence and Statistics.

M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society, B, 6(3):611–622, 1999. [PDF]. [DOI].

M. K. Titsias. Variational learning of inducing variables in sparse Gaussian processes. In ?, pages 567–574.

D. van Dyk and M. Welling, editors. Artificial Intelligence and Statistics, volume 5, Clearwater Beach, FL, 16-18 April 2009.JMLR W&CP 5.

H. Wackernagel. Multivariate Geostatistics: An Introduction With Applications. Springer-Verlag, 3rd edition, 2003. [GoogleBooks] .

C. K. Williams and D. Barber. Bayesian Classification with Gaussian processes. IEEE Transactions on Pattern Analysis andMachine Intelligence, 20(12):1342–1351, 1998.

I. Woodward, M. R. Lomas, and R. A. Betts. Vegetation-climate feedbacks in a greenhouse world. Philosophical Transactions:Biological Sciences, 353(1365):29–39, 1998.

K. Yu, V. Tresp, and A. Schwaighofer. Learning Gaussian processes from multiple tasks. In Proceedings of the 22ndInternational Conference on Machine Learning (ICML 2005), pages 1012–1019, 2005.

Urtasun and Lawrence () Session 2: Multioutput, Multitask, Mechanistic CVPR Tutorial 70 / 69