
Page 1: MCMC: Gibbs Sampler

MCMC: Gibbs Sampler

Page 2: MCMC: Gibbs Sampler

Building Complex Models: Conditional Sampling

● Consider a more complex model

● When sampling each parameter iteratively, we only need to consider the distributions that involve that parameter

p(b, μ, σ² | y) ∝ p(y | b) p(b | μ, σ²) p(μ) p(σ²)

p(b | μ, σ², y) ∝ p(y | b) p(b | μ, σ²)

p(μ | b, σ², y) ∝ p(b | μ, σ²) p(μ)

p(σ² | b, μ, y) ∝ p(b | μ, σ²) p(σ²)

Page 3: MCMC: Gibbs Sampler

In one MCMC step

p(b⁽ᵍ⁺¹⁾ | μ⁽ᵍ⁾, σ²⁽ᵍ⁾, y) ∝ p(y | b⁽ᵍ⁺¹⁾) p(b⁽ᵍ⁺¹⁾ | μ⁽ᵍ⁾, σ²⁽ᵍ⁾)

p(μ⁽ᵍ⁺¹⁾ | b⁽ᵍ⁺¹⁾, σ²⁽ᵍ⁾, y) ∝ p(b⁽ᵍ⁺¹⁾ | μ⁽ᵍ⁺¹⁾, σ²⁽ᵍ⁾) p(μ⁽ᵍ⁺¹⁾)

p(σ²⁽ᵍ⁺¹⁾ | b⁽ᵍ⁺¹⁾, μ⁽ᵍ⁺¹⁾, y) ∝ p(b⁽ᵍ⁺¹⁾ | μ⁽ᵍ⁺¹⁾, σ²⁽ᵍ⁺¹⁾) p(σ²⁽ᵍ⁺¹⁾)

Page 4: MCMC: Gibbs Sampler

Gibbs Sampling

● If we can solve a conditional distribution analytically
  – We can sample from that conditional directly
  – Even if we can't solve for the full joint distribution analytically

● Example

p(μ, σ² | y) ∝ N(y | μ, σ²) N(μ | μ₀, V₀) IG(σ² | α, β)

Page 5: MCMC: Gibbs Sampler

p(μ⁽ᵍ⁺¹⁾ | σ²⁽ᵍ⁾, y) ∝ N(y | μ, σ²⁽ᵍ⁾) N(μ | μ₀, V₀)

  = N( [ nȳ/σ²⁽ᵍ⁾ + μ₀/V₀ ] / [ n/σ²⁽ᵍ⁾ + 1/V₀ ] ,  1 / [ n/σ²⁽ᵍ⁾ + 1/V₀ ] )

p(σ²⁽ᵍ⁺¹⁾ | μ⁽ᵍ⁺¹⁾, y) ∝ N(y | μ⁽ᵍ⁺¹⁾, σ²) IG(σ² | α, β)

  = IG( α + n/2 ,  β + ½ ∑ᵢ (yᵢ − μ⁽ᵍ⁺¹⁾)² )
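A minimal numpy sketch of these two conditional draws (not from the original slides; the data y, the prior values μ₀ = 0, V₀ = 1000, α = β = 0.1, and the seed are assumptions for illustration):

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data and priors (assumptions for illustration only)
y = rng.normal(5.0, 2.0, size=50)
n = len(y)
mu0, V0 = 0.0, 1000.0       # Normal prior on mu: N(mu | mu0, V0)
alpha, beta = 0.1, 0.1      # Inverse-Gamma prior on sigma^2: IG(sigma^2 | alpha, beta)

def draw_mu(sigma2):
    """Draw mu from its full conditional: precision-weighted Normal."""
    prec = n / sigma2 + 1.0 / V0
    mean = (y.sum() / sigma2 + mu0 / V0) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec))

def draw_sigma2(mu):
    """Draw sigma^2 from its full conditional: IG(alpha + n/2, beta + SS/2)."""
    shape = alpha + n / 2.0
    rate = beta + 0.5 * np.sum((y - mu) ** 2)
    return 1.0 / rng.gamma(shape, 1.0 / rate)   # 1/Gamma(shape, scale=1/rate) is IG(shape, rate)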

Page 6: MCMC: Gibbs Sampler

Trade-offs of Gibbs Sampling

● 100% acceptance rate
● Quicker mixing, quicker convergence
● No need to specify a jump distribution
● No need to tune a jump variance
● Requires that we know the analytical solution for the conditional
  – Conjugate distributions
● Can mix different samplers within an MCMC

Page 7: MCMC: Gibbs Sampler

Gibbs Sampler

1) Start from some initial parameter value θc

2) Evaluate the unnormalized posterior

3) Propose a new parameter value: a random draw from its conditional distribution given the current values of all other parameters

4) Evaluate the new unnormalized posterior

5) Decide whether or not to accept the new value: accept the new value 100% of the time

6) Repeat 3-5
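Putting steps 1-6 together for the Normal mean/variance example from the earlier slide, a hedged sketch of the full loop (the data, priors, initial values, and chain length are all assumptions):

import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(5.0, 2.0, size=50)               # hypothetical data
n = len(y)
mu0, V0, alpha, beta = 0.0, 1000.0, 0.1, 0.1    # assumed priors

n_iter = 5000
mu, sigma2 = y.mean(), y.var()                  # 1) initial parameter values
samples = np.empty((n_iter, 2))

for g in range(n_iter):                         # 6) repeat steps 3-5
    # 3)-5) draw mu from its conditional; the draw is always "accepted"
    prec = n / sigma2 + 1.0 / V0
    mu = rng.normal((y.sum() / sigma2 + mu0 / V0) / prec, np.sqrt(1.0 / prec))
    # 3)-5) draw sigma^2 from its Inverse-Gamma conditional
    sigma2 = 1.0 / rng.gamma(alpha + n / 2.0, 1.0 / (beta + 0.5 * np.sum((y - mu) ** 2)))
    samples[g] = mu, sigma2

Note that steps 2 and 4 (evaluating the unnormalized posterior) drop out: because each proposal is a draw from the full conditional itself, no accept/reject calculation is needed.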

Page 8: MCMC: Gibbs Sampler

Metropolis Algorithm

1) Start from some initial parameter value θc

2) Evaluate the unnormalized posterior p(θc)

3) Propose a new parameter value θ*: a random draw from a “jump” distribution centered on the current parameter value

4) Evaluate the new unnormalized posterior p(θ*)

5) Decide whether or not to accept the new value: accept the new value with probability a = p(θ*) / p(θc)

6) Repeat 3-5
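For comparison, a minimal sketch of these Metropolis steps for a single parameter θ (the data, the known σ = 2, the jump standard deviation 0.5, and the flat prior are assumptions; in practice the ratio is usually computed on the log scale):

import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(5.0, 2.0, size=50)        # hypothetical data

def unnorm_post(theta):
    """Unnormalized posterior: Normal likelihood with known sigma = 2, flat prior."""
    return np.exp(-0.5 * np.sum((y - theta) ** 2) / 2.0 ** 2)

n_iter = 5000
theta_c = 0.0                            # 1) initial parameter value
p_c = unnorm_post(theta_c)               # 2) evaluate unnormalized posterior p(theta_c)
chain = np.empty(n_iter)

for g in range(n_iter):                  # 6) repeat steps 3-5
    theta_new = rng.normal(theta_c, 0.5)       # 3) draw from a "jump" distribution
    p_new = unnorm_post(theta_new)             # 4) evaluate p(theta_new)
    if rng.uniform() < p_new / p_c:            # 5) accept with probability p(new)/p(current)
        theta_c, p_c = theta_new, p_new
    chain[g] = theta_c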

Page 9: MCMC: Gibbs Sampler

Bayesian Regression via Gibbs Sampling

● Consider the standard regression model

– Y = response variable

– Xj = covariates

– bj = parameters

yᵢ = b₀ + b₁xᵢ,₁ + b₂xᵢ,₂ + ⋯ + bₙxᵢ,ₙ + εᵢ

εᵢ ~ N(0, σ²)

Page 10: MCMC: Gibbs Sampler

Matrix version

● As noted earlier, this can be expressed as

  yᵢ = b₀ + b₁xᵢ,₁ + b₂xᵢ,₂ + ⋯ + bₙxᵢ,ₙ + εᵢ

  yᵢ = Xᵢb + εᵢ

● The full data set can be expressed as

  y = Xb + ε
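As a small illustration of the matrix form (the covariate, parameter values, and error SD below are made up), building X and y in numpy:

import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.uniform(0, 10, size=n)           # hypothetical covariate
X = np.column_stack([np.ones(n), x1])     # design matrix: column of 1s for b0, then the covariate
b_true = np.array([2.0, -0.5])            # assumed "true" parameters
eps = rng.normal(0.0, 1.0, size=n)        # errors with sigma = 1
y = X @ b_true + eps                      # matrix form of y_i = b0 + b1*x_i,1 + eps_i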

Page 11: MCMC: Gibbs Sampler

Quick review of matrix math

● Vector

  b = [b₁, b₂, b₃]ᵀ    (3 x 1)

● Transpose

  bᵀ = [b₁ b₂ b₃]    (1 x 3)

● Vector x vector

  bᵀ × b = [b₁ b₂ b₃] × [b₁, b₂, b₃]ᵀ = b₁² + b₂² + b₃²    ((1 x 3)(3 x 1) = 1 x 1)

See Appendix C for more details

Page 12: MCMC: Gibbs Sampler

Quick review of matrix math

● Vector x vector

  b × bᵀ = [b₁, b₂, b₃]ᵀ × [b₁ b₂ b₃]
    = [ b₁²    b₁b₂   b₁b₃
        b₂b₁   b₂²    b₂b₃
        b₃b₁   b₃b₂   b₃²  ]    ((3 x 1)(1 x 3) = 3 x 3)

● Vector x matrix

  X b    ((n x 3)(3 x 1) = n x 1)

● Matrix x Matrix

  X Xᵀ = Z    ((n x 3)(3 x n) = n x n)

Page 13: MCMC: Gibbs Sampler

● Matrix x Matrix

  XᵀX = [ ∑xᵢ,₁²      ∑xᵢ,₁xᵢ,₂   ∑xᵢ,₁xᵢ,₃
          ∑xᵢ,₂xᵢ,₁   ∑xᵢ,₂²      ∑xᵢ,₂xᵢ,₃
          ∑xᵢ,₃xᵢ,₁   ∑xᵢ,₃xᵢ,₂   ∑xᵢ,₃²   ]    ((3 x n)(n x 3) = 3 x 3)

● Identity Matrix

  I = [ 1 0 0
        0 1 0
        0 0 1 ]

● Matrix Inversion

  X⁻¹X = X X⁻¹ = I
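These operations map directly onto numpy (a small sketch with made-up values; the random X is assumed to be full rank so the inverse exists):

import numpy as np

b = np.array([1.0, 2.0, 3.0])                       # 3 x 1 vector
X = np.random.default_rng(3).normal(size=(4, 3))    # n x 3 matrix with n = 4

bTb = b @ b                    # b^T b : (1 x 3)(3 x 1) = scalar, b1^2 + b2^2 + b3^2
bbT = np.outer(b, b)           # b b^T : (3 x 1)(1 x 3) = 3 x 3 matrix
Xb = X @ b                     # X b   : (n x 3)(3 x 1) = n x 1
XtX = X.T @ X                  # X^T X : (3 x n)(n x 3) = 3 x 3
I3 = np.eye(3)                 # 3 x 3 identity matrix
XtX_inv = np.linalg.inv(XtX)   # matrix inverse: XtX_inv @ XtX is (numerically) I3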

Page 14: MCMC: Gibbs Sampler

...back to regression

y = Xb + ε,   ε ~ Nₙ(0, σ²I)          (process model, data model)

Likelihood:
  p(y | b, σ², X) = Nₙ(y | Xb, σ²I)

Priors (parameter model):
  p(b) = Nₚ(b | b₀, V_b)
  p(σ²) = IG(σ² | s₁, s₂)

Page 15: MCMC: Gibbs Sampler

General pattern

● Write down likelihood
  – Data model
  – Process model
● Write down priors for the free parameters in the likelihood
  – Parameter model
● Solve for posterior distribution of model parameters
  – Analytical
  – Numerical

Page 16: MCMC: Gibbs Sampler

Posterior

p(b, σ² | X, y) ∝ Nₙ(y | Xb, σ²I) × Nₚ(b | b₀, V_b) IG(σ² | s₁, s₂)

Sample in terms of conditionals:

p(b | σ², X, y) ∝ Nₙ(y | Xb, σ²I) Nₚ(b | b₀, V_b)

p(σ² | b, X, y) ∝ Nₙ(y | Xb, σ²I) IG(σ² | s₁, s₂)

Page 17: MCMC: Gibbs Sampler

p(b | σ², X, y) ∝ Nₙ(y | Xb, σ²I) Nₚ(b | b₀, V_b)

Regression Parameters

Will have a posterior that is multivariate Normal...

Nₚ(b | Vv, V) = (2π)^(−p/2) |V|^(−1/2) exp[ −½ (b − Vv)ᵀ V⁻¹ (b − Vv) ]

What are V and v??

Page 18: MCMC: Gibbs Sampler

(b − Vv)ᵀ V⁻¹ (b − Vv)

  = bᵀV⁻¹b − bᵀV⁻¹Vv − vᵀVV⁻¹b + vᵀVV⁻¹Vv

  = bᵀV⁻¹b − bᵀv − vᵀb + vᵀVv

  = bᵀV⁻¹b − 2bᵀv + vᵀVv

Page 19: MCMC: Gibbs Sampler

(y − Xb)ᵀ(σ²I)⁻¹(y − Xb) + (b − b₀)ᵀV_b⁻¹(b − b₀)

  = σ⁻²yᵀy − σ⁻²yᵀXb − σ⁻²bᵀXᵀy + σ⁻²bᵀXᵀXb + bᵀV_b⁻¹b − bᵀV_b⁻¹b₀ − b₀ᵀV_b⁻¹b + b₀ᵀV_b⁻¹b₀

Matching terms against bᵀV⁻¹b − 2bᵀv + vᵀVv:

bᵀV⁻¹b  ↔  σ⁻²bᵀXᵀXb + bᵀV_b⁻¹b    ⇒    V⁻¹ = σ⁻²XᵀX + V_b⁻¹

−2bᵀv  ↔  −σ⁻²yᵀXb − σ⁻²bᵀXᵀy − bᵀV_b⁻¹b₀ − b₀ᵀV_b⁻¹b    ⇒    v = σ⁻²Xᵀy + V_b⁻¹b₀
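A hedged numpy sketch of this conditional draw for b (the simulated data, the prior mean b₀ = 0, the prior precision V_b⁻¹ = 10⁻⁵ I, and the current σ² are assumptions for illustration):

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data and priors (assumptions for illustration)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([2.0, -0.5]) + rng.normal(0, 1, n)
b0 = np.zeros(2)                 # prior mean b_0
Vb_inv = np.eye(2) * 1e-5        # prior precision V_b^{-1}
sigma2 = 1.0                     # current value of sigma^2 in the Gibbs sweep

# V^{-1} = sigma^{-2} X^T X + V_b^{-1},   v = sigma^{-2} X^T y + V_b^{-1} b_0
V = np.linalg.inv(X.T @ X / sigma2 + Vb_inv)
v = X.T @ y / sigma2 + Vb_inv @ b0

b_draw = rng.multivariate_normal(V @ v, V)   # draw b from N(Vv, V)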

Page 20: MCMC: Gibbs Sampler

p(σ² | b, X, y) ∝ Nₙ(y | Xb, σ²I) IG(σ² | s₁, s₂)

Variance

Will have a posterior that is Inverse Gamma...

IG(σ² | u₁, u₂) ∝ (σ²)^(−u₁−1) exp[ −u₂ / σ² ]

∝ (1/σ²)^(n/2) exp[ −(1/(2σ²)) (y − Xb)ᵀ(y − Xb) ] × (σ²)^(−s₁−1) exp[ −s₂ / σ² ]

u₁ = s₁ + n/2

u₂ = s₂ + ½ (y − Xb)ᵀ(y − Xb)
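And the matching draw for σ² (again a sketch with assumed data, an assumed current value of b, and assumed prior shape/rate s₁, s₂):

import numpy as np

rng = np.random.default_rng(5)

# Hypothetical data, current b, and IG prior (assumptions for illustration)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([2.0, -0.5]) + rng.normal(0, 1, n)
b = np.array([2.0, -0.5])        # current value of b in the Gibbs sweep
s1, s2 = 0.1, 0.001              # IG prior shape and rate

resid = y - X @ b
u1 = s1 + n / 2.0                             # u1 = s1 + n/2
u2 = s2 + 0.5 * resid @ resid                 # u2 = s2 + (1/2)(y - Xb)^T (y - Xb)
sigma2_draw = 1.0 / rng.gamma(u1, 1.0 / u2)   # IG(u1, u2) via 1/Gamma(u1, scale = 1/u2)

Alternating this draw with the multivariate Normal draw for b from the previous slide gives the Gibbs sampler for the regression; the BUGS model on the next slide specifies the same model, with dnorm parameterized by the precision tau = 1/σ².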

Page 21: MCMC: Gibbs Sampler

BUGS implementation

model{
  ## prior distributions
  b[1] ~ dnorm(0, 1.0E-5)
  b[2] ~ dnorm(0, 1.0E-5)
  tau ~ dgamma(0.1, 0.001)

  ## likelihood
  for(i in 1:n){
    mu[i] <- b[1] + x[i] * b[2]
    y[i] ~ dnorm(mu[i], tau)
  }
}

Page 22: MCMC: Gibbs Sampler
Page 23: MCMC: Gibbs Sampler

[Figure: marginal posterior densities of b[1] and b[2] (P(b[1]) vs b[1]; P(b[2]) vs b[2])]
Page 24: MCMC: Gibbs Sampler

[Figure: scatterplot of posterior samples, b[2] vs b[1]]
Page 25: MCMC: Gibbs Sampler

[Figure: marginal posterior density of tau (P(tau) vs tau)]
Page 26: MCMC: Gibbs Sampler

[Figure: autocorrelation of the b[1] chain vs lag (0 to 100)]
Page 27: MCMC: Gibbs Sampler

Centering the data

Subtracting the mean of x from each covariate value reduces the posterior correlation between the intercept b[1] and the slope b[2], which improves mixing of the chain.

model{
  ## prior distributions
  b[1] ~ dnorm(0, 1.0E-5)
  b[2] ~ dnorm(0, 1.0E-5)
  tau ~ dgamma(0.1, 0.001)
  mX <- mean(x[])

  ## likelihood
  for(i in 1:n){
    mu[i] <- b[1] + (x[i] - mX) * b[2]
    y[i] ~ dnorm(mu[i], tau)
  }
}

Page 28: MCMC: Gibbs Sampler

[Figure: scatterplot of posterior samples, b[2] vs b[1], for the centered model]
Page 29: MCMC: Gibbs Sampler

Outline

● Different numerical techniques for sampling from the posterior
  – Rejection Sampling
  – Inverse Distribution Sampling
  – Markov Chain Monte Carlo (MCMC)
    ● Metropolis
    ● Metropolis-Hastings
    ● Gibbs sampling
● Sampling conditionals vs. full model
● Flexibility to specify complex models

Page 30: MCMC: Gibbs Sampler

Where things stand...

● We've now filled our “toolbox” with the basic tools that will enable us to explore more advanced topics
  – Probability rules & distributions
  – Data models & Likelihoods
  – Process models
  – Parameter models & Prior distributions
  – Analytical and Numerical Methods in ML and Bayes

Page 31: MCMC: Gibbs Sampler

General pattern - Bayes

● Write down likelihood
  – Data model
  – Process model
● Write down priors for the free parameters in the likelihood
  – Parameter model
● Solve for posterior distribution of model parameters
  – Analytical
  – Numerical

Page 32: MCMC: Gibbs Sampler

General Pattern - Maximum Likelihood

● Step 1: Write a likelihood function describing the likelihood of the observation

● Step 2: Find the value of the model parameter that maximized the likelihood– Calculate -lnL

– Analytical● Take derivative with respect to EACH parameter● Set = 0 , solve for parameter

– Numerical● Find minimum of -lnL surface

dLd

=0

Page 33: MCMC: Gibbs Sampler

Where we're going...

● Section 2 is the “meat” of the course:
  – Interval estimation
    ● Confidence and Credible intervals
    ● Predictive intervals
  – Model selection and hypothesis testing
  – GLMs
  – Nonlinear models
  – Hierarchical models and random effects
  – Measurement error and missing data

● Section 3 is more advanced applications

Page 34: MCMC: Gibbs Sampler