On stability and convergence of adaptive MC[MC]

Discussion by Christian P. Robert
Université Paris-Dauphine, IuF, and CREST
http://xianblog.wordpress.com

Adap’skiii, Park City, Utah, Jan. 03, 2011



The adaptive algorithm

Focussing on the scale matrix of an adaptive random-walk Metropolis-Hastings scheme is theoretically fascinating, especially when considering

removal of containment for LLN [intuitive but hard to prove]

potential link with renewal theory [in the fixed component version]

no lower bound on Σm

verifiable assumptions

[Roberts & Rosenthal, 2007]


The adaptive algorithm (2)

Limited applicability of the adaptive scheme when using a random-walk Metropolis-Hastings scheme because of

strong tail [or support] conditions

more generally, dependence on parameterisation

curse of dimensionality [ellip. symm. less & less appropriate]

unavailability of the performance target [e.g., acceptance of 0.234]

Congratulations and thanks for adding software to the theoretical development!


Banana benchmark

Twisted $N_p(0,\Sigma)$ target with $\Sigma = \mathrm{diag}(\sigma_1^2, 1, \ldots, 1)$, changing the second co-ordinate $x_2$ to $x_2 + b(x_1^2 - \sigma_1^2)$

[Figure: the twisted target in the $(x_1, x_2)$ plane, with $x_1$ from −40 to 40 and $x_2$ from −40 to 20; $p = 10$, $\sigma_1^2 = 100$, $b = 0.03$]

[Haario et al. 1999]
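
As a rough illustration of the benchmark, the twisted target can be written down and sampled in a few lines. This is a minimal sketch, assuming the twist acts as the unit-Jacobian map stated on the slide, so that the density at $x$ is the $N_p(0,\Sigma)$ density evaluated at the untwisted point. The function names `banana_logpdf` and `banana_sample` are hypothetical and do not come from the talk.

```python
import numpy as np

def banana_logpdf(x, b=0.03, sigma1_sq=100.0):
    """Log-density of the twisted N_p(0, Sigma) target; x has shape (..., p)."""
    x = np.asarray(x, dtype=float)
    p = x.shape[-1]
    x1 = x[..., 0]
    # Untwist: shift the second coordinate by b * (x1^2 - sigma1^2).
    # The map has unit Jacobian, so no extra term is needed.
    x2 = x[..., 1] + b * (x1**2 - sigma1_sq)
    rest = x[..., 2:]
    quad = x1**2 / sigma1_sq + x2**2 + np.sum(rest**2, axis=-1)
    log_norm = -0.5 * (p * np.log(2 * np.pi) + np.log(sigma1_sq))
    return log_norm - 0.5 * quad

def banana_sample(n, p=10, b=0.03, sigma1_sq=100.0, rng=None):
    """Exact draws: sample z ~ N_p(0, Sigma), then invert the untwisting map."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((n, p))
    z[:, 0] *= np.sqrt(sigma1_sq)
    z[:, 1] -= b * (z[:, 0]**2 - sigma1_sq)
    return z
```

Because exact draws are available, such a target is convenient for checking an adaptive sampler against a known answer.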


Banana benchmark (2)

Reparameterisation of the above within the unit hypercube by a logit transform

[Figure: the reparameterised target on the unit square (axes from 0 to 1, vertical axis y), with level labels from −100 to 0; $p = 2$, $\sigma_1^2 = 100$, $b = 0.01$]
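
The effect of such a reparameterisation on the density can be sketched as follows. The slide does not say how the transform is scaled, so this assumes a plain coordinate-wise map $y_i = 1/(1+e^{-x_i})$ with no rescaling. The helper names `logit` and `reparam_logpdf` are hypothetical.

```python
import numpy as np

def logit(y):
    """Coordinate-wise logit, mapping (0, 1) to the real line."""
    return np.log(y) - np.log1p(-y)

def reparam_logpdf(y, logpdf_x):
    """Log-density on the unit hypercube induced by y = 1 / (1 + exp(-x)).

    logpdf_x is the log-density of the original target on R^p.
    The change of variables adds -sum_i log(y_i * (1 - y_i)).
    """
    y = np.asarray(y, dtype=float)
    x = logit(y)
    log_jac = -np.sum(np.log(y) + np.log1p(-y), axis=-1)
    return logpdf_x(x) + log_jac
```

Any log-density on $\mathbb{R}^p$ that accepts an array of shape `(..., p)` can be passed as `logpdf_x`.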


Iterating importance sampling

Population Monte Carlo (PMC) offers a solution to the difficulty of picking the importance function $q$ through adaptivity:
Given a target $\pi$, PMC produces a sequence $q_t$ of importance functions ($t = 1, \ldots, T$) aimed at improving the approximation of $\pi$

[Cappe, Douc, Guillin, Marin & X., 2007]


Adaptive importance sampling

Use of mixture densities

$$q_t(x) = q(x;\alpha^t,\theta^t) = \sum_{d=1}^{D} \alpha_d^t\,\varphi(x;\theta_d^t)$$

[West, 1993]

where

$\alpha^t = (\alpha_1^t, \ldots, \alpha_D^t)$ is a vector of adaptable weights for the $D$ mixture components

$\theta^t = (\theta_1^t, \ldots, \theta_D^t)$ is a vector of parameters which specify the components

$\varphi$ is a parameterised density (usually taken to be multivariate Gaussian or Student-t)
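
To make the mixture proposal concrete, here is a minimal sketch of evaluating and sampling $q_t$ when $\varphi$ is taken to be multivariate Gaussian, so that $\theta_d^t = (\mu_d, \Sigma_d)$. The function names are hypothetical, and the Student-t case is not covered.

```python
import numpy as np

def gauss_logpdf(x, mean, cov):
    """Log-density of N_p(mean, cov) at each row of x (shape (N, p))."""
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, (x - mean).T)      # L^{-1} (x - mean), shape (p, N)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    p = x.shape[1]
    return -0.5 * (p * np.log(2 * np.pi) + log_det + np.sum(sol**2, axis=0))

def mixture_logpdf(x, alphas, means, covs):
    """log q(x) = log sum_d alpha_d * phi(x; mu_d, Sigma_d), via log-sum-exp."""
    comp = np.stack([np.log(a) + gauss_logpdf(x, m, c)
                     for a, m, c in zip(alphas, means, covs)])  # shape (D, N)
    top = comp.max(axis=0)
    return top + np.log(np.exp(comp - top).sum(axis=0))

def mixture_sample(n, alphas, means, covs, rng):
    """Draw n points: pick a component label, then sample from that Gaussian."""
    labels = rng.choice(len(alphas), size=n, p=alphas)
    p = len(means[0])
    x = np.empty((n, p))
    for d in range(len(alphas)):
        idx = np.flatnonzero(labels == d)
        chol = np.linalg.cholesky(covs[d])
        x[idx] = means[d] + rng.standard_normal((idx.size, p)) @ chol.T
    return x
```

Working on the log scale avoids underflow when a point lies far in the tails of every component.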


PMC updates

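Maximization of $L^t(\alpha, \theta)$ leads to closed-form solutions in exponential families (and for the t distributions).
For instance, for $N_p(\mu_d, \Sigma_d)$:

$$\alpha_d^{t+1} = \int \rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx,$$

$$\mu_d^{t+1} = \frac{\int x\,\rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}},$$

$$\Sigma_d^{t+1} = \frac{\int (x-\mu_d^{t+1})(x-\mu_d^{t+1})^{\mathsf T}\,\rho_d(x;\alpha^t,\mu^t,\Sigma^t)\,\pi(x)\,dx}{\alpha_d^{t+1}}.$$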
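
In practice these integrals are not available and would be replaced by estimates. The following sketch assumes that $\rho_d(x;\alpha,\mu,\Sigma) = \alpha_d\varphi(x;\mu_d,\Sigma_d) / \sum_l \alpha_l\varphi(x;\mu_l,\Sigma_l)$ (the slide does not define $\rho_d$) and that each integral against $\pi$ is approximated by self-normalised importance sampling with draws from $q_t$. The name `pmc_update` is hypothetical.

```python
import numpy as np

def gauss_logpdf(x, mean, cov):
    """Log-density of N_p(mean, cov) at each row of x (shape (N, p))."""
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, (x - mean).T)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    p = x.shape[1]
    return -0.5 * (p * np.log(2 * np.pi) + log_det + np.sum(sol**2, axis=0))

def pmc_update(x, log_target, alphas, means, covs):
    """One update of (alpha, mu, Sigma) from draws x ~ q_t (shape (N, p)).

    log_target may be unnormalised, since the weights are self-normalised.
    """
    # comp[d, i] = log(alpha_d * phi(x_i; mu_d, Sigma_d))
    comp = np.stack([np.log(a) + gauss_logpdf(x, m, c)
                     for a, m, c in zip(alphas, means, covs)])
    top = comp.max(axis=0)
    log_q = top + np.log(np.exp(comp - top).sum(axis=0))
    rho = np.exp(comp - log_q)               # rho_d(x_i); each column sums to 1
    log_w = log_target(x) - log_q            # importance weights pi / q_t
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    new_alphas = rho @ w                     # estimate of the alpha_d integral
    new_means, new_covs = [], []
    for d in range(len(alphas)):
        wd = w * rho[d] / new_alphas[d]      # per-component weights, sum to 1
        mu = wd @ x
        diff = x - mu
        new_means.append(mu)
        new_covs.append((diff * wd[:, None]).T @ diff)
    return new_alphas, np.array(new_means), np.array(new_covs)
```

Iterating this step, with fresh draws from the updated mixture each time, gives the sequence $q_t$. The sketch has no safeguard against a component whose new weight collapses to zero, which a practical implementation would need.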


Simulation


Comparison to MCMC

Adaptive MCMC: Proposal is a multivariate Gaussian with $\Sigma$ updated/based on previous values in the chain. Scale and update times chosen for optimal results.

[Kilbinger, Wraith et al., 2010]
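
For reference, an adaptive random-walk Metropolis sampler of this general kind can be sketched as below. The slide does not give the scale or the update times used in the comparison, so the $2.38^2/p$ scaling, the `adapt_every` schedule and the `eps` regularisation are all assumptions made for illustration, as is the function name `adaptive_rwm`.

```python
import numpy as np

def adaptive_rwm(log_target, x0, n_iter, adapt_every=100, eps=1e-6, rng=None):
    """Random-walk Metropolis whose Gaussian proposal covariance is refreshed
    from the empirical covariance of the chain so far."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    p = x.size
    scale = 2.38**2 / p                      # assumed scaling, not from the slide
    chol = np.linalg.cholesky(scale * np.eye(p))
    chain = np.empty((n_iter, p))
    lp = log_target(x)
    accepted = 0
    for i in range(n_iter):
        prop = x + chol @ rng.standard_normal(p)
        lp_prop = log_target(prop)
        # Symmetric proposal: accept with probability min(1, pi(prop) / pi(x)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        chain[i] = x
        if (i + 1) % adapt_every == 0:
            emp = np.atleast_2d(np.cov(chain[: i + 1].T))
            # eps keeps the matrix positive definite if the chain has not moved.
            chol = np.linalg.cholesky(scale * emp + eps * np.eye(p))
    return chain, accepted / n_iter
```

The returned acceptance rate can be compared with a performance target such as 0.234 when one is available.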

[Figure: evolution of $\pi(f_a)$ (top panels) and $\pi(f_b)$ (bottom panels) from 10k points to 100k points for both PMC (left panels) and MCMC (right panels).]
