
MCMC Methods in Harmonic Models


Page 1: MCMC Methods in Harmonic Models

MCMC Methods in Harmonic Models

Simon Godsill

Signal Processing Laboratory

Cambridge University Engineering Department

[email protected]

www-sigproc.eng.cam.ac.uk/~sjg

Page 2: MCMC Methods in Harmonic Models

Overview

• MCMC Methods
• Metropolis-Hastings and Gibbs Samplers
• Design Considerations
• Case Study: Gabor Regression Models

Page 3: MCMC Methods in Harmonic Models

MCMC Methods

MCMC methods are sophisticated and general methods for simulation from a complex probability distribution, say π(x) – x may be high dimensional, highly non-Gaussian, multimodal.

Given a set of samples x^{(1)}, …, x^{(N)} from π(x) we can compute Monte Carlo expectations for any quantities of interest by ergodic averages:

$$\mathbb{E}_\pi[f(x)] \approx \frac{1}{N}\sum_{i=1}^{N} f\big(x^{(i)}\big)$$

Page 4: MCMC Methods in Harmonic Models

MCMC Contd.

In a Bayesian setting π(x) will typically be the posterior distribution:

$$\pi(x) = p(x \mid y) \propto p(y \mid x)\, p(x)$$

• Underlying concept is to construct an irreducible, aperiodic Markov chain having π(x) as its stationary distribution and transition kernel K(dx′; x)

• Initialise the chain at an arbitrary state x^{(0)} (say, random) and simulate repeatedly from K(dx′; x) until convergence is achieved

• Convergence in distribution is guaranteed under mild conditions, easily verified for most models

Page 5: MCMC Methods in Harmonic Models

MCMC, contd.

Rates of convergence are hard to compute – there is a lot of theory, but it is not typically applicable in practice. However, many models, e.g. many harmonic modelling cases, can be proven to have geometric convergence rates.

Page 6: MCMC Methods in Harmonic Models

MCMC Algorithms

• MCMC schemes are constructed to satisfy the detailed balance condition

$$\pi(x)\,K(x \to x') = \pi(x')\,K(x' \to x)$$

• The most basic scheme satisfying detailed balance is the Metropolis-Hastings (M-H) method

• At each iteration of M-H, propose a move from the current state x with a proposal density q(x′|x). This proposal is accepted randomly with probability

$$\alpha(x, x') = \min\left\{1,\ \frac{\pi(x')\,q(x \mid x')}{\pi(x)\,q(x' \mid x)}\right\}$$

• Otherwise remain at x and go on to next iteration
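To make this concrete, here is a minimal sketch of a random-walk M-H sampler in Python (not from the original slides); the bimodal target, step size and iteration counts are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_pi(x):
    # Unnormalised log target: an equal-weight two-component Gaussian mixture
    return np.logaddexp(-0.5 * (x - 2.0)**2, -0.5 * (x + 2.0)**2)

def metropolis_hastings(n_iter=20000, step=1.0, x0=0.0):
    x = x0
    samples = np.empty(n_iter)
    for i in range(n_iter):
        x_prop = x + step * rng.standard_normal()  # symmetric proposal q(x'|x)
        # Symmetric q cancels in the ratio: alpha = min(1, pi(x')/pi(x))
        if np.log(rng.random()) < log_pi(x_prop) - log_pi(x):
            x = x_prop                             # accept the move
        samples[i] = x                             # otherwise remain at x
    return samples

samples = metropolis_hastings()
# Ergodic average approximating E_pi[f(x)], here with f(x) = x^2
print("E[x^2] approx:", np.mean(samples[2000:]**2))
```

Note that only the unnormalised target enters the acceptance ratio, so normalising constants are never needed.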

Page 7: MCMC Methods in Harmonic Models

Componentwise M-H

In most cases this won't be feasible, as x is high-dimensional -> low acceptance rates, poor convergence. Instead, split x into components:

$$x = (x_1, x_2, \ldots, x_N)$$

Then perform M-H on each component k = 1, …, N: propose

$$x_k' \sim q_k(x_k' \mid x)$$

Accept with probability

$$\alpha = \min\left\{1,\ \frac{\pi(x_k' \mid x_{-k})\,q_k(x_k \mid x')}{\pi(x_k \mid x_{-k})\,q_k(x_k' \mid x)}\right\}$$
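A sketch of one componentwise sweep on a correlated Gaussian toy target (again illustrative, not from the slides); because x_{-k} is held fixed, the ratio of joint densities equals the ratio of full conditionals:

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.linalg.inv(np.array([[1.0, 0.9], [0.9, 1.0]]))  # target precision matrix

def log_pi(x):
    return -0.5 * x @ P @ x   # unnormalised log target

x = np.zeros(2)
samples = np.empty((10000, 2))
for i in range(len(samples)):
    for k in range(2):                              # sweep over components
        x_prop = x.copy()
        x_prop[k] += 0.5 * rng.standard_normal()    # q_k: random walk on x_k
        # Joint ratio = conditional ratio, since the other component is fixed
        if np.log(rng.random()) < log_pi(x_prop) - log_pi(x):
            x = x_prop
    samples[i] = x
print("sample correlation:", np.corrcoef(samples.T)[0, 1])  # approx 0.9
```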

Page 8: MCMC Methods in Harmonic Models

Gibbs Sampler

Possibly the simplest form of MCMC – choose

$$q_k(x_k' \mid x) = \pi(x_k' \mid x_{-k})$$

(the `full conditional' distribution of x_k). The acceptance probability is then 1 – i.e. all moves are accepted.
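For comparison, a Gibbs sampler for a toy bivariate Gaussian with correlation ρ, where both full conditionals are Gaussian and are sampled exactly (acceptance probability 1):

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.9                      # correlation of a standard bivariate Gaussian
x1, x2 = 0.0, 0.0
samples = np.empty((10000, 2))
for i in range(len(samples)):
    # Each full conditional is Gaussian, so every draw is accepted
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))   # pi(x1 | x2)
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))   # pi(x2 | x1)
    samples[i] = x1, x2
print("sample correlation:", np.corrcoef(samples.T)[0, 1])  # approx 0.9
```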

Page 9: MCMC Methods in Harmonic Models

Other types of MCMC

• Reversible Jump MCMC – extension of M-H to cases where x can have varying dimension (e.g. in sparsity estimation) – see Green (1995) – Biometrika

• Perfect simulation – special MCMC schemes that achieve exact samples from π(x) – highly desirable, but slow and not yet practical for many cases

Page 10: MCMC Methods in Harmonic Models

Design Issues and Recommendations

A basic understanding of MCMC is relatively easy to acquire, but it is not so easy to construct effective and efficient samplers.

Some of the main considerations are:

• How to partition x into components (the components need not be the same size, and usually aren't)

• What algorithms to use – M-H, Gibbs, something else? In general Gibbs should only be used if the full conditionals are straightforward to sample from, e.g. Gaussian, gamma, etc.; otherwise use M-H.

Page 11: MCMC Methods in Harmonic Models

• (Blocking) – it's nearly always best to group large numbers of components of x into single partitions x_k, provided efficient M-H or Gibbs steps can be constructed for the partitions

• (Rao-Blackwellisation) – a related issue is marginalisation – it is better (in terms of estimator variance) to integrate out parameters analytically – again, subject to being able to construct efficient samplers on the remaining space:

$$\pi(x_1) = \int \pi(x_1, x_2)\, dx_2$$

Page 12: MCMC Methods in Harmonic Models

References for MCMC

• MCMC in Practice – Gilks et al. – Chapman and Hall (1996)

• Monte Carlo Statistical Methods – Robert and Casella – Springer (1999)

Page 13: MCMC Methods in Harmonic Models

MCMC Case study – Gabor Regression models

Now consider the design of a sampler for harmonic models. Full details are forthcoming as:

Wolfe, Godsill and Ng (2004) – Bayesian variable selection and regularisation for time-frequency surface estimation – Journal of the Royal Statistical Society (Series B – methodological)

(See also Wolfe and Godsill (NIPS 2002))

See http://www.eecs.harvard.edu/~patrick/research/

Page 14: MCMC Methods in Harmonic Models

Gabor Regression Models

Consider models of the form

$$y = Gc + e, \qquad e \sim N(0, \sigma_e^2 I)$$

• G is a matrix of Gabor atoms – here we choose an overcomplete dictionary with 2× redundancy

• We will seek sparse representations with time-frequency structure – encoded through prior distributions on the c_k's

• For the moment, consider the case of fixed, known σ_e² and σ_{c_k}²
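As a rough illustration of such a dictionary (a hypothetical construction, not necessarily the frames used in the paper), real Gabor atoms can be built from a shifted window modulated by cosines and sines; with a hop of half the window length this gives K = 2T atoms, i.e. 2× redundancy:

```python
import numpy as np

def gabor_dictionary(T=256, win_len=32, hop=16):
    """Columns are real Gabor atoms: a Hann window, circularly shifted in
    steps of `hop`, modulated by cosines and sines. With hop = win_len / 2
    this yields K = 2T atoms, i.e. a 2x overcomplete dictionary."""
    w = np.zeros(T)
    w[:win_len] = np.hanning(win_len)
    n = np.arange(T)
    atoms = []
    for start in range(0, T, hop):                 # circular time shifts
        ws = np.roll(w, start)
        for m in range(win_len // 2 + 1):          # frequency bins
            # DC and Nyquist bins have a cosine atom only
            phases = [np.cos] if m in (0, win_len // 2) else [np.cos, np.sin]
            for f in phases:
                atom = ws * f(2 * np.pi * m * (n - start) / win_len)
                atoms.append(atom / np.linalg.norm(atom))
    return np.column_stack(atoms)

G = gabor_dictionary()
print(G.shape)   # (256, 512): twice-redundant
```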

Page 15: MCMC Methods in Harmonic Models

Gabor regression models

The likelihood function is

$$p(y \mid c, \sigma_e^2) = N(y;\ Gc,\ \sigma_e^2 I)$$

The posterior probability density for c is…

Page 16: MCMC Methods in Harmonic Models

Posterior for c:

Under the Gaussian prior c ~ N(0, Σ_c), Σ_c = diag(σ_{c_k}²), the posterior is Gaussian:

$$c \mid y \sim N(\mu, \Sigma), \qquad \Sigma = \left(\frac{G^T G}{\sigma_e^2} + \Sigma_c^{-1}\right)^{-1}, \quad \mu = \frac{1}{\sigma_e^2}\,\Sigma\, G^T y$$

So, in fact, no MC is required for this case, since we have the full mean and covariance matrix for c.

[Conditioning on σ_e² and Σ_c implicit]
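A sketch of this conjugate computation in Python, assuming the Gaussian prior above; the random matrix G below is a stand-in for a real Gabor dictionary:

```python
import numpy as np

def posterior_c(G, y, sigma_e2, sigma_c2, rng):
    """Gaussian posterior for c under y = G c + e, e ~ N(0, sigma_e2 I) and
    prior c ~ N(0, diag(sigma_c2)). Returns the posterior mean and one draw."""
    K = G.shape[1]
    prec = G.T @ G / sigma_e2 + np.diag(1.0 / sigma_c2)  # posterior precision
    mean = np.linalg.solve(prec, G.T @ y / sigma_e2)
    # Exact draw c ~ N(mean, prec^{-1}) via the Cholesky factor of prec
    L = np.linalg.cholesky(prec)
    c = mean + np.linalg.solve(L.T, rng.standard_normal(K))
    return mean, c

rng = np.random.default_rng(3)
T, K = 64, 128
G = rng.standard_normal((T, K)) / np.sqrt(T)   # stand-in for the Gabor matrix
c_true = np.zeros(K)
c_true[rng.choice(K, 5, replace=False)] = 3.0
y = G @ c_true + 0.1 * rng.standard_normal(T)
mean, c_draw = posterior_c(G, y, 0.01, np.full(K, 1.0), rng)
```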

Page 17: MCMC Methods in Harmonic Models

Gibbs Sampler – blocking structures

However, for large Gabor models the matrix inversion will be very slow, and here we could look at reduced-dimension blocking structures:

$$c = (c_1, c_2, \ldots, c_K)$$

Then the Gibbs sampler would proceed as follows, for k = 1, …, K:

$$c_k \sim p(c_k \mid c_{-k}, y)$$

It's instructive to look at the form of this conditional pdf:

Page 18: MCMC Methods in Harmonic Models

Full conditional for c_k:

$$p(c_k \mid c_{-k}, y) \propto \exp\!\left(-\frac{\big\|(y - G_{-k} c_{-k}) - G_k c_k\big\|^2}{2\sigma_e^2}\right) p(c_k)$$

[G_k contains the columns of G corresponding to partition k, and G_{-k} the remaining columns.] The term (y − G_{-k} c_{-k}) is the residual error when c_k = 0. Note the relationship to Basis Pursuit residual terms.
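A hedged sketch of one such blocked Gibbs sweep under the conditionally Gaussian prior; the efficiency comes from maintaining the residual r = y − Gc incrementally, so that each block update only touches the columns G_k:

```python
import numpy as np

def gibbs_sweep_c(G, y, c, blocks, sigma_e2, sigma_c2, rng):
    """One Gibbs sweep over coefficient blocks. `blocks` is a list of column
    index arrays partitioning 1..K; sigma_c2 holds per-coefficient variances."""
    r = y - G @ c                            # full residual, kept up to date
    for idx in blocks:
        Gk = G[:, idx]
        r_k = r + Gk @ c[idx]                # residual error when c_k = 0
        prec = Gk.T @ Gk / sigma_e2 + np.diag(1.0 / sigma_c2[idx])
        cov = np.linalg.inv(prec)            # small: block size x block size
        mean = cov @ (Gk.T @ r_k) / sigma_e2
        c[idx] = rng.multivariate_normal(mean, cov)
        r = r_k - Gk @ c[idx]                # restore residual with new c_k
    return c
```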

Page 19: MCMC Methods in Harmonic Models

This form of Gibbs sampler can be very cheap computationally

The interest in this work is to extend the modelling capabilities provided by other algorithms – giving new forms of sparsity and structure. The extra steps are added in a modular fashion, retaining the conditionally Gaussian structure of the coefficients and the efficient implementation.

Page 20: MCMC Methods in Harmonic Models

Sampling σ_e²

First, we allow estimation of the noise floor by sampling σ_e², assuming an inverted-gamma (IG) prior p(σ_e²) = IG(σ_e²; α_e, β_e). Under this (conjugate) prior the full conditional takes the same form,

$$\sigma_e^2 \mid y, c \sim IG\!\left(\alpha_e + \frac{T}{2},\ \beta_e + \frac{\|y - Gc\|^2}{2}\right),$$

where T is the number of data samples; this is easily sampled by standard methods (e.g. MATLAB).
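An implementation sketch (numpy rather than MATLAB), assuming the shape/rate parameterisation IG(α_e, β_e): the precision 1/σ_e² has a gamma full conditional, so we draw a gamma variate and invert it:

```python
import numpy as np

def sample_sigma_e2(y, G, c, alpha_e, beta_e, rng):
    """Draw sigma_e^2 from its inverted-gamma full conditional
    IG(alpha_e + T/2, beta_e + ||y - G c||^2 / 2)."""
    resid = y - G @ c
    shape = alpha_e + 0.5 * len(y)
    rate = beta_e + 0.5 * resid @ resid
    return 1.0 / rng.gamma(shape, 1.0 / rate)   # numpy's gamma takes a scale
```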

Page 21: MCMC Methods in Harmonic Models

Sampling coefficient parameters

Next, place a structured prior distribution on the Gabor coefficients. First make them heavy-tailed to match real audio signals. This is done using Scale Mixtures of Normals (see Godsill and Rayner (IEEE Tr. Speech and Audio, 1998) for an audio restoration example).

Simply assign a prior to the variance of each c_k:

$$c_k \mid \sigma_{c_k}^2 \sim N(0, \sigma_{c_k}^2), \qquad \sigma_{c_k}^2 \sim p(\sigma_{c_k}^2)$$

This implies a non-Gaussian, heavy-tailed marginal distribution for c_k.

Page 22: MCMC Methods in Harmonic Models

Priors for c_k

The choice of p(σ_{c_k}²) determines the implied heavy-tailed distribution p(c_k).

In the simplest case, adopt the IG prior as this is conjugate. The implied p(c_k) is then Student's t-distributed.

The IG prior has Jeffreys and exponential limiting cases, so the family can encompass many of the sparseness-inducing cases.

Again, the IG prior is conjugate and leads to a simple Gibbs sampler step:

$$\sigma_{c_k}^2 \mid c_k \sim IG\!\left(\alpha_c + \frac{1}{2},\ \beta_c + \frac{c_k^2}{2}\right)$$
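The corresponding Gibbs step, vectorised over all coefficients (a sketch, with α_c, β_c the assumed IG hyperparameters):

```python
import numpy as np

def sample_sigma_c2(c, alpha_c, beta_c, rng):
    """Per-coefficient variance draws from the conjugate full conditional
    sigma_ck^2 | c_k ~ IG(alpha_c + 1/2, beta_c + c_k^2 / 2). Marginally,
    this scale mixture makes each c_k Student-t distributed."""
    shape = alpha_c + 0.5
    rate = beta_c + 0.5 * c**2                  # vectorised over k
    return 1.0 / rng.gamma(shape, 1.0 / rate)
```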

Page 23: MCMC Methods in Harmonic Models

Direct Sparsity Modelling

Other choices of p(σ_{c_k}²) lead to other heavy-tailed distributions, e.g. it is possible to get α-stable or Generalised Gaussian coefficients with other choices. In these cases M-H would be used to do the sampling, see e.g. Godsill and Kuruoglu (1999 – CUED Tech. Rep.).

A further addition that is easily incorporated into the MCMC is direct estimation of sparsity.

This is an important addition to the models and does not compromise the guaranteed convergence properties of the methods.

We can achieve this by allowing finite probability mass at zero in p(σ_{c_k}²):

Page 24: MCMC Methods in Harmonic Models

Direct Sparsity Modelling

Prior with a point mass at zero:

$$p(\sigma_{c_k}^2 \mid \gamma_k) = (1 - \gamma_k)\,\delta_0(\sigma_{c_k}^2) + \gamma_k\, p_+(\sigma_{c_k}^2),$$

where γ_k ∈ {0, 1} is a binary indicator variable specifying whether coefficient c_k is active or inactive, and p_+ is the continuous (e.g. IG) component.

Structure is introduced at this point, through priors on the time-frequency indicator field {γ_k}.

We use Markov chain or Markov random field priors to encourage continuity across time (tones), frequency (transients), or both.

The indicator field is also sampled using Gibbs sampling – details not given here – no time left…
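For illustration only, here is a Gibbs draw of a single indicator under a simplified independent Bernoulli(p) prior, with the coefficient integrated out analytically; the function and its arguments are hypothetical, and the structured Markov priors of the talk would replace the constant p by a probability depending on the neighbouring indicators:

```python
import numpy as np
from scipy.stats import multivariate_normal

def sample_indicator(r_k, g_k, sigma_e2, sigma_c2, p_active, rng):
    """Draw gamma_k given the residual r_k = y - G_{-k} c_{-k} and atom g_k.
    gamma_k = 0: r_k ~ N(0, sigma_e2 I);
    gamma_k = 1: c_k ~ N(0, sigma_c2), integrated out analytically."""
    T = len(r_k)
    cov0 = sigma_e2 * np.eye(T)
    cov1 = cov0 + sigma_c2 * np.outer(g_k, g_k)
    log_odds = (np.log(p_active) - np.log1p(-p_active)
                + multivariate_normal.logpdf(r_k, cov=cov1)
                - multivariate_normal.logpdf(r_k, cov=cov0))
    return rng.random() < 1.0 / (1.0 + np.exp(-log_odds))
```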

Page 25: MCMC Methods in Harmonic Models

Final Details

We also sample the hyperparameters of the priors above, requiring one Gibbs and one M-H step.

Page 26: MCMC Methods in Harmonic Models

Interpreting the MCMC output

Assume that the MCMC has converged and the initial `burn-in' has been deleted. Quantities of interest then follow as ergodic averages over the retained draws c^{(i)}, γ^{(i)}:

• Coefficient estimation: $\hat{c} = \frac{1}{N}\sum_{i=1}^{N} c^{(i)}$

• Noise reduction: $\hat{y} = G\hat{c}$

• Estimating the sparsity coefficients: $P(\gamma_k = 1 \mid y) \approx \frac{1}{N}\sum_{i=1}^{N} \gamma_k^{(i)}$

• How many coefficients are active? $\mathbb{E}\big[\#\{k : \gamma_k = 1\} \mid y\big] \approx \frac{1}{N}\sum_{i=1}^{N} \sum_k \gamma_k^{(i)}$
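These estimates are simple ergodic averages over the stored draws; a short sketch (the array shapes are assumptions):

```python
import numpy as np

def summarise(c_samples, gamma_samples, G, burn_in):
    """Posterior summaries from MCMC output.
    c_samples: (n_iter, K) coefficient draws; gamma_samples: (n_iter, K)."""
    c_post = c_samples[burn_in:]
    g_post = gamma_samples[burn_in:]
    c_hat = c_post.mean(axis=0)            # coefficient estimate (MMSE)
    y_hat = G @ c_hat                      # noise-reduced signal estimate
    p_active = g_post.mean(axis=0)         # P(gamma_k = 1 | y) per atom
    n_active = g_post.sum(axis=1).mean()   # expected number of active atoms
    return c_hat, y_hat, p_active, n_active
```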

Page 27: MCMC Methods in Harmonic Models

Results

Page 28: MCMC Methods in Harmonic Models

Results, contd.

Page 29: MCMC Methods in Harmonic Models

Typical output from the program

[Figures: convergence of the parameters; the noisy data; the final iteration; the MMSE estimate.]

See http://www.eecs.harvard.edu/~patrick/research/ for examples and Matlab code.

Page 30: MCMC Methods in Harmonic Models

Conclusion

Why use MCMC methods in harmonic models?

• Extends the range of computable models
• Guaranteed convergence (in the limit)
• Computations can be quite cheap
• Code would contain the same building blocks as EM, IRLS or basis pursuit for similar models – easy to modify to MCMC for a baseline comparison
• It's really not as complicated or slow as people think!!

Why not use MCMC methods?

• Can be computationally expensive
• Convergence diagnostics can be unreliable
• You may not want to explore new models

Page 31: MCMC Methods in Harmonic Models

References

• C. P. Robert and G. Casella, Monte Carlo Statistical Methods, New York: Springer-Verlag, 1999

• W. R. Gilks, S. Richardson and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, London: Chapman and Hall, 1996

• P. J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, 82(4), pp. 711–732, 1995

Page 32: MCMC Methods in Harmonic Models

Harmonic models and MCMC – SJG references – see www-sigproc.eng.cam.ac.uk/~sjg

P. J. Wolfe, S. J. Godsill and W. J. Ng. Bayesian variable selection and regularisation for time-frequency surface estimation. Journal of the Royal Statistical Society, Series B, 2004. Read paper (with discussion). To appear.

M. Davy and S. J. Godsill. Bayesian harmonic models for musical signal analysis (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, editors, Bayesian Statistics VII. Oxford University Press, 2003.

P. J. Wolfe and S. J. Godsill. Bayesian modelling of time-frequency coefficients for audio signal enhancement. In S. Becker, S. Thrun and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, Cambridge, MA: MIT Press, 2002.

S. J. Godsill and P. J. W. Rayner. Digital Audio Restoration: A Statistical Model-Based Approach. Berlin: Springer, ISBN 3-540-76222-1, September 1998.

S. J. Godsill and P. J. W. Rayner. Robust reconstruction and analysis of autoregressive signals in impulsive noise using the Gibbs sampler. IEEE Trans. on Speech and Audio Processing, 6(4):352–372, July 1998.

S. J. Godsill and E. E. Kuruoglu. Bayesian inference for time series with heavy-tailed symmetric alpha-stable noise processes. In Proc. Applications of Heavy Tailed Distributions in Economics, Engineering and Statistics, Washington DC, USA, June 1999. CUED Tech. Rep.