Tractable Nonparametric Bayesian Inference in Poisson
Processes with Gaussian Process Intensities
Ryan P. Adams, Iain Murray, and David J.C. MacKay
(ICML 2009)
Presented by Lihan He
ECE, Duke University
July 31, 2009
Outline
2/19
Introduction
The model
  Poisson distribution
  Poisson process
  Gaussian process
  Gaussian Cox process
  Generating data from Gaussian Cox process
Inference by MCMC
Experimental results
Conclusion
Introduction
3/19
Inhomogeneous Poisson process
A counting process
Rate of arrivals varies in time or space
Intensity function λ(s)
Astronomy, forestry, birth model, etc.
Introduction
4/19
How to model the intensity function λ(s)?
Using a Gaussian process
Nonparametric approach
Called a Gaussian Cox process
Difficulty: inference is intractable
Doubly-stochastic process
Some approximation methods in previous research
This paper: tractable inference
Introducing latent variables
MCMC inference – Metropolis-Hastings method
No approximation
Model: Poisson distribution
5/19
Discrete random variable X has p.m.f.

P(X = k) = (λ^k / k!) e^{−λ},  for k = 0, 1, 2, …

Number of event arrivals
Parameter λ
E[X] = λ
Conjugate prior: Gamma distribution
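As a quick sanity check on the p.m.f. above, it can be evaluated directly in Python (a minimal sketch written for this summary, not code from the paper; `poisson_pmf` is a name chosen here):

```python
import math

def poisson_pmf(k, lam):
    # P(X = k) = lam^k * exp(-lam) / k!
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 3.0
# Truncating the infinite sum at k = 100 is essentially exact for lam = 3.
total = sum(poisson_pmf(k, lam) for k in range(100))    # sums to ~1
mean = sum(k * poisson_pmf(k, lam) for k in range(100))  # E[X] is ~lam
```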
Model: Poisson process
6/19
The Poisson process is parameterized by an intensity function λ(s)
such that the random number of events within a subregion T
is Poisson distributed with parameter λ_T = ∫_T λ(s) ds:

P[N(T) = k] = (λ_T^k / k!) e^{−λ_T},  for k = 0, 1, 2, …

N(0) = 0
The numbers of events in disjoint subregions are independent
No events happen simultaneously
Likelihood function:

p({s_k}_{k=1:K} | λ(s)) = exp{−∫_T λ(s) ds} ∏_{k=1}^K λ(s_k)
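The defining property above gives a direct way to simulate a homogeneous process (constant λ) on an interval: draw the Poisson count N(T) ~ Poisson(λ·|T|), then place that many points uniformly. A sketch under those assumptions (function names are illustrative, not from the paper):

```python
import math
import random

def poisson_count(mu, rng):
    # Knuth's multiplication method for one Poisson(mu) draw.
    L, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_homogeneous_pp(lam, t_max, rng):
    # Count N(T) ~ Poisson(lam * t_max); given the count,
    # event locations are i.i.d. uniform on [0, t_max].
    n = poisson_count(lam * t_max, rng)
    return sorted(rng.uniform(0.0, t_max) for _ in range(n))

events = sample_homogeneous_pp(2.0, 10.0, random.Random(0))
```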
Model: Poisson process
7/19
[Figures: a one-dimensional temporal Poisson process and a two-dimensional spatial Poisson process]
Model: Gaussian Cox process
8/19
Using a Gaussian process prior for the intensity function λ(s):

λ(s) = λ* σ(g(s))

λ*: upper bound on λ(s)
σ: logistic (sigmoid) function
g(s): random scalar function, drawn from a Gaussian process prior

g ~ GP(m(·), C(·,·))
g_{1:N} ~ N(m(s_{1:N}), C(s_{1:N}, s_{1:N}))
Model: Gaussian process
9/19
Definition: Let g = (g(x_1), g(x_2), …, g(x_N)) be an N-dimensional vector of function values evaluated at N points x_{1:N}. P(g) is a Gaussian process if, for any finite subset {x_1, …, x_N}, the marginal distribution over that subset g has a multivariate Gaussian distribution:

g ~ GP(m(·), C(·,·))
[g(x_1), …, g(x_N)]^T ~ N(m(x_{1:N}), C(x_{1:N}, x_{1:N}))

Nonparametric prior (without parameterizing g, e.g. as g = w^T x)
Infinite-dimensional prior (the dimension N is flexible), but we only ever need to work with a finite-dimensional problem
Fully specified by the mean function m(·) and the covariance function C(·,·)
The mean function is usually defined to be zero
Example covariance function (squared exponential with constant and noise terms):

C(x_i, x_j; θ) = v_0 exp{ −(1/2) Σ_{m=1}^d l_m (x_{im} − x_{jm})^2 } + v_1 + v_2 δ(i, j)

Hyperparameters: θ = {v_0, v_1, v_2, l_{1:d}}
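The covariance function above can be assembled into a Gram matrix over a set of input points. A small illustrative sketch (function names and the default hyperparameter values are chosen here, not taken from the paper):

```python
import math

def cov(xi, xj, same_index, v0=1.0, v1=0.1, v2=0.01, lengths=(1.0,)):
    # Squared-exponential term with per-dimension scales l_m,
    # plus a constant term v1 and a noise term v2 on the diagonal.
    sq = sum(l * (a - b) ** 2 for l, a, b in zip(lengths, xi, xj))
    return v0 * math.exp(-0.5 * sq) + v1 + (v2 if same_index else 0.0)

def cov_matrix(points, **kw):
    # Gram matrix C[i][j] = C(x_i, x_j); symmetric by construction.
    return [[cov(p, q, i == j, **kw) for j, q in enumerate(points)]
            for i, p in enumerate(points)]
```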
Model: Generating data from Gaussian Cox process
Objective: generate a set of events {s_k}_{k=1:K} on some subregion T
drawn from a Poisson process with intensity function λ(s)
10/19
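The generative procedure amounts to thinning: sample a homogeneous Poisson process at the upper-bound rate λ*, then keep each candidate point s with probability σ(g(s)), so the accepted points follow intensity λ* σ(g(s)). A sketch that uses a fixed test function g in place of an actual GP draw (names are illustrative, not the paper's code):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def generate_sgcp(lam_star, t_max, g, rng):
    # Step 1: homogeneous Poisson process at the upper-bound rate lam_star.
    # Count via Knuth's multiplication method, locations uniform.
    L, n, p = math.exp(-lam_star * t_max), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            break
        n += 1
    candidates = [rng.uniform(0.0, t_max) for _ in range(n)]
    # Step 2: thin each candidate s, keeping it with probability
    # sigmoid(g(s)); rejected points are the "thinned" events.
    return [s for s in candidates if rng.random() < sigmoid(g(s))]
```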
11/19
Inference
Poisson process likelihood function
Given a set of K events {s_k}_{k=1:K} on some subregion T as observed data, what is
the posterior distribution over λ(s)?
Posterior
12/19
Inference
Augment the posterior distribution by introducing latent variables to make the
MCMC-based inference tractable.
Observed data: K event locations {s_k}_{k=1:K}
Introduced latent variables:
1. Total number of thinned events M
2. Locations of thinned events
3. Values of the function g(s) at the thinned events
4. Values of the function g(s) at the observed events
Complete likelihood (from the paper):

p({s_k}_{1:K}, M, {s̃_m}_{1:M}, g | λ*) = (λ*)^{K+M} exp{−λ* μ(T)} ∏_{k=1}^K σ(g(s_k)) ∏_{m=1}^M σ(−g(s̃_m)) × GP(g)
13/19
Inference
MCMC inference: sample the latent variables (M, thinned event locations, g) and λ*
Sample M and the thinned event locations: Metropolis-Hastings method
Metropolis-Hastings method: draw a new sample x_{t+1} based on the last sample x_t and a proposal distribution q(x'; x_t)
1. Sample x' from the proposal q(x'; x_t)
2. Compute the acceptance ratio

   a = [p(x') q(x_t; x')] / [p(x_t) q(x'; x_t)]

3. Sample r ~ U(0, 1)
4. If r < a, accept x' as the new sample, i.e., x_{t+1} = x'; otherwise reject x' and let x_{t+1} = x_t.
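The four Metropolis-Hastings steps above can be sketched for the common special case of a symmetric random-walk proposal, where the q terms cancel and a = p(x')/p(x_t) (an illustrative sketch, not the paper's sampler):

```python
import math
import random

def metropolis_hastings(log_p, x0, n_steps, step, rng):
    # Random-walk MH: propose x' ~ N(x_t, step^2). The Gaussian proposal
    # is symmetric, so the acceptance ratio reduces to p(x') / p(x_t).
    x, samples = x0, []
    for _ in range(n_steps):
        x_prop = x + rng.gauss(0.0, step)
        log_a = log_p(x_prop) - log_p(x)   # log acceptance ratio
        if rng.random() < math.exp(min(0.0, log_a)):
            x = x_prop                      # accept
        samples.append(x)                   # otherwise keep x_t
    return samples

# Demo: target a standard normal via its unnormalized log-density.
samples = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000, 1.0,
                              random.Random(0))
```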
14/19
Inference
Sample M: Metropolis-Hastings method
Proposal distribution for inserting one thinned event
Proposal distribution for deleting one thinned event
Acceptance ratio for deleting one thinned event
Acceptance ratio for inserting one thinned event
15/19
Inference
Sample the thinned event locations: Metropolis-Hastings method
Acceptance ratio for sampling a thinned event
Sample g_{M+K}: Hamiltonian Monte Carlo method (Duane et al., 1987)
Sample λ*: place a Gamma prior on λ*
With the conjugate prior, the posterior can be derived analytically.
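Assuming a Gamma(α, β) shape/rate prior, the conjugate update can be sketched as a single Gibbs step: the complete likelihood contributes a (λ*)^{K+M} e^{−λ*·μ(T)} factor, giving a Gamma(α + K + M, β + μ(T)) posterior (names and parameterization chosen here for illustration):

```python
import random

def sample_lambda_star(alpha, beta, K, M, vol_T, rng):
    # Gamma(alpha, beta) prior (shape/rate) times (lam*)^(K+M) * exp(-lam* * vol_T)
    # yields the conjugate posterior Gamma(alpha + K + M, beta + vol_T).
    shape = alpha + K + M
    rate = beta + vol_T
    # random.gammavariate takes shape and *scale* (= 1 / rate).
    return rng.gammavariate(shape, 1.0 / rate)
```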
16/19
Experimental results
Synthetic data
Three synthetic datasets: 53 events, 29 events, and 235 events
Experimental results
Coal mining disaster data
191 coal-mining explosions in Britain from 1875 to 1962
17/19
18/19
Experimental results
Redwoods data
195 redwood locations
Conclusion
19/19
Proposed a novel method of inference for the Gaussian Cox process that avoids the intractability of such models;
Uses a generative prior that allows exact Poisson data to be generated from a random intensity function drawn from a transformed Gaussian process;
Uses MCMC to infer the posterior distribution of the intensity function;
Compared with other methods, achieves better results;
Has significant computational demands: infeasible for data sets with more than several thousand events.