
This may be the author’s version of a work that was submitted/accepted for publication in the following source:

Drovandi, Chris & Pettitt, Tony (2011) Estimation of parameters for macroparasite population evolution using approximate Bayesian computation. Biometrics, 67 (1), pp. 225-233.

This file was downloaded from: https://eprints.qut.edu.au/39328/

© Consult author(s) regarding copyright matters

This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to [email protected]

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Submitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source.

https://doi.org/10.1111/j.1541-0420.2010.01410.x


Estimation of Parameters for Macroparasite Population Evolution using Approximate Bayesian Computation

C. C. Drovandi and A. N. Pettitt

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia

Abstract

We estimate the parameters of a stochastic process model for a macroparasite population within a host using approximate Bayesian computation (ABC). The immunity of the host is an unobserved model variable and only mature macroparasites at sacrifice of the host are counted. With very limited data, process rates are inferred reasonably precisely. Modelling involves a three variable Markov process for which the observed data likelihood is computationally intractable. ABC methods are particularly useful when the likelihood is analytically or computationally intractable. The ABC algorithm we present is based on sequential Monte Carlo, is adaptive in nature and overcomes some drawbacks of previous approaches to ABC. The algorithm is validated on a test example involving simulated data from an autologistic model before being used to infer parameters of the Markov process model for experimental data. The fitted model explains the observed extra-Binomial variation in terms of a zero-one immunity variable which has a short lived presence in the host.

Keywords: approximate Bayesian computation, sequential Monte Carlo, Markov process, macroparasite, autologistic model, inference.


1 Introduction

Analytically or computationally intractable likelihoods often occur in the genetics and biology literature. Applications involving populations in biology are often modelled by proposing a continuous time stochastic model. For example, Tanaka et al. (2006) consider a Markov process model for the transmission of tuberculosis that also takes into account mutation. Furthermore, Riley et al. (2003) develop a stochastic process to model the life cycle of a macroparasite population within a host. The likelihood function for such models is only computationally tractable for relatively small populations, forcing alternative approaches to be derived.

Recent developments provide a methodology known as approximate Bayesian computation (ABC) to estimate the posterior densities of unknown parameters when the likelihood is difficult to compute. Whilst the likelihood can be intractable, simulation from the model is assumed to be straightforward. The method requires some technique to compare simulated data with the observed data. This is commonly based on summary statistics, or more accurately, sufficient statistics if they are available.

The first ABC algorithm implemented was an acceptance based sampling algorithm pioneered by Pritchard et al. (1999), referred to henceforth as AS ABC. Other implementations of this method can be found in, for example, Beaumont et al. (2002) and Tanaka et al. (2006). In this approach, parameter values are simulated from the prior distribution and are accepted provided these values lead to simulated data close to the observed data. While resulting in independent samples from the approximate posterior, this algorithm unfortunately leads to a low acceptance rate. In an attempt to improve efficiency, Marjoram et al. (2003) extended this approach to work within the Markov chain Monte Carlo (MCMC) framework, with the intention that parameters could be proposed from a carefully chosen distribution, e.g. those based on a random walk. Whilst this method can improve acceptance rates, it still suffers from the usual drawbacks of MCMC.

Sequential Monte Carlo (SMC) techniques have been developed as an alternative to MCMC for static problems. SMC is an extension of importance sampling where a sequence of target distributions is defined that constructs a path between an initial distribution and a final distribution. A set of particles is sampled from the initial distribution and propagated through the sequence of targets using a combination of reweighting, resampling and mutation steps. The approach requires that the initial distribution is easily sampled from and that adjacent target distributions are in a sense close, allowing efficient importance distributions to be built. This produces a set of weighted particles as an approximation to the final target in the sequence. The algorithm in its full generality can be found in Del Moral et al. (2006).

Sisson et al. (2007) then modified this algorithm so that the likelihood function is not required. A natural choice for the sequence of targets in this setting, as specified in Sisson et al. (2007), is the approximate posterior distribution with the discrepancy threshold, between observed and simulated data, decreasing through the sequence. There are a number of advantages in using this approach over MCMC ABC, as pointed out in Sisson et al. (2007). These issues mostly surround the Markov chain in MCMC ABC, which will be discussed later. One disadvantage with this algorithm is that the number and values of the discrepancy thresholds in the sequence need to be chosen, i.e. some tuning is still required.

There are two motivations for this paper. One aim is to extend the approach of Sisson et al. (2007) to provide an algorithm which is robust to implement. The sequence of discrepancies is selected adaptively by the algorithm. We achieve this by dropping a certain proportion, α, of the particles at each iteration and selecting the next threshold as the largest discrepancy value of the particles that remain. The population is replenished by resampling from the ‘alive’ particles and moving them according to an MCMC kernel. We refer to this method as the SMC ABC replenishment algorithm. Secondly, we apply the algorithm to a Markov process model describing the life cycle of a macroparasite population, resulting in an approximate Bayesian approach for inference, which contrasts with the highly computationally intensive estimation method of Riley et al. (2003). This particular example involves the Brugia pahangi parasite. In the experiment of Denham et al. (1972), hosts (Felis catus in this case) that had never previously been exposed to the Brugia pahangi parasite were infected with L3 larvae (juvenile parasites). Over a period of time, these larvae may either develop into mature parasites or die. Also, mature parasites can die. It is assumed that a cat can develop an immunity to the larvae and hence inhibit the development (maturing) of parasites. The data (Denham et al., 1972) available for analysis consist of single infection data where cats are injected with approximately 100 or 200 juvenile larvae. Subsequently, each infected host is autopsied and the surviving matured parasites are counted and recorded. We also apply the methodology to simulated data from an autologistic model as a test example to validate the algorithm.

This paper is organised as follows. In section 2, we describe briefly various ABC algorithms that have been developed previously and then propose our extension. Section 3 describes the autologistic model, its sufficient statistics and the data used. The corresponding results from applying the SMC ABC replenishment algorithm to this application are also shown in this section. In section 4, the model of Riley et al. (2003) is reproduced. We also present an alternative simplified model based on a pseudo-equilibrium assumption. This section also describes the data available for analysis. We apply the new algorithm to this application and present the results in section 5. Concluding remarks and discussion are provided in section 6.

2 The Algorithm

2.1 Approximate Bayesian Computation

In ABC, the objective is to obtain samples from the joint approximate posterior distribution

f(θ, xs | ρ(x, xs) ≤ εT) ∝ f(xs|θ) π(θ) 1{ρ(x,xs) ≤ εT},

where θ is the unknown parameter, ρ is a discrepancy measure between summary statistics of the observed data, x, and the simulated data, xs, produced by the parameter, while εT is the target tolerance. Here 1{·} is the indicator function that is unity if the condition involving the discrepancy is satisfied. The approximate posterior for the parameter is obtained by marginalising over the simulated data

f(θ | ρ(x, xs) ≤ εT) ∝ ∫ f(xs|θ) π(θ) 1{ρ(x,xs) ≤ εT} dxs.

The first algorithm of this kind was AS ABC (see, for example, Beaumont et al. (2002)). In this approach parameter values are proposed from the prior distribution, θ ∼ π(·), data are simulated from the likelihood model, xs ∼ f(·|θ), and the proposed parameter values are accepted if ρ(x, xs) ≤ εT. Whilst this results in independent draws from the appropriate target, it is inefficient if the prior is uninformative relative to the observed data likelihood.
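As an illustration, the following is a minimal sketch of this acceptance sampling scheme, assuming generic prior_sample, simulate and discrepancy callables are supplied by the user; it is not code from the paper.

```python
import numpy as np

def as_abc(prior_sample, simulate, discrepancy, x_obs, eps_T, n_particles):
    """Acceptance sampling ABC: draw theta from the prior, simulate data and
    keep theta only if the discrepancy to the observed data is within eps_T."""
    accepted, attempts = [], 0
    while len(accepted) < n_particles:
        theta = prior_sample()          # theta ~ pi(.)
        x_sim = simulate(theta)         # x_s ~ f(.|theta)
        attempts += 1
        if discrepancy(x_obs, x_sim) <= eps_T:
            accepted.append(theta)
    return np.array(accepted), attempts
```

The returned attempts count makes the low acceptance rate of this scheme explicit when the prior is diffuse relative to the posterior.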

Marjoram et al. (2003) attempted to overcome this problem by developing an MCMC algorithm without likelihoods. In this approach, parameters are proposed based on an arbitrary distribution, θp ∼ q(·|θc), conditional on the current parameters, θc, and data are simulated based on the proposed parameters, xs ∼ f(·|θp). These parameters are accepted with probability, pacc, based on the Metropolis-Hastings ratio

pacc = min( 1, [π(θp) q(θc|θp)] / [π(θc) q(θp|θc)] · 1{ρ(x,xs) ≤ εT} ),     (1)

where the indicator function replaces the likelihood.
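A hedged sketch of a single likelihood-free Metropolis-Hastings update following (1) is given below; prior_pdf, propose, proposal_pdf, simulate and discrepancy are hypothetical user-supplied callables, not objects defined in the paper.

```python
import numpy as np

def mcmc_abc_step(theta_c, prior_pdf, propose, proposal_pdf, simulate,
                  discrepancy, x_obs, eps_T, rng=None):
    """One MCMC ABC update: the likelihood in the Metropolis-Hastings ratio
    is replaced by the indicator that freshly simulated data lie within eps_T."""
    rng = rng or np.random.default_rng()
    theta_p = propose(theta_c)                    # theta_p ~ q(.|theta_c)
    x_sim = simulate(theta_p)                     # x_s ~ f(.|theta_p)
    if discrepancy(x_obs, x_sim) > eps_T:
        return theta_c                            # indicator is zero: reject immediately
    ratio = (prior_pdf(theta_p) * proposal_pdf(theta_c, theta_p)) / \
            (prior_pdf(theta_c) * proposal_pdf(theta_p, theta_c))
    return theta_p if rng.uniform() < min(1.0, ratio) else theta_c
```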

The above MCMC approach also suffers from a number of drawbacks. There is the possibility of strong dependence in posterior samples resulting from the Markov chain and the chain may get trapped in low probability regions. Furthermore, a burn-in is required and Markov chain convergence has to be addressed. In order to eliminate the problems associated with MCMC, Sisson et al. (2007) extended the general SMC algorithm of Del Moral et al. (2006) so that the likelihood function is not required. In SMC without likelihoods the following sequence of distributions is defined

πt(θ, xs | ρ(x, xs) ≤ εt) ∝ f(xs|θ) π(θ) 1{ρ(x,xs) ≤ εt},

for t = 1, . . . , T with a non-increasing sequence of tolerances, ε1 ≥ ε2 ≥ · · · ≥ εT. Assuming the number of distributions and the sequence of target discrepancies have been specified, the algorithm of Sisson et al. (2007) is a combination of SMC (Del Moral et al., 2006) and partial rejection control (PRC) (Liu, 2001). Here we have N particles where {θ_t^i, W_t^i} is the ith weighted particle for target t, π0(θ) is the importance distribution for the first target (which is usually taken as the prior), Kt(·|·) is a Markov transition kernel and Lt(·|·) is an arbitrary backward Markov kernel; see Del Moral et al. (2006) for details.

2.2 The Replenishment SMC ABC Algorithm

We propose a number of changes to this basic algorithm in order to make practical implementation more robust, partially overcoming some of the difficulties of choosing the Markov kernels Kt(·|·) and Lt(·|·) which are included in both Del Moral et al. (2006) and Sisson et al. (2007). Since a forward kernel is adopted in Sisson et al. (2007), the following weighting step is necessary, assuming that a particle that satisfies the next target discrepancy has been generated from the forward kernel, θ_t^i ∼ Kt(·|θ_{t−1}^i),

W_t^i ∝ [π(θ_t^i) L_{t−1}(θ_{t−1}^i | θ_t^i)] / [π(θ_{t−1}^i) Kt(θ_t^i | θ_{t−1}^i)],

and thus involves a choice for the backwards kernel, Lt(·|·), which is essentially arbitrary but can have a substantial impact on sampler performance. Sisson et al. (2007) select the backward kernel to be equal to the forward kernel, which, although it results in a simplification in the reweighting formula, is a particularly poor choice since it results in bias, as shown in Beaumont et al. (2009). Sisson et al. (2009) and Beaumont et al. (2009) advocate a particle approximation to a closer to optimal choice of the backwards kernel, which results in the weighting formula

W_t^i ∝ π(θ_t^i) / Σ_{j=1}^{N} W_{t−1}^j Kt(θ_t^i | θ_{t−1}^j).

Whilst this reweighting step creates an O(N²) algorithm, the bias issue is now mitigated. An alternative approach to remove the bias, as developed here, is to use an MCMC kernel (see Chopin (2002) for example) rather than a forward kernel. This amounts to a selection of a backwards kernel given by equation (30) in Del Moral et al. (2006). In addition to removing the bias issue, the choice of an MCMC kernel results in an O(N) algorithm at the expense of generating potentially duplicated particles. We iterate the MCMC kernel numerous times in order to guarantee particle diversity with a probability close to one.

Furthermore, we propose that the discrepancies in the sequence of distributions are chosen adaptively, instead of being pre-specified. We achieve this by sorting the particle set via the discrepancy and dropping a percentage of the particles, say 100α%, with the largest distance (see lines 1.8 and 1.11 of algorithm 1 in Web Appendix A). The distance for the next target is then chosen to be the maximum distance of the particles that remain (line 1.11). The dropping of particles in our algorithm is theoretically based upon the incremental weights. Since we make use of an MCMC kernel, the incremental weights to get to the next target are given by

w_t^i = 1{ρ(x, x_s^i) ≤ εt} / 1{ρ(x, x_s^i) ≤ ε_{t−1}},

so that W_t^i ∝ w_t^i W_{t−1}^i and hence the particles that are ‘dropped’ are the ones with zero weight (i.e. the ones that do not satisfy the next tolerance).

Particles are resampled from the remaining particles in order to replenish the entire population and then, to ensure particle diversity, are moved according to an MCMC kernel targeting the next discrepancy (lines 1.12 to 1.20). The resampling and move steps at each iteration help ensure that the particle population does not degenerate. The algorithm completes when the maximum distance of the population is less than or equal to the target tolerance, εT. Alternatively, the stopping rule may be defined when the MCMC acceptance rate is unacceptably small. Therefore, with this algorithm, the intermediate target distances, D = {ε2, . . . , εT−1}, and the number of these targets, T − 2, are replaced by a single tuning parameter, α. Furthermore, ε1 may be determined based on a selected acceptance rate of the initial ABC acceptance part of the algorithm and εT could be based on the stopping rule above.
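The core of each iteration can be sketched as follows; mcmc_move stands for the ABC-MCMC kernel of lines 1.14 to 1.19 and is an assumed user-supplied function, so this is only an outline of Algorithm 1 in Web Appendix A, not a complete implementation.

```python
import numpy as np

def replenishment_iteration(thetas, rhos, alpha, mcmc_move, rng=None):
    """One iteration: drop the worst 100*alpha% of particles, set the next
    tolerance to the largest retained discrepancy, then refill the population
    by resampling survivors and moving the copies with an ABC-MCMC kernel."""
    rng = rng or np.random.default_rng()
    N = len(rhos)
    Na = int(alpha * N)                         # number of particles to drop
    order = np.argsort(rhos)                    # sort by discrepancy (line 1.8)
    thetas, rhos = thetas[order], rhos[order]
    eps_next = rhos[N - Na - 1]                 # largest retained discrepancy (line 1.11)
    sources = rng.integers(0, N - Na, size=Na)  # resample from the 'alive' particles
    for j, src in enumerate(sources):
        # move each duplicated particle with the MCMC kernel at tolerance eps_next
        thetas[N - Na + j], rhos[N - Na + j] = mcmc_move(thetas[src], rhos[src], eps_next)
    return thetas, rhos, eps_next
```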

Further improvements come from the following remarks about algorithm 1 in Web Appendix A. Firstly, the particle weights W^i = 1/N and so the effective sample size (ESS) measure is equal to N. This does not reflect that particles may have the same values. This becomes apparent as we are using an MCMC kernel and particle movement is not guaranteed. To be more confident that most of the resampled particles are moved, we may repeat lines 1.14 to 1.19 of algorithm 1 R times. The value of R for the next target can be determined dynamically based on the overall MCMC acceptance probability for the current target. Let the MCMC acceptance probability be pacc. Assuming independent binary trials, R is chosen such that there is a probability of 1 − c that the particle gets moved at least once:

R = log(c) / log(1 − pacc).


Therefore, assuming there are Na particles, we may expect that cNa of these are not moved after the resample. Here we used c = 0.01. Secondly, the MCMC move kernel requires a choice of proposal density, q(·|·). However, at the move point of the algorithm we already have a number of particles, (1 − α)N, that are distributed according to the desired target (i.e. the particles that are not dropped). Therefore, we can use these particles to help build efficient proposal densities, based on the idea of Chopin (2002). For example, sample means and covariances can be computed to build an independent proposal based on a multivariate Normal or t distribution. Alternatively, the covariance matrix (or a scaled version thereof) can be used to develop a multivariate Normal random walk proposal centred on the resampled particle values. These improvements to algorithm 1 are shown in algorithm 2 of Web Appendix A.
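The adaptive choices described here can be sketched as below, assuming the surviving particles are stored in a two-dimensional array; the function name is illustrative only.

```python
import numpy as np

def adapt_move_parameters(survivor_thetas, p_acc, c=0.01):
    """Build Gaussian proposal quantities from the (1 - alpha)N surviving
    particles and choose the number of MCMC repeats R so that a resampled
    particle is moved at least once with probability 1 - c."""
    mean = survivor_thetas.mean(axis=0)                  # for an independent proposal
    cov = np.cov(survivor_thetas, rowvar=False)          # for independent or random walk proposals
    R = int(np.ceil(np.log(c) / np.log(1.0 - p_acc)))    # assumes 0 < p_acc < 1
    return mean, cov, R
```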

Unfortunately, by the end of the algorithm some duplicated particles cannot be avoided. We choose to run the MCMC move step on every particle R times in order to diversify the sample representing the final target.

2.3 Validation of the Replenishment SMC ABC Algorithm

Our algorithm depends upon the existence of an MCMC kernel for the current target distribution. Proof of detailed balance of the MCMC kernel used is given in Marjoram et al. (2003). When an MCMC kernel is adopted in SMC the incremental weights are derived from basic importance sampling, where the current target is divided by the importance distribution. In the SMC setting, the importance distribution is the previous distribution in the sequence. These incremental weights can also be justified theoretically based on a particular choice of the backwards kernel as specified in Del Moral et al. (2006). This demonstrates the theoretical validity of our algorithm. We demonstrate the computational validity of the replenishment algorithm on a test example below.

3 Test Example

In the first application we apply the replenishment algorithm to simulated data from an autologistic model. This test example is chosen as sufficient statistics are available, making it suitable for any ABC approach. Furthermore, an algorithm already exists for computing the exact posterior (using MCMC, see Møller et al. (2006)) and we can compare the results between methods. The autologistic model is used to describe spatial variability in ecological modelling (see, for example, Wu and Huffer (1997) and Augustin et al. (1996)).

3.1 Model and Data

The 2D autologistic (or Ising) model is used to describe correlated binary data on an m × n rectangular lattice, xi,j ∈ {−1, 1} for i = 1, . . . , m and j = 1, . . . , n. The probability distribution of this model is given by

p(x | θ0, θ1) = (1 / Z(θ0, θ1)) exp{θ0 V0(x) + θ1 V1(x)},


where

V0(x) = Σ_{i=1}^{m} Σ_{j=1}^{n} xi,j,

V1(x) = Σ_{i=1}^{m−1} Σ_{j=1}^{n} xi,j xi+1,j + Σ_{i=1}^{m} Σ_{j=1}^{n−1} xi,j xi,j+1.

Here, V0(·) and V1(·) are sufficient statistics for the two parameters. The parameter θ0 represents the mean value of the lattice and θ1 represents the strength of correlation between neighbouring points on the lattice. Unfortunately, for relatively large lattices the normalising constant, Z(θ0, θ1), is intractable, rendering the likelihood infeasible to compute. However, simulating data from the model is computationally possible, rendering this type of application suitable for ABC. We use the algorithm of Propp and Wilson (1996) to produce exact samples from the model.

To investigate the replenishment algorithm’s ability to recover known parameter values, the data for analysis were simulated from the model based on a 60 × 60 lattice with θ0 = 0.2 and θ1 = 0.1. These parameters produce data with a tendency towards positive values and a small correlation between adjacent points.

3.2 Results

In this model, sufficient statistics are available and therefore the true posterior can be obtained in the limit as the target distance, εT, approaches zero. Hence this is a good application to validate our algorithm. As there are two sufficient statistics, the discrepancy can be computed as the norm of the difference between the weighted statistics of the true data, x, and simulated data, xs. It is not trivial to determine an appropriate weighting scheme as the variances of the sufficient statistics depend on the parameters. However, we note that when θ0 and θ1 are both zero the lattice consists of independent binary data with an equally likely response. In this instance, the variance of V1 is approximately twice the variance of V0. Therefore, we divide V1 by √2 and compute the tolerance using

ρ(x, xs) = || ( V0(x) − V0(xs), √0.5 (V1(x) − V1(xs)) ) ||.

Various norms could be selected; however, we choose the 2-norm. Summary statistics that are naturally more variable will have a more profound impact on the distance between observed and simulated data if they are not scaled. This places more weight on the parameters that are affected by these summary statistics. This scaling has more impact for larger tolerances, but inferences on the parameters are not sensitive to the scaling when the tolerance is low (i.e. at the target tolerance). In this example we found that fewer targets needed to be traversed when the scaling was introduced. Further computational gains could be achieved if the covariance between summary statistics was included and the tolerance evaluated using the Mahalanobis distance.
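For concreteness, the sufficient statistics and the scaled discrepancy can be computed as in the following sketch, where the lattice is assumed to be a NumPy array with entries in {-1, +1}.

```python
import numpy as np

def autologistic_stats(x):
    """Sufficient statistics: V0 is the sum over all sites, V1 the sum of
    products over horizontally and vertically adjacent pairs of sites."""
    V0 = x.sum()
    V1 = (x[:-1, :] * x[1:, :]).sum() + (x[:, :-1] * x[:, 1:]).sum()
    return V0, V1

def discrepancy(x_obs, x_sim):
    """Scaled 2-norm of the differences in the sufficient statistics, with the
    V1 difference down-weighted by sqrt(2)."""
    V0o, V1o = autologistic_stats(x_obs)
    V0s, V1s = autologistic_stats(x_sim)
    return np.sqrt((V0o - V0s) ** 2 + 0.5 * (V1o - V1s) ** 2)
```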

In the ABC algorithm, following a resample, an independent Metropolis-Hastings kernel was used to move the particles. This move was based on drawing, independently, from a bivariate Normal distribution based on sample quantities estimated from the particles satisfying the next tolerance constraint.

We used a Uniform(−1, 1) prior for θ0 and a Uniform(−0.4, 0.4) prior for θ1. The simulation algorithm was quite computationally expensive for highly correlated lattices, that is, |θ1| close to 0.4. This is one drawback of the ABC approach and we provide more details in the discussion. The results for this application presented below are based on N = 500 particles.

3.2.1 Choice of α

The choice of α, the proportion of particles to drop, represents a tradeoff between the number of targets required and the amount of mismatch between adjacent targets. Large values of α produce fewer targets, although unfortunately the MCMC proposal distribution makes use of fewer particles and is thus poorly estimated. Moreover, the number of MCMC moves to use for the next target is based on the acceptance rate of the current target. Small values of α are too conservative and the algorithm must traverse too many targets. We suggest that the optimal α value is problem specific, and could be investigated on a case-by-case basis. However, it is sensible to suggest that α = 0.5 may be a good choice across problems.

3.2.2 Posterior and Other Results

We used α = 0.5 to perform the posterior analysis. An initial and final tolerance of 2700 and 50 were used respectively. The starting tolerance resulted in an initial acceptance rate of about 50%. After the final tolerance was reached, 100 MCMC move steps were performed on every particle, which resulted in a total of 84895 simulations and 500 unique particles. At this point, the acceptance probability in the MCMC move step fell to about 9%. Initially the acceptance probability remained in the range of 50% to 60% until around a distance of 200 and started to fall thereafter.

Figure 1 shows the estimated posterior densities of the parameters. For θ0 a mean (standard deviation) of 0.181 (0.019) was obtained with (2.5%, 50%, 97.5%) quantiles of (0.148, 0.181, 0.216). The equivalent results for θ1 were 0.115 (0.013) and (0.089, 0.114, 0.141). In addition, the algorithm revealed a correlation of about −0.7 between the two parameters.

To determine the accuracy of these estimates we compared our results to the approach of Møller et al. (2006). This method could be considered the gold standard for this application. 500000 iterations of this algorithm produced posterior mean (standard deviation) and (2.5%, 50%, 97.5%) quantiles of 0.181 (0.019) and (0.147, 0.180, 0.218) for θ0. For θ1 the equivalent results were 0.114 (0.013) and (0.089, 0.114, 0.139). Again the correlation was about −0.7. Hence our results compare favourably.

To emphasise the poor computational efficiency of AS ABC, we compared SMC ABC with the AS ABC algorithm. The replenishment approach outperformed its acceptance counterpart by 25-fold, with the latter producing only 20 particles from 84895 simulations.

4 Markov Process Model and Data

We now present our main application involving a trivariate Markov process model describing a within-host macroparasite population (Riley et al., 2003); the parasite was Brugia pahangi and the hosts were cats (Felis catus).


[Figure 1 here: two density panels, "Posterior for θ0" and "Posterior for θ1".]

Figure 1: Posterior distributions for θ0 and θ1 of the autologistic model.

4.1 Data

The host autopsy times ranged between 26 and 1193 days after the initial macroparasite juvenile infection and then mature parasites were counted. As a simplification, we use data up to a sacrifice date of 400 days and include only those with approximately 100 juveniles (and for these we assume that there were exactly 100 juveniles). The subset of the full data we use is shown in Figure 4.

The high variability of the count of the mature parasites at autopsy means that the process cannot be accurately captured using a deterministic model, and this was the motivation for a stochastic process. However, other continuous approximations such as moment closure techniques allow the variability and higher order moments to be included in the model. Unfortunately, applying cumulant truncation at the third and fourth order to close the set of differential equations produced accurate approximations of the Markov process only in some regions of the parameter space (Riley et al., 2003).

4.2 Model of Riley et al. (2003)

The following stochastic model was developed by Riley et al. (2003) to help explain the population dynamics of Brugia pahangi. At time t any host is described by three random variables {M(t), L(t), I(t)}, where M(t) is the number of mature parasites, L(t) is the number of larvae and I(t) is a discrete version of the host’s experience of infection. Initially hosts are infected with LI larvae and after a certain time, tEND days, the cats are autopsied and the number of mature parasites is counted and recorded. Hence the initial conditions are M(0) = 0, L(0) = LI and I(0) = 0. It is assumed that larvae mature at a rate of γ per larva per day. Larvae die at a rate µL + βI(t) per larva, where µL represents the rate at which natural death of larvae occurs and β is a rate parameter that describes the death of larvae due to the immune response of the host. The acquisition of immunity is assumed to occur at a rate, νL(t), dependent only on the number of larvae, and a host loses immunity at a rate µI per unit of immunity. Mature parasites die at a rate of µM per adult per day. The parameters γ and µM have been previously estimated at 0.04 (Suswillo et al., 1982) and 0.0015 (Michael et al., 1997) respectively and we accept these as fixed. Two assumptions that Riley et al. (2003) applied in this application are that all cats are independent of each other and that the parameters of the model do not vary between hosts.

Assuming a continuous deterministic approximation is appropriate, the above model can be re-written as a set of interacting differential equations as presented below:

dL/dt = −µL L − βIL − γL,
dM/dt = γL − µM M,     (2)
dI/dt = νL − µI I,

with the initial conditions as before. However, values of M can be small so that this approach appears invalid. Additionally, Figure 4 suggests there is substantial variability in the values of M and this might be explained by a stochastic model version of (2). Accordingly, Riley et al. (2003) modelled the stochastic nature of this problem through a continuous time discrete trivariate Markov process, whose probability model is given in Web Appendix B.

A compromise can be made between the continuous but deterministic and stochastic yet discrete models if we apply a pseudo-equilibrium assumption for immunity. In the pseudo-equilibrium assumption we allow immunity to be continuous but assume that it is in steady state, i.e. assume dI/dt = 0. Applying this simplification we obtain I∗ = (ν/µI) L∗ and the new differential equation system is given by

dL∗/dt = −µL L∗ − η L∗² − γ L∗,
dM∗/dt = γ L∗ − µM M∗,     (3)

where η = βν/µI. This model written in its stochastic form as a bivariate Markov process can also be found in Web Appendix B.

4.3 Summary Statistics and Goodness of Fit

As sufficient statistics are not available in this case, summary statistics (or an alternative) need to be derived. We trialled various summary statistics that compared the simulated and observed data directly through metrics such as the mean absolute difference or the mean squared difference. Whilst we found these metrics produced parameter values that predicted the mean of the data well, they did not account for the variability.

As an alternative approach to help capture the variability, we made use of a goodness of fit statistic (see Riley et al. (2003) for an example). The statistic we use is a comparison of observed and expected counts for each possible mature count marginalised over the sacrifice time. The full development of this statistic is described in Web Appendix C.

4.4 Simulation

Unfortunately, only the final mature count for a single host is observed and the event times and types between initial injection and necropsy time are all censored. Moreover, the immune response variable I(t) is unbounded. Thus, an explicit likelihood function cannot be written down. However, as pointed out by Riley et al. (2003), the probability mass function for i units of immunity, l larvae and m mature parasites at time t for a particular parameter set θ can be estimated by recording the results of many realisations of the stochastic process. The number of realisations required to produce an accurate probability mass function causes the likelihood to be computationally intractable. An additional technique for computing the likelihood is based on theory regarding continuous time Markov processes. The details are provided in Web Appendix D. The approach requires the evaluation of the matrix exponential, but due to the large number of states in the Markov process this computation is too demanding.

As computation of likelihood values has been rendered intractable, we rely on the fact that simulation from the model is straightforward. Firstly, we note that the data can be simulated using the exact stochastic simulation algorithm of Gillespie (1977). This algorithm for the full model is as follows. Given the current values of the state variables for the ath host, I(t_a^i) = i, L(t_a^i) = l and M(t_a^i) = m, it can be shown that the time until the next event, t_a^{i+1} − t_a^i, has an exponential distribution with mean (γl + (µL + βi)l + µM m + νl + µI i)^{−1}. The event type chosen at t_a^{i+1} is simulated based on the probability of each event type.
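A minimal sketch of this exact simulation for a single host is given below, using the five transition rates of the full model (Web Appendix B); the fixed values γ = 0.04 and µM = 0.0015 would be passed in as arguments, and the function returns only the mature count at the autopsy time.

```python
import numpy as np

def simulate_host(gamma, mu_L, mu_M, nu, mu_I, beta, L_init, t_end, rng=None):
    """Gillespie simulation of the trivariate process {M(t), L(t), I(t)} for
    one host, started at M = 0, L = L_init, I = 0 and run until t_end."""
    rng = rng or np.random.default_rng()
    t, M, L, I = 0.0, 0, L_init, 0
    while True:
        rates = np.array([gamma * L,              # a larva matures
                          (mu_L + beta * I) * L,  # a larva dies
                          mu_M * M,               # a mature parasite dies
                          nu * L,                 # one unit of immunity is gained
                          mu_I * I])              # one unit of immunity is lost
        total = rates.sum()
        if total == 0.0:
            return M                              # no events possible: state is absorbed
        t += rng.exponential(1.0 / total)         # exponential waiting time to next event
        if t > t_end:
            return M                              # autopsy time reached before the next event
        event = rng.choice(5, p=rates / total)    # pick the event type by its probability
        if event == 0:
            L, M = L - 1, M + 1
        elif event == 1:
            L -= 1
        elif event == 2:
            M -= 1
        elif event == 3:
            I += 1
        else:
            I -= 1
```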

An alternative, but not exact, simulation method is the suite of tau-leaping approaches (see, for example, Gillespie (2001)). Wilkinson (2006) gives a detailed comparison of these algorithms. We use the exact simulation approach to eliminate any bias due to the simulation method.

5 Results

In this section, we present the results from applying the replenishment SMC ABC algorithm to the Markov process application.

5.1 Results for Markov Process Model

We applied the ABC SMC replenishment algorithm to the within-host macroparasite model specified above. We considered two models: (1) the model with a pseudo-equilibrium approximation, yielding two unknown parameters, η and µL, and (2) the full model with four unknown parameters, ν, µL, µI and β.

For the pseudo model, a Uniform square prior of width one was placed on the parameters. We found that this prior was relatively uninformative. For the MCMC step, a bivariate Gaussian random walk was used with a covariance matrix estimated based on the particle population.

In the full model, Uniform priors were again placed on each parameter: (0, 1) on ν and µL, and (0, 2) on µI and β. However, the parameter space of the full model was more difficult to explore with the MCMC scheme above, due to the increase in dimension and the skewness in some of the variables. As an alternative, we applied a log reparameterisation of ν∗ = −log(ν) and µL∗ = −log(µL). We then used a multivariate Normal random walk on the transformed space. Furthermore, we scaled down the variances with a single factor, 0 < b < 1; however, to ensure that the covariance matrix remained positive definite we monitored the eigenvalues. The value of b changed adaptively with the targets (we observed values of b between 0.5 and 0.8). Thus, the MCMC moves were more local and this also helped to improve the acceptance rates. The equivalent prior distributions on the reparameterised space for ν and µL are Exponential with unit mean.
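One possible form of this move, written as a sketch with hypothetical argument names, is shown below; the covariance matrix cov would be estimated from the surviving particles on the transformed scale and b is the adaptive scaling factor.

```python
import numpy as np

def reparameterised_rw_proposal(theta, cov, b, rng=None):
    """Random walk proposal for (nu, mu_L, mu_I, beta): work with
    nu* = -log(nu) and mu_L* = -log(mu_L), perturb with a scaled
    multivariate Normal step, then map back to the original scale."""
    rng = rng or np.random.default_rng()
    z = np.array([-np.log(theta[0]), -np.log(theta[1]), theta[2], theta[3]])
    z_new = rng.multivariate_normal(z, b * cov)
    return np.array([np.exp(-z_new[0]), np.exp(-z_new[1]), z_new[2], z_new[3]])
```

On the transformed scale this proposal is symmetric, so the Metropolis-Hastings ratio involves only the Exponential priors on ν∗ and µL∗ and the Uniform priors on µI and β.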

In both cases we used N = 1000 particles and found that dropping 50% of the particles worked well. The starting tolerance was chosen such that there was about a 70% acceptance rate in the initial acceptance sampling, which was around 125 for the full model and 131 for the pseudo model.


Table 1: Posterior summaries for the pseudo and full model. Shown are the posterior mean, standard deviation and the (2.5%, 50%, 97.5%) quantiles. † Estimates for these parameters have been multiplied by 100.

model    param   mean   std dev   (2.5%, 50%, 97.5%)
pseudo   η†      0.09   0.06      (0.005, 0.083, 0.23)
pseudo   µL†     6.82   2.49      (1.91, 6.75, 11.28)
full     ν†      0.28   0.10      (0.13, 0.26, 0.5)
full     µL†     1.39   1.11      (0.06, 1.11, 4.22)
full     µI      1.01   0.46      (0.16, 0.99, 1.90)
full     β       1.23   0.42      (0.43, 1.26, 1.94)

[Figure 2 here: two density panels, "Posterior for η in the Pseudo model" and "Posterior for µL in the Pseudo model".]

Figure 2: Posterior densities for the pseudo model.

The algorithms were stopped when the acceptance rate in the MCMC move step reached approximately 10%.

Table 1 and Figure 2 show the posterior summaries and densities of the parameters of the pseudo model respectively. Additionally, a correlation of roughly −0.88 between the two parameters was determined. Approximately 170000 simulations were required to produce these results. At this time, the MCMC acceptance rate fell to about 7% and the final tolerance of 49 was determined. To obtain an empirical estimate of the Freeman-Tukey statistic described in Web Appendix C, T, the simulation process of section 4.3 was repeated 1000 times at the point estimates of this model (the medians) and the T statistic calculated at each iteration. A mean of 50.8 was obtained with a 95% interval of (47.4, 54.5).

The 95% posterior predictive intervals based on these results are presented in Figure 4(a). It can be seen here that this model fails to describe much of the variability in the data.
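The paper does not spell out how these prediction intervals were constructed; a plausible sketch, assuming a simulate_mature_count(theta, t) function such as the Gillespie sampler above, is to redraw parameters from the final particle set and simulate a mature count for each autopsy time.

```python
import numpy as np

def posterior_predictive_intervals(particles, times, simulate_mature_count,
                                   n_draws=1000, rng=None):
    """For each autopsy time, resample parameter particles, simulate a mature
    count under each draw and report the 2.5% and 97.5% quantiles."""
    rng = rng or np.random.default_rng()
    lower, upper = [], []
    for t in times:
        idx = rng.integers(0, len(particles), size=n_draws)
        counts = np.array([simulate_mature_count(particles[i], t) for i in idx])
        lower.append(np.percentile(counts, 2.5))
        upper.append(np.percentile(counts, 97.5))
    return np.array(lower), np.array(upper)
```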

The posterior summaries and densities for the full model can be found in Table 1 and Figure 3 respectively and are unimodal. These posteriors show that µI and β are imprecisely estimated, at least in (0, 2); however, smaller values of β appear less plausible. The correlation matrix of the parameters was estimated as

corr(ν, µL, µI, β) =
  [  1.00  −0.31   0.43  −0.30
    −0.31   1.00  −0.03   0.08
     0.43  −0.03   1.00   0.46
    −0.30   0.08   0.46   1.00 ].

We found that this model was able to reduce the value of T, the discrepancy measure, further than that of the pseudo model. In about 200000 simulations a final tolerance of 32 was reached and the MCMC acceptance probability was reduced to around 10%.


[Figure 3 here: four density panels, "Posterior for ν in the Full model", "Posterior for µI in the Full model", "Posterior for µL in the Full model" and "Posterior for β in the Full model".]

Figure 3: Posterior densities for the full model.

More simulations were required for the full model since at each target more MCMC move steps were required, as the parameter space is more complex to traverse. In fact, only 15 intermediate targets were needed, as opposed to 18 in the pseudo model. An empirical estimate of the value of the T statistic was obtained in the same manner as for the pseudo model. In this case, the mean was 32.5 with a 95% interval of (29.3, 36.0).

It is easily observed in Figure 4(b) that the full model accounts for significantly more variability in the data, as confirmed by the value of T, 32.5.

The parameter estimates of Riley et al. (2003) for µL (0.0011), µI (0.31) and β (1.1) are all within the credible intervals shown in Table 1. However, their estimate for ν (0.00084) is below the 2.5% quantile. It should be noted that the estimates of Riley et al. (2003) are based on the full dataset, including data beyond 400 days and hosts with approximately 200 initial larvae. The most likely source of the discrepancy in the estimate of ν is the exclusion of data with L(0) = 200 initial juveniles. In particular we note that the term νL in the differential equation (2) depends on L. An additional reason why the parameter estimates may differ is that the goodness-of-fit statistics in Riley et al. (2003) and our paper are not the same.

Simulations of the process using the estimated values of the parameters indicate that immunity is seldom present in a simulated process at a level greater than one unit. The initial rate for the increase of immunity is νL(0), or about 0.28 units of immunity per day. Although this appears small, with a mean waiting time of about 3.6 days for the first unit of immunity, one unit of immunity increases the death rate of larvae from µL = 0.0139 to 1.247 per day per larva, an almost 90-fold increase in the effect from immunity. However, the one unit of immunity has a death rate of µI = 1.006 units per day, so that its presence in the host is short lived but leads to a substantially increased death rate of larvae. This leads to the variability in the data being substantially more than Binomial.


[Figure 4 here: two scatter panels of mature count against autopsy time, (a) "95% posterior prediction intervals under Pseudo model" and (b) "95% posterior prediction intervals under Full model".]

Figure 4: Numbers of mature parasites at autopsy time after initial larvae infection for each host. Also shown are posterior 95% prediction intervals for the pseudo (a) and full (b) models.

6 Discussion

In this paper we presented a new SMC algorithm for ABC that is able to determine the sequence of tolerances and the proposal distribution of the MCMC kernel adaptively. The method was validated on a test example based on simulated data from an autologistic model, for which sufficient statistics were available. In this case, very accurate estimates of the posterior means and variances were obtained when compared with the gold standard algorithm of Møller et al. (2006). The main purpose of the application to the macroparasite data was to demonstrate the utility of the method as we were able to obtain valid inferences. This was a challenging example involving a multivariate Markov process model that is commonly used in biological modelling.

Both of these examples have shown that SMC methods for ABC eliminate several issues associated with other approaches to ABC. For example, SMC methods remove the problems associated with the Markov chain in MCMC ABC. Furthermore, the SMC approach was found to be overwhelmingly more efficient than AS ABC in terms of the number of simulations required from the model. Additionally, others have shown (e.g. Sisson et al. (2007)) that the SMC approach outperforms MCMC ABC in terms of efficiency; however, this was not investigated here. Moreover, the replenishment algorithm developed here provides an improvement to Sisson et al. (2007) as the sequence of tolerances is determined adaptively and hence requires much less tuning. It may have computational advantages over Sisson et al. (2009) and Beaumont et al. (2009), which are still to be investigated.

A disadvantage of the ABC approach arises when the simulation procedure is computationally intensive for some regions of the parameter space. This can occur here for large rates in the Markov process and highly correlated lattices in the autologistic model.

An additional complication in the latter example involved the unavailability of sufficient statistics and therefore necessitated the development of summary statistics. It was difficult to find a set of summary statistics to accommodate the variability in the data. As an alternative, a goodness-of-fit statistic was developed. We found that it worked well for this case.

Unfortunately, in the above goodness-of-fit technique it was convenient to assume that the initial injection of juveniles was the same for all hosts, otherwise the computational burden becomes heavier. An alternative approach to goodness-of-fit based on indirect inference (Heggland and Frigessi, 2004) could allow the full dataset to be analysed. Here a model is proposed that fits the data well and provides estimates which are straightforward to find. Then, if the parameters of the ‘true’ model also lead to parameter values under the indirect/‘wrong’ model that are close to optimal (e.g. maximum likelihood estimates) they will be accepted. We plan to investigate such an approach in future research.

Supplementary Materials

See the end of this document.

Acknowledgements

The authors would like to thank Edwin Michael and David Denham for access to the data. The authors are also grateful to Rob Reeves, who developed the code to simulate from an autologistic model and produced the gold standard results in section 3.2.2. The authors also wish to thank Chris Glasbey for bringing the work of Riley et al. (2003) to their attention in the context of ABC and Steven Riley for his helpful communications on his work. The issues raised by an associate editor and two referees led to improvements in this paper.

References

Augustin, N., Mugglestone, M., and Buckland, S. (1996). An autologistic model for the spatial distribution of wildlife. Journal of Applied Ecology 33, 339–347.

Beaumont, M. A., Cornuet, J.-M., Marin, J.-M., and Robert, C. P. (2009). Adaptivity for ABC algorithms: the ABC-PMC scheme. To appear in Biometrika. URL http://arxiv.org/abs/0805.2256

Beaumont, M. A., Zhang, W., and Balding, D. J. (2002). Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035.

Chopin, N. (2002). A sequential particle filter method for static models. Biometrika 89, 539–551.

Del Moral, P., Doucet, A., and Jasra, A. (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B 68, 411–436.

Denham, D., Ponnudurai, T., Nelson, G., Guy, F., and Rogers, R. (1972). Studies with Brugia pahangi. I. Parasitological observations on primary infections of cats (Felis catus). International Journal for Parasitology 2, 239–247.

Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry 81, 2340–2361.

Gillespie, D. T. (2001). Approximate accelerated stochastic simulation of chemically reacting systems. Journal of Chemical Physics 115, 1716–1733.

Heggland, K. and Frigessi, A. (2004). Estimating functions in indirect inference. Journal of the Royal Statistical Society: Series B 66, 447–462.

Liu, J. S. (2001). Monte Carlo Strategies in Scientific Computing. New York: Springer.

Marjoram, P., Molitor, J., Plagnol, V., and Tavaré, S. (2003). Markov chain Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences of the United States of America 100, 15324–15328.

Michael, E., Grenfell, B., Isham, V., Denham, D., and Bundy, D. (1997). Modelling variability in lymphatic filariasis: macrofilarial dynamics in the Brugia pahangi cat model. Proceedings of the Royal Society of London: Series B 39, 151–156.

Møller, J., Pettitt, A. N., Reeves, R., and Berthelsen, K. K. (2006). An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika 93, 451–458.

Pritchard, J., Seielstad, M., Perez-Lezaun, A., and Feldman, M. (1999). Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Molecular Biology and Evolution 16, 1791–1798.

Propp, J. and Wilson, D. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms 9, 223–252.

Riley, S., Donnelly, C. L., and Ferguson, N. M. (2003). Robust parameter estimation techniques for stochastic within-host macroparasite models. Journal of Theoretical Biology 225, 419–430.

Sisson, S., Fan, Y., and Tanaka, M. (2007). Sequential Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences 104, 1760–1765.

Sisson, S., Fan, Y., and Tanaka, M. (2009). Correction for Sisson et al., Sequential Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences of the United States of America 106, 16889.

Suswillo, R., Denham, D., and McGreevy, P. (1982). The number and distribution of Brugia pahangi in cats at different times after primary infection. Acta Tropica 39, 151–156.

Tanaka, M., Francis, A., Luciani, F., and Sisson, S. (2006). Estimating tuberculosis transmission parameters from genotype data using approximate Bayesian computation. Genetics 173, 1511–1520.

Wilkinson, D. J. (2006). Stochastic Modelling for Systems Biology. Chapman and Hall/CRC.

Wu, H. and Huffer, F. (1997). Modelling the distribution of plant species using the autologistic regression model. Environmental and Ecological Statistics 4, 31–48.


Web-based Supplementary Materials for “Estimation of Parameters for Macroparasite Population Evolution using Approximate Bayesian Computation”

C. C. Drovandi

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia

email: [email protected]

and

A. N. Pettitt

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia

Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YW, United Kingdom

email: [email protected]


1. Web Appendix A

1.1   Na = integer part of αN
1.2   for i in 1 to N do
1.3       repeat
1.4           simulate θi ∼ π(·) then xs ∼ f(·|θi)
1.5           ρi = ρ(x, xs)
1.6       until ρi ≤ ε1
1.7   end
1.8   sort the particle set (θi, ρi) by ρi
1.9   compute max distance εMAX = ρN
1.10  while εMAX > εT do
1.11      drop the Na particles with largest ρ and compute ε for the next target, εNEXT = ρ(N−Na)
1.12      for j in 1 to Na do
1.13          resample θ(N−Na+j) from {θ1, . . . , θ(N−Na)}
1.14          propose move θ∗∗ ∼ q(·|θ(N−Na+j)) then simulate xs ∼ f(·|θ∗∗)
1.15          compute acceptance ratio MH = min(1, [π(θ∗∗) q(θ(N−Na+j)|θ∗∗)] / [π(θ(N−Na+j)) q(θ∗∗|θ(N−Na+j))] · 1{ρ(x,xs) ≤ εNEXT})
1.16          if u ∼ U(0, 1) < MH then
1.17              set θ(N−Na+j) = θ∗∗
1.18              set ρ(N−Na+j) = ρ(x, xs)
1.19          end
1.20      end
1.21      sort the particle set (θi, ρi) by ρi
1.22      compute max distance εMAX = ρN
1.23  end

Algorithm 1: The SMC ABC replenishment algorithm.


The SMC ABC replenishment algorithm is shown above as algorithm 1. Here N is the number of particles, α is the proportion of particles to drop at each iteration, π(·) is the prior and f(·|·) is the likelihood. Improvements to the above algorithm regarding the move step, as specified in the main text, are shown in the algorithm below. The improvements correspond to the adaptive MCMC proposal distribution and the repetition of the MCMC step. Here c is the theoretical probability that a resampled particle does not get moved.

2.1   compute parameters of the MCMC proposal q(·|·) using particles 1, . . . , N − Na
2.2   set the acceptance counter iacc = 0
2.3   for j in 1 to Na do
2.4       resample θ(N−Na+j) from {θ1, . . . , θ(N−Na)}
2.5       for k in 1 to R do
2.6           propose move θ∗∗ ∼ q(·|θ(N−Na+j)) then simulate xs ∼ f(·|θ∗∗)
2.7           compute acceptance ratio MH = min(1, [π(θ∗∗) q(θ(N−Na+j)|θ∗∗)] / [π(θ(N−Na+j)) q(θ∗∗|θ(N−Na+j))] · 1{ρ(x,xs) ≤ εNEXT})
2.8           if u ∼ U(0, 1) < MH then
2.9               set θ(N−Na+j) = θ∗∗
2.10              set ρ(N−Na+j) = ρ(x, xs)
2.11              set iacc = iacc + 1
2.12          end
2.13      end
2.14  end
2.15  compute the acceptance probability pacc = iacc/(R Na)
2.16  compute R as the ceiling of log(c)/log(1 − pacc)

Algorithm 2: The resampling and move step of the SMC ABC replenishment algorithm in more detail.


2. Web Appendix B

The probability equations for the full trivariate model are given by

P{M(t+h) = i+1, L(t+h) = j−1, I(t+h) = k | M(t) = i, L(t) = j, I(t) = k} = γjh + o(h),
P{M(t+h) = i, L(t+h) = j−1, I(t+h) = k | M(t) = i, L(t) = j, I(t) = k} = (µL + βk)jh + o(h),
P{M(t+h) = i−1, L(t+h) = j, I(t+h) = k | M(t) = i, L(t) = j, I(t) = k} = µM ih + o(h),      (1)
P{M(t+h) = i, L(t+h) = j, I(t+h) = k+1 | M(t) = i, L(t) = j, I(t) = k} = νjh + o(h),
P{M(t+h) = i, L(t+h) = j, I(t+h) = k−1 | M(t) = i, L(t) = j, I(t) = k} = µI kh + o(h).

Writing the pseudo model in its stochastic form as a bivariate Markov process yields

P{M∗(t+h) = i+1, L∗(t+h) = j−1 | M∗(t) = i, L∗(t) = j} = γjh + o(h),
P{M∗(t+h) = i, L∗(t+h) = j−1 | M∗(t) = i, L∗(t) = j} = (µL + ηj)jh + o(h),      (2)
P{M∗(t+h) = i−1, L∗(t+h) = j | M∗(t) = i, L∗(t) = j} = µM ih + o(h),

where in this model, L∗ and M∗ are still discrete and I∗ is continuous and can be calculated deterministically at any time based on the value of L∗.

3. Web Appendix C

To explain this technique, it is beneficial to summarise the data in terms of the unique sacrifice times, ti, and the number of observations per sacrifice time, ni, for i = 1, . . . , S. Here S is the number of unique sacrifice times and Σ_{i=1}^{S} ni is the total number of observations.

Here we set up a grid of cells for each possible mature count (0-100) and each unique sacrifice time (see Table 1). K (we used K = 100) simulations are produced from this model beginning with 100 juveniles and terminating at the final sacrifice time of the dataset. The mature counts are then recorded at each unique sacrifice time. The expected count in cell i, j is calculated as the empirical probability of obtaining a mature count of magnitude i at sacrifice time tj, pi,j, multiplied by the number of observations for sacrifice time tj, nj. Here 101 expected classes, ea, a = 0, . . . , 100, were obtained by summing over the rows in Table 1 to produce an empirical count of the number of mature parasites expected from the model. The observed classes, oa, a = 0, . . . , 100, were set up in a similar fashion. Here we have 101 summary statistics based on the number of data sets with observed counts, oa, a = 0, . . . , 100. These are compared with expected frequencies, ea, from the simulated data.

To compare the observed and expected counts a scaled Freeman-Tukey test statistic was used,

T = Σ_{a=0}^{100} (√ea − √oa)².

In this instance, the Freeman-Tukey test statistic is favoured as an alternative over others such as Pearson's, as it is more robust to small expected counts. The ABC algorithm then sets out to find parameter values where this statistic is small.
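A small sketch of this construction is given below; the simulated mature counts per sacrifice time are assumed to be available as a list of arrays, one per unique sacrifice time.

```python
import numpy as np

def expected_class_counts(sim_counts_by_time, n_obs_by_time, max_count=100):
    """e_a = sum_j p_{a,j} * n_j, where p_{a,j} is the empirical probability of
    mature count a at sacrifice time t_j over the K simulated processes."""
    e = np.zeros(max_count + 1)
    for sims, n_j in zip(sim_counts_by_time, n_obs_by_time):
        sims = np.asarray(sims)
        for a in range(max_count + 1):
            e[a] += np.mean(sims == a) * n_j
    return e

def freeman_tukey(e, o):
    """T = sum_a (sqrt(e_a) - sqrt(o_a))^2 over the 101 mature-count classes."""
    return float(np.sum((np.sqrt(e) - np.sqrt(o)) ** 2))
```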

We have marginalised over sacrifice times, which is satisfactory here as the model only accounts for mature parasite deaths. For alternative stochastic models there might be a need to retain the √ea − √oa quantities for each sacrifice time by number of surviving mature parasites, consequently giving a statistic which is the sum of (√ea − √oa)² terms over S × 101 cells.

[Table 1 about here.]

4. Web Appendix D

A technique for computing the likelihood of the Markov processes considered is based on theory regarding continuous time Markov processes. If Ph is a matrix of probabilities constructed from the expressions given in (1), the infinitesimal generator, G, for the Markov process is given by

G = lim_{h→0} (1/h)(Ph − I),

(Grimmett and Stirzaker, 2001, pp. 258). The transition probability matrix Γt is computed using the matrix exponential Γt = exp(tG) (Grimmett and Stirzaker, 2001, pp. 259). The matrix exponential arises from the solution of the Kolmogorov forward equations (Bailey, 1964, pp. 79). By denoting the initial distribution as δ, which is a point distribution that has all mass on the event I(0) = 0, L(0) = L_I^i, M(0) = 0, the joint probabilities at autopsy time for the ath host, ta, are computed by

P(I(ta) = i, L(ta) = j, M(ta) = k; θ) = δ exp(ta G).     (3)

However, the number of states in the generator matrix for the ith host is (L_I^i + 1)²(I_MAX + 1), and due to the large counts, the matrix exponential is computationally intensive and makes this form of the likelihood also intractable.
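For a continuous time Markov process with a small enough state space, the quantities above could be computed directly as in the following sketch, which uses SciPy's matrix exponential; for the present model the state space is far too large for this to be practical, which is the point made above.

```python
import numpy as np
from scipy.linalg import expm

def state_probabilities(G, t, initial_index):
    """Probabilities over all states at time t for a finite-state continuous
    time Markov process with generator G (rows summing to zero), started from
    a point mass on the state with index initial_index: delta * exp(t G)."""
    Gamma_t = expm(t * G)          # transition probability matrix at time t
    return Gamma_t[initial_index]  # row selected by the point-mass initial distribution
```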


References

Bailey, N. T. J. (1964). The Elements of Stochastic Processes: with Applications to the Natural Sciences. New York: Wiley.

Grimmett, G. and Stirzaker, D. (2001). Probability and Random Processes. New York: Oxford University Press, third edition.


Table 1: The grid required to compute the expected values under the model for the test statistic. Each row sum represents an expected count.

Mature            t1                 . . .            tS                  Totals
100     e1,100 = p1,100 × n1   . . .   eS,100 = pS,100 × nS   e100 = Σ_{i=1}^{S} ei,100
 .               .             . . .            .                      .
 .               .             . . .            .                      .
 0      e1,0 = p1,0 × n1       . . .   eS,0 = pS,0 × nS       e0 = Σ_{i=1}^{S} ei,0

(Columns correspond to the unique sacrifice times t1, . . . , tS.)