ORIGINAL PAPER

A probabilistic approach to exposure risk assessment

Crispin M. Mutshinda · Imoh Antai · Robert B. O'Hara

C. M. Mutshinda (corresponding author) and R. B. O'Hara: Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, Gustaf Hallstromin katu 2b, Helsinki 00014, Finland. e-mail: [email protected]
I. Antai: Department of Marketing, Supply Chain Management and Corporate Geography, Swedish School of Economics and Business Administration, Hanken, P.O. Box 479, 00101 Helsinki, Finland

Published online: 20 April 2007
© Springer-Verlag 2007
Stoch Environ Res Risk Assess (2008) 22:441–449. DOI 10.1007/s00477-007-0143-0
Abstract  The introduction of hazardous substances into the environment has long been recognized as a cause of several diseases in humans, wildlife, and plants. The damaging character of suspected contaminants is usually assessed via a "reject/retain" design with no explicit link between levels of exposure and intensities of the potential adverse health effects, even though this connection may be important for the development of public health regulations that limit exposure to hazardous substances. Here, we propose a probabilistic approach to exposure risk assessment as a way around this typical flaw. We develop a Bayesian model using proximity to the source of an alleged contaminant as a surrogate for exposure. Subsequently, we carry out an experimental study based on simulated data to illustrate the model implementation with real world data. We also discuss a possible way of extending the model to accommodate potential heterogeneity in the spatial distribution of the focal disease.
Keywords  Environmental hazards · Inverse square law · Environmental risk assessment · Spatial heterogeneity
1 Introduction
The introduction of hazardous substances into the environment has long been recognized as a potential cause of several diseases in humans, wildlife, and plants. In many cases, including contact with air pollutants (e.g., Hood 2003; Wilhelm and Ritz 2003), chemical spills (e.g., Wilhelm and Ritz 2005), and exposure to electromagnetic radiation from stationary sources such as overhead power lines (e.g., Wertheimer and Leeper 1979; Feychting and Ahlbom 1993; Olsen et al. 1993; Theriault and Li 1997), the adverse health effects are potentially enhanced by the proximity of susceptible individuals to a source of the contaminant and by the duration of the contact. Thus, given a critical exposure time, distance to the source provides a sensible proxy for exposure, consistent with the inverse square law, which relates the concentration of a contaminant inversely to the square of the distance from the source (e.g., Bushong 1993).

The International Commission on Radiological Protection (ICRP 1991) distinguishes between occupational, medical, and public exposures: occupational exposure occurs at workplaces, primarily as a result of work; medical exposures are those incurred during medical diagnosis, screening, or treatment, usually from medical equipment; public exposure includes all exposures other than medical or occupational. Because of the numerous variables involved in dealing with medical and occupational exposures, the scope of the methodology presented in this paper is limited to public exposure. We refer to exposure risk as the probability of disease occurrence in response to environmental contamination, whereas the process by which the potential adverse effects of exposure are assessed and characterized is known as exposure risk assessment. Bates et al. (2003),
amongst others, delineate four major activities involved in the exposure risk assessment process: (1) hazard identification identifies the potential hazards and the alleged health defects; (2) exposure assessment identifies the populations that could be exposed and the possible pathways; (3) dose–response assessment quantifies the relationship between levels of exposure and levels of potential adverse effects; and (4) risk characterization combines the results of the previous three steps to determine some outcome of interest.
In practice, the damaging character of suspected contaminants is usually assessed via an all-or-nothing "reject/retain" design with no explicit link between levels of exposure and intensities of the adverse health effects, even though this connection may be important for the development of public health regulations that limit exposure to hazardous substances. In this paper we propose a probabilistic approach to exposure risk assessment as a way around this typical shortcoming. We develop a Bayesian model using proximity to the source of an alleged contaminant as a surrogate for exposure. Subsequently, we conduct an experimental study based on simulated data to illustrate how the model can be implemented with real world data, whose acquisition is highly involved owing to numerous constraints. Although the proposed model is essentially a log-linear model, a class that has been extensively studied in the statistical literature (e.g., McCullagh and Nelder 1989; Lindsey 1997; Dobson 2002), attempts to tackle exposure risk from a Bayesian perspective remain atypical, despite the convoluted nature of environmental pollution, which makes this approach particularly promising. We devote the next section to the basics of Bayesian inference and Markov chain Monte Carlo (MCMC) methods; we refer readers interested in more details to the appropriate literature, such as Robert (2001) and Gelman et al. (2003).
2 Theoretical background
Bayesian inference is an approach to statistical inference in which all forms of uncertainty are expressed in terms of probabilities. A Bayesian analysis starts with the formulation of a model, p(y|θ), that is assumed to describe the data conditionally on the unknown parameter of interest, θ ∈ Θ. Subsequently, a prior distribution, p(θ), intended to embody the analyst's state of knowledge about the plausible parameter values before seeing the data, is formulated. Finally, as data become available, the prior distribution is updated to the posterior distribution, p(θ|y), by means of Bayes' theorem:

p(θ|y) = p(θ) p(y|θ) / ∫_Θ p(θ) p(y|θ) dθ ∝ p(θ) p(y|θ)   (1)
The posterior distribution accounts for both the prior uncertainty about the parameters and the variability in the data, and provides a legitimate tool for inferring model parameters and outcomes of future observations. This is all done using probabilistic statements that are intuitive, in contrast to classical tools such as p values and conventional confidence intervals, which are often erroneously interpreted.
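To make Eq. 1 concrete, here is a minimal sketch in R (our own illustrative example, not part of the original analysis): for a one-dimensional parameter, the posterior can be approximated by evaluating prior times likelihood on a grid and normalizing. The Poisson likelihood, the Exponential(0.1) prior, and the data vector are all hypothetical choices.

y <- c(3, 5, 4, 6, 2)                    # hypothetical count data
theta <- seq(0.01, 15, by = 0.01)        # grid over the parameter space
prior <- dexp(theta, rate = 0.1)         # a vague prior density p(theta)
lik <- sapply(theta, function(t) prod(dpois(y, lambda = t)))   # p(y|theta)
post <- prior * lik / sum(prior * lik * 0.01)  # Eq. 1: normalize prior x likelihood
theta[which.max(post)]                   # posterior mode, close to mean(y)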
Nevertheless, two notes of caution should be kept in mind when applying a Bayesian analysis. First, Bayesian priors are subjective in the sense that two analysts faced with the same problem may have different states of knowledge about the phenomenon of interest, and may consequently start from significantly different priors, which can result in noticeably different conclusions. Most frequentist statisticians view this dependence on the prior specification as conferring an arbitrary character on Bayesian inference, whereas Bayesians consider the prospect of combining available knowledge with the information in the data a real advantage that makes the approach a learning process, provided any prior input is duly motivated. A prior distribution can be based on information available in the literature, expert opinion, or information from any other relevant source. On the other hand, a lack of prior knowledge leads to the use of so-called non-informative, "flat", or "vague" priors, for example a uniform distribution on some large compact region or a centred Gaussian with large variance. A detailed account of prior specification is given by Spiegelhalter et al. (1999) and Gelman (2002).

Second, the posterior density is generally not available in closed form, since the computation of the normalizing constant in Bayes' theorem usually involves a high-dimensional integration with no analytic solution, and therefore requires some form of numerical approximation. The current prominence of Bayesian applications in practically all quantitative sciences is undeniably a consequence of the development of Markov chain Monte Carlo (MCMC) techniques (e.g., Gelfand and Smith 1990; Casella and George 1992; Casella and Robert 1999; Gelman et al. 2003), which enable direct sampling from distributions with complex algebraic forms. Although MCMC has been responsible for the revival of Bayesian applications, its utility is not restricted to Bayesian inference. It should also be noted that computational constraints are not exclusive to the Bayesian approach; the optimization problems involved in classical maximum likelihood estimation may in some instances be computationally intensive, whereas a Bayesian analysis based on so-called conjugate priors, where the posterior has the same algebraic form as the prior (e.g., a Binomial likelihood with a Beta prior, or a Poisson likelihood with a Gamma prior), is always straightforward. All members of the exponential family have conjugate priors (e.g., Gelman et al. 2003). The motivation for using conjugate priors remains essentially computational convenience; indeed, non-conjugate priors may be preferable in many circumstances, and MCMC methods provide a potential key to the computational hurdles. A minimal conjugate update is sketched below.
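As a sketch of such a conjugate update (with made-up numbers, not from the paper): a Beta(a, b) prior combined with k successes in n Bernoulli trials yields a Beta(a + k, b + n − k) posterior, so no numerical integration is needed.

a <- 2; b <- 2                     # Beta(2, 2) prior
k <- 7; n <- 10                    # hypothetical data: 7 successes in 10 trials
a_post <- a + k                    # conjugate update of the shape parameters
b_post <- b + n - k
qbeta(c(0.025, 0.975), a_post, b_post)  # 95% credibility interval, no MCMC required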
A sketch of the principles of MCMC techniques follows. The underlying idea of MCMC is to set up a suitable Markov chain with the desired posterior density as its stationary distribution and, starting from an arbitrary state in the parameter space, to simulate the chain until it converges. One way of assessing convergence is to plot the trajectories θ_j(t) against the iteration number t, for j = 1, ..., dim(θ) (trace plots), and to judge convergence by informal visual inspection. When multiple chains started from different initial states are run simultaneously, it may be easier to judge convergence for each parameter component by the mixing between the chains. When the chain has practically converged, it is necessary to discard the early, pre-convergence part, called the "burn-in" period, in order to avoid the effects of the initial values. After burn-in removal, one usually simulates the chains for a number of additional iterations. The most popular MCMC algorithm is the Metropolis–Hastings algorithm (Metropolis et al. 1953; Hastings 1970), which is briefly described below.
Let θ = (θ_1, ..., θ_d) denote the d-dimensional parameter of interest and p(θ|y) the required posterior based on the data y, and let q(θ, θ*) = p(θ*|θ) be a proposal kernel, where θ is the current state and θ* a (candidate) proposal. The Metropolis–Hastings (MH) algorithm proceeds as follows:

Algorithm 1: Metropolis–Hastings
1. Pick an arbitrary θ^(0) in the support of p(θ|y) and set i = 0.
2. Generate a proposal θ* from q(θ^(i), ·) = p(· | θ^(i)).
3. Compute
   α(θ^(i), θ*) = min{ [p(θ*|y) q(θ*, θ^(i))] / [p(θ^(i)|y) q(θ^(i), θ*)], 1 }.
   Generate β ~ Unif(0, 1) and set θ^(i+1) = θ* if β < α(θ^(i), θ*); otherwise set θ^(i+1) = θ^(i).
4. Set i = i + 1 and repeat steps 2–3 until "convergence".
Since α(θ^(i), θ*) depends on p(θ|y) only through the ratio p(θ*|y) / p(θ^(i)|y), the normalizing constant of p(θ|y) cancels out. The choice of the proposal distribution is essentially arbitrary, subject only to technical constraints such as the minimization of autocorrelation. The original Metropolis algorithm, the "Metropolis chain" (Metropolis et al. 1953), uses a symmetric proposal distribution, with q(θ*, θ^(i)) = q(θ^(i), θ*), for example a normal distribution centered at the current state. A symmetric proposal has the practical advantage that α(θ^(i), θ*) takes the simpler form min{ p(θ*|y) / p(θ^(i)|y), 1 }. A typical Metropolis chain is the random-walk Metropolis, whose proposal kernel is θ* = θ^(i) + sε, where ε ~ mvnorm(0, Σ), Σ is a suitably chosen covariance matrix, and s is a constant, usually tuned over the first iterations to achieve an acceptance rate of between 20 and 40% by running the chain for different values of s and monitoring the acceptance rate. A minimal implementation is sketched below.
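The following R sketch implements a univariate random-walk Metropolis chain. The standard-normal target density, the starting value, and the step size s are our own illustrative choices, not the paper's model:

set.seed(42)
log_post <- function(theta) dnorm(theta, mean = 0, sd = 1, log = TRUE)  # stand-in log-posterior
n_iter <- 10000
s <- 2.4                            # step size; tune for a 20-40% acceptance rate
chain <- numeric(n_iter)
chain[1] <- 5                       # deliberately poor starting value (burn-in visible)
accept <- 0
for (i in 2:n_iter) {
  proposal <- chain[i - 1] + s * rnorm(1)                   # symmetric random-walk proposal
  log_alpha <- log_post(proposal) - log_post(chain[i - 1])  # log acceptance ratio
  if (log(runif(1)) < log_alpha) {                          # accept with prob min(ratio, 1)
    chain[i] <- proposal; accept <- accept + 1
  } else {
    chain[i] <- chain[i - 1]
  }
}
accept / n_iter                     # monitor the acceptance rate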
The Gibbs sampler (e.g., Gelfand and Smith 1990) is a particular case of the MH algorithm in which the proposal is always accepted (α = 1). The key to the Gibbs sampler is that one considers univariate conditional distributions: at each step, the d components are generated sequentially from the d univariate conditionals rather than as a single d-dimensional vector. This supposes that the complete conditional posteriors p(θ_j | θ_i, i ≠ j, y) are available in closed form. The Gibbs algorithm proceeds as follows:

Algorithm 2: Gibbs sampler
1. Pick an arbitrary θ^(0) = (θ_1^(0), ..., θ_d^(0)) and set i = 0.
2. Generate
   θ_1^(i+1) ~ p(θ_1 | θ_2^(i), ..., θ_d^(i), y),
   θ_2^(i+1) ~ p(θ_2 | θ_1^(i+1), θ_3^(i), ..., θ_d^(i), y),
   ...,
   θ_d^(i+1) ~ p(θ_d | θ_1^(i+1), ..., θ_{d-1}^(i+1), y).
3. Set i = i + 1 and repeat steps 2–3 until "convergence".
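As an illustration (again with a toy target of our own choosing rather than the paper's model), a Gibbs sampler for a bivariate normal with unit variances and correlation ρ draws alternately from the two exact conditionals x|y ~ N(ρy, 1 − ρ²) and y|x ~ N(ρx, 1 − ρ²):

set.seed(1)
rho <- 0.8
n_iter <- 5000
draws <- matrix(0, nrow = n_iter, ncol = 2)
for (i in 2:n_iter) {
  draws[i, 1] <- rnorm(1, rho * draws[i - 1, 2], sqrt(1 - rho^2))  # update x given y
  draws[i, 2] <- rnorm(1, rho * draws[i, 1],     sqrt(1 - rho^2))  # update y given new x
}
cor(draws[-(1:500), ])              # after burn-in, close to rho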
Underlying the usual sample-based inference for a scalar quantity is the assumption of independent and identically distributed (IID) observations, which does not generally apply to simulated posterior draws. The autocorrelation function (ACF) gives the serial correlation between observations that are k iterations apart. If the autocorrelations die out at lag k, say, then thinning the chain to every nth observation with n > k yields a roughly IID sample.
Having obtained the joint posterior distribution of the unknown variables, one often needs to marginalize over the nuisance variables (variables that are not of current interest) to obtain the distribution of just the quantities of interest. Mathematically, the required marginalization is achieved by integrating out the nuisance variables. For a joint posterior simulated via MCMC, a sample from the marginal distribution of a component of the parameter vector is obtained simply by ignoring the draws of the other components.
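In code, both operations are one-liners; in this sketch, `draws` stands for a hypothetical matrix of MCMC output with one row per iteration and one column per parameter:

kept <- draws[seq(1, nrow(draws), by = 10), ]  # thin to every 10th iteration
acf(kept[, 1])                                 # inspect the remaining autocorrelation
theta1_marginal <- kept[, 1]                   # marginal sample: ignore the other columns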
MCMC algorithms are usually easy to implement. In practice, a wide range of Bayesian models can be fitted using OpenBUGS, a Bayesian software package freely available at http://www.mathstat.helsinki.fi/openbugs/ (Thomas et al. 2006).
3 Materials and methods
A supposedly exposed area is partitioned into a number of sectors according to their distances, expressed in a suitable unit, to the source of the suspected contaminant. Here we restrict our attention to individuals whose exposure times exceed a given critical limit, and we assume that a complete census of this target population is available, in order to avoid a lengthy discussion of sampling issues. We assume that the number y_i of affected individuals in sector i is Poisson-distributed with intensity λ_i = μ f(d_i), where μ is a distance-independent "baseline" intensity, d_i is the distance of the focal sector to the source, and f is a decaying function of the distance, so that the more remote from the source an individual lives, the less vulnerable that individual is. For example, f(d_i) ∝ 1/d_i² (d_i ≠ 0), or f(d_i) = exp(−α d_i), which is used here. More specifically, we assume that

y_i ~ Pois(λ_i), with λ_i = w_i μ exp(−α d_i)   (2)
where α ∈ [0, +∞) is a parameter reflecting the strength of association between the suspected contaminant and the alleged health defect, and the "population weights" w_i > 0 are known scaling factors intended to correct for disparate population sizes in the different sectors. The population weights can be interpreted as follows: if w is set to 1 in a reference sector i whose population size is n_i, then the population size n_j of an arbitrary sector j is corrected by the factor w_j = n_i / n_j. This correction would not be necessary if the model were based on the proportions of affected individuals in each sector, in which case a multinomial likelihood would be more appropriate, conditionally on the total population size in the area under study.
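For instance (a sketch with hypothetical population sizes, sector 1 being the reference):

n <- c(1200, 800, 1500, 600)  # hypothetical sector population sizes
w <- n[1] / n                 # w_j = n_i / n_j, so w = 1 in the reference sector
w                             # sectors smaller than the reference get w > 1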
The Poisson distribution adopted here is known to be an appropriate model for data arising in the form of counts (e.g., Gillman and Hails 1997; Davison 2003), in particular when the number of cases is much smaller than the number of exposed individuals. A significant α (one significantly different from zero) provides some evidence in favor of an association between the presumed contaminant and the alleged disease (though one must bear in mind that correlation does not necessarily imply causality). Unlike the classical setting, where evidence against the null hypothesis of no association is typically sought, the Bayesian approach enables estimation of the probability of the alleged health defect at different levels of exposure, which is important for the development of public health regulations regarding exposure to hazardous substances.
It is particularly important to extend the study over a number of similar situations in order to examine the effects of potential environmental confounders and mitigate the likelihood of false alarms. Indeed, it may be the case that the collected data are not sufficient to identify the subtle effects of exposure while properly adjusting for confounders. The model of Eq. 2 extends straightforwardly to a number of, say, K areas as follows:

y_{i,k} ~ Pois(λ_{i,k}), with λ_{i,k} = w_{i,k} μ_k exp(−α d_i), 1 ≤ k ≤ K   (3)

where the parameters keep the same meaning as in Eq. 2, but μ_k is now the baseline intensity associated with area k only, and w_{i,k} corrects for the population size in sector i of area k.
Spatial homogeneity in the distribution of the focal disease across different areas can be examined by testing for equality of the baseline failure rates μ_k, for example via some ANOVA design. Significant differences in baseline failure rates across areas would suggest the relevance of unidentified, locally acting factors and the necessity of extending the model to accommodate potential confounders. This can be achieved within the flexible hierarchical Bayesian framework (e.g., Robert 2001; Gelman et al. 2003), which allows the decomposition of a prior distribution into several conditional levels. By assigning, for example, Gamma(τ_k, β_k) priors to the baseline intensities μ_k, a spatial discrimination can be worked out from posterior inference on the hyper-parameters τ_k and β_k. This approach provides a rational way of modeling the risk of disease in connection with the spatial distribution of populations, and a practical way of integrating the areas of environmental and biomedical sciences.
After suitable priors have been specified for the model parameters, the joint posterior can be worked out or simulated numerically via MCMC. The posterior distribution, or a simulated sample from it, includes the uncertainty about the parameters, which can be incorporated into subsequent inferences. For example, conditionally on the observed data y, the predictive distribution p(Y_d|y) of the number of affected individuals in a sector located d units of distance from the source of a contaminant and whose population weight is w can be simulated from by using the following algorithm:

Algorithm 3: Simulating a posterior predictive distribution
1. Set b = 1.
2. Generate (μ*_b, α*_b) ~ p(μ, α | y) and compute λ*_b = w μ*_b exp(−α*_b d).
3. Generate y*_b ~ Pois(λ*_b).
4. Set b = b + 1 and repeat steps 2–3 until b = B.
y* = (y*_1, ..., y*_B) is then a sample of size B from the posterior predictive distribution of interest, formally defined as

p(Y_d | y) = ∫_Θ p(Y_d | θ) p(θ | y) dθ   (4)

Notice that in Eq. 4 the likelihood of the data is averaged across the uncertainty contained in the posterior distribution of the parameter. Hence, mean(y*) provides an estimate of E(Y_d | y), the sample median of y* estimates the corresponding median, and the α/2 and (1 − α/2) percentiles of y* provide approximate cutoff points for the (1 − α)% credibility set (Bayesian confidence interval) for (Y_d | y). Moreover, the risk of (Y_d | y) reaching a critical level C is estimated by

Pr(Y_d ≥ C | y) = (1/B) Σ_{b=1}^{B} 1{y*_b ≥ C},

where 1{y*_b ≥ C} denotes the indicator function of the event {y*_b ≥ C}.
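A direct R transcription of Algorithm 3 might look as follows (a sketch: the matrix `post` of joint posterior draws of (μ, α), the distance `d`, the weight `w`, and the threshold `C` are placeholders for values coming from an actual model fit):

# Posterior predictive simulation (Algorithm 3) for a sector at distance d with weight w.
# `post` is assumed to be a B x 2 matrix of joint posterior draws: columns mu and alpha.
simulate_predictive <- function(post, d, w) {
  lambda_star <- w * post[, 1] * exp(-post[, 2] * d)  # lambda*_b = w mu*_b exp(-alpha*_b d)
  rpois(nrow(post), lambda_star)                      # y*_b ~ Pois(lambda*_b)
}
# Example use with hypothetical inputs:
# y_star <- simulate_predictive(post, d = 0.5, w = 1)
# mean(y_star)                        # estimate of E(Y_d | y)
# quantile(y_star, c(0.025, 0.975))   # 95% credibility set for Y_d | y
# mean(y_star >= C)                   # estimated risk Pr(Y_d >= C | y)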
The usual asymptotic Gaussian approximation applies to posterior samples. Indeed, as the sample size increases, the joint posterior tends to a multivariate normal with the posterior mode as approximate mean and asymptotic covariance matrix given by the negative inverse of the Hessian of the log-posterior evaluated at the posterior mode. Consequently, an approximate (1 − α)% credibility set for a scalar parameter θ can be obtained by the usual recipe θ̂ ± z_{α/2} · Var(θ̂)^{1/2}, where θ̂ is an estimate of the posterior mode of θ, Var(θ̂) is its asymptotic variance, and z_{α/2} is the α/2 percentile of the standard normal distribution. Unlike the classical (1 − α)% confidence interval, a (1 − α)% credibility set has probability (1 − α)% of containing the true value of the parameter.
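Both the percentile recipe and the Gaussian approximation are immediate from simulated draws (a sketch; `theta_draws` is a hypothetical vector of posterior samples):

quantile(theta_draws, c(0.025, 0.975))                         # equal-tailed 95% credibility interval
mean(theta_draws) + c(-1, 1) * qnorm(0.975) * sd(theta_draws)  # large-sample Gaussian approximation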
The performance of statistical models is often assessed through simulation studies, because the "true" model from which the data have been generated is then known to the analyst. This enables a judgment of the extent to which the underlying mechanisms can be revealed by the model-based analyses. The next section is devoted to a practical application of the model based upon simulated data, intended to illustrate the model implementation with real world data on environmental pollution, whose acquisition is particularly complicated: beyond the marked budgetary constraints, it involves dealing with subject privacy concerns and with economic and even political interests. On the other hand, sufficient data are needed to identify the subtle effects of exposure while suitably adjusting for potential confounders and mitigating the likelihood of false alarms.
4 Report on the simulation study
4.1 Parameters and settings
We used computer simulation to generate a dataset extending over three areas, each partitioned into 30 equally populated sectors, so that all the weight factors w_i were set to 1. The case of unevenly populated sectors could be straightforwardly handled by an appropriate scaling of the linear predictor m_{i,k} defined below. In each area, the distances of the 30 sectors to the source were assigned values ranging from 0.1 to 3 in increments of 0.1. In order to ensure that the results are not obtained by chance, the "true" baseline intensities μ_k and the parameter α were assigned different values in the three areas: μ = 100, α = 2 in the first area; μ = 50, α = 1 in the second; and μ = 10, α = 0.5 in the third area. The R script for data generation, a resulting dataset (in the BUGS format), and the BUGS code for the model fitting are provided in Appendices 1, 2 and 3.
4.2 Analyses, results and discussion
The log-rescaled intensity derived from Eq. 3,

m_{i,k} = log(λ_{i,k} / w_{i,k}) = lm_k − α d_i, where lm_k = log(μ_k),

provides a linear predictor in d that can be used to fit the model. We used a Bayesian approach with non-informative priors: lm_k ~ N(0, 0.01) and α ~ Unif(0, 100), where the normal distribution is parameterized in terms of the mean and the precision (the inverse of the variance). The required posteriors were simulated using MCMC methods via OpenBUGS. We ran 30,000 iterations of three chains, discarding the first 10,000 iterations as burn-in and thinning the remainder to one in every ten observations. Convergence of the MCMC was assessed visually by the mixing of the chains. The sensitivity of the results to the prior inputs was examined by varying the range of the priors by orders of magnitude; the results remained similar, suggesting no significant influence of the prior specification on the results obtained.
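For readers who prefer to drive OpenBUGS from R, a call along the following lines reproduces these settings (a sketch assuming the R2OpenBUGS interface package; the model file name and the initial-value generator are our own placeholders, and `d` and `y` are as produced by the Appendix 1 script):

library(R2OpenBUGS)   # assumes OpenBUGS and the R2OpenBUGS package are installed
# "model.txt" holds the BUGS code of Appendix 3; t(y) because the model
# indexes y[area, sector] while Appendix 1 builds y as sectors x areas
fit <- bugs(data = list(d = d, y = t(y)),
            inits = function() list(lm = rnorm(3), alpha = runif(3)),
            parameters.to.save = c("mu", "alpha"),
            model.file = "model.txt",
            n.chains = 3, n.iter = 30000, n.burnin = 10000, n.thin = 10)
print(fit)            # posterior summaries comparable to Table 1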
Table 1 gives the posterior means, standard errors, and the lower and upper bounds of the 95% credibility sets for each parameter, whereas Figs. 1, 2 and 3 display trace plots of 3,000 posterior draws from the MCMC outputs, which illustrate the mixing of the chains. We can see that the MCMC sampler jumps freely around the parameter space, and that the resulting estimates are close to the true values. The posterior autocorrelations (Fig. 4) die out at lag 5, suggesting that the usual IID assumption is reasonable with a thinning to every kth observation with k > 5; consequently, we thinned our samples with k = 10. Once the posteriors of the parameters have been estimated, Algorithm 3 can be used to predict the distribution of affected individuals in a specific area, given its distance from a source of contamination and its population weight.

Table 1  Posterior summaries; each parameter is indexed by the corresponding area

Parameter  True value  Posterior mean  SE     2.5 pc  97.5 pc
μ1         100         103.6           7.31   89.75   118.70
μ2         50          50.2            3.87   43.0    58.12
μ3         10          10.11           1.42   7.50    13.09
α1         2.0         2.046           0.09   1.86    2.24
α2         1           0.99            0.06   0.87    1.12
α3         0.5         0.48            0.002  0.30    0.67

[Fig. 1  Trace plots: 3,000 MCMC steps of the posteriors of μ (top) and α (bottom) for the first area, plotted against the iteration number; true values μ = 100, α = 2. This figure, like Figs. 2 and 3, illustrates the mixing of the three chains.]
[Fig. 2  As Fig. 1, for the second area; true values μ = 50, α = 1.]
[Fig. 3  As Fig. 1, for the third area; true values μ = 10, α = 0.5.]
[Fig. 4  Estimated autocorrelation functions for all six parameters; in all cases the autocorrelation vanishes practically at lag 5.]
The right panel of Fig. 5 shows that the variables μ and α are not strongly correlated (the first area has been selected for illustration). A rough IID sample from the marginal distribution of one parameter can be obtained from the simulated joint posterior by ignoring the other.
The normal QQ-plots in the left and central panels of Fig. 5 support the asymptotic normality of the posteriors of the two variables, which permits classical hypothesis and significance tests based on the large-sample Gaussian approximation. This further illustrates the flexibility of sample-based posterior analysis, in particular the fact that it does not completely break with the classical approach.

[Fig. 5  From left to right: normal QQ-plot of the posterior of μ1, normal QQ-plot of the posterior of α1, and a scatterplot of the pairs (α1, μ1); the first area has been arbitrarily selected for illustration.]
Bayesian analyses based on vague priors are known to yield results similar to classical maximum likelihood estimation, and frequentist statisticians often think that the two approaches are equivalent. Yet even though the results from the two approaches may seem identical at first glance, their interpretation is not the same (e.g., confidence intervals are intervals for statistics that would be calculated from replicate data sets, whilst credibility intervals are intervals for possible values of parameters). In addition, under the Bayesian framework, the posterior from a previous analysis can serve as the prior input when new data are obtained, which renders the Bayesian approach more fruitful when non-trivial prior information is available, as is often the case. More importantly, if prediction of future observations is of concern, the two approaches may lead to significantly different results. Indeed, suppose that one has obtained data y = (y_1, ..., y_n) and wants to infer about a future observation, say Pr(y_{n+1} ∈ A | y). A solution under classical statistics, based on the "plug-in" principle, is to compute Pr(y_{n+1} ∈ A | θ = θ̂), which ignores the uncertainty of the estimate θ̂ by conditioning on an event that is known to be only approximately true.
On the other hand, the Bayesian solution consists of averaging the likelihood of the possible outcomes over the posterior distribution of the parameter, that is, calculating

Pr(y_{n+1} ∈ A | y) = ∫_A ∫_Θ p(y_{n+1}, θ | y) dθ dy_{n+1}.

The posterior predictive distribution thus includes both the uncertainty inherent in the parameter estimate and the fact that any future value is itself a random event, whereas only the second source is taken into account by the classical counterpart. Consequently, prediction intervals based on classical statistics tend to be too short. However, when the number of observations gets infinitely large compared to the number of parameters, the two approaches often agree (e.g., Robert 2001).
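The difference is easy to demonstrate numerically (a sketch with made-up Poisson data and a conjugate Gamma prior of our own choosing, not part of the paper's study):

set.seed(7)
y <- rpois(10, lambda = 4)                  # hypothetical observed counts
# Plug-in predictive interval: condition on the point estimate of lambda
qpois(c(0.025, 0.975), lambda = mean(y))
# Posterior predictive interval: average the Poisson likelihood over the
# Gamma(0.5 + sum(y), 0.5 + n) posterior (conjugate update of a Gamma(0.5, 0.5) prior)
lam <- rgamma(1e5, 0.5 + sum(y), 0.5 + length(y))
quantile(rpois(1e5, lam), c(0.025, 0.975))  # typically wider than the plug-in interval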
5 Concluding remarks
In this paper we have been concerned with the typical flaw of the missing link between levels of exposure to a contaminant and intensities of the alleged health defect. We proposed a probabilistic approach as a way around this shortcoming. We developed a Bayesian model dealing with a situation where the distance from the source of an alleged contaminant was used as a proxy for exposure, and conducted an experimental study based upon simulated data to illustrate the model implementation with real world data. The model was fitted to the data using Markov chain Monte Carlo methods via the OpenBUGS software. We pointed out some difficulties connected with the acquisition of actual data on environmental pollution and emphasized the necessity for analysts to ensure the adequacy of the data for identifying the effects of the presumed contaminant (if any) while properly adjusting for confounders, in order to mitigate the likelihood of false alerts. We dealt with time by restricting the target population to susceptible individuals whose exposure time exceeded a given critical level. To avoid a lengthy discussion of sampling issues, we assumed that a census of the target population was available. We maintain, however, that more insight could be gained by extending the study to all susceptible individuals and treating the exposure time as a covariate. The model as presented here can be tailored to situations with different exposure proxies, such as the concentration of chemicals in drinking water.
Bayesian inference provides a coherent framework for learning from evidence as it accumulates, with the attractive feature that once a model is defined, all answers follow directly from probability theory, often with an insightful meaning. Recent advances in computational algorithms, exemplified by the advent of MCMC, have stimulated a tremendous increase in the use of Bayesian methods in most quantitative sciences over the last decade. Researchers in different fields would be well advised, as a matter of practical necessity, to familiarize themselves with the basics of the Bayesian methodology, to ensure at least that they are in a position to understand and discuss the increasing number of Bayesian reports in the literature (e.g., Tan 2001).
Acknowledgments  We are indebted to the OpenBUGS development team for making this software package freely available.
Appendix 1
R code for data generation

d <- seq(0.1, 3, 0.1)         # sector distances to the source
mu <- c(100, 50, 10)          # "true" baseline intensities for the three areas
alpha <- c(2, 1, 0.5)         # "true" decay parameters
lbda <- matrix(0, nrow = 30, ncol = 3)
y <- matrix(0, nrow = 30, ncol = 3)
for (j in 1:3) {
  for (i in 1:30) {
    lbda[i, j] <- mu[j] * exp(-alpha[j] * d[i])  # Poisson intensity, Eq. 3 with w = 1
    y[i, j] <- rpois(1, lbda[i, j])              # simulated counts
  }
}
Appendix 2
Data in the BUGS format
list(d = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0,
           1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0,
           2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0),
     y = structure(.Data = c(80, 66, 61, 44, 39, 26, 27, 26, 14, 14,
                             14, 5, 10, 7, 6, 4, 3, 0, 0, 1,
                             3, 1, 2, 1, 0, 0, 1, 0, 0, 0,
                             34, 47, 41, 37, 21, 29, 32, 35, 22, 13,
                             14, 11, 12, 13, 12, 10, 7, 11, 6, 6,
                             4, 4, 8, 6, 3, 5, 1, 3, 5, 3,
                             11, 9, 7, 7, 7, 3, 9, 8, 14, 5,
                             6, 5, 3, 4, 7, 4, 8, 2, 2, 4,
                             4, 4, 4, 0, 5, 6, 3, 0, 2, 2),
                   .Dim = c(3, 30)))
Appendix 3
BUGS code for the model fitting

model {
  for (i in 1:3) {                         # areas
    for (j in 1:30) {                      # sectors within each area
      y[i, j] ~ dpois(lbda[i, j])
      lbda[i, j] <- exp(moy[i, j])
      moy[i, j] <- lm[i] - alpha[i] * d[j] # linear predictor on the log scale
    }
    mu[i] <- exp(lm[i])                    # baseline intensity on the natural scale
  }
  for (i in 1:3) {                         # vague priors
    lm[i] ~ dnorm(0, 0.01)                 # precision parameterization
    alpha[i] ~ dunif(0, 100)
  }
}
References

Bates SC, Cullen A, Raftery AE (2003) Bayesian uncertainty assessment in multicompartment deterministic simulation models for environmental risk assessment. Environmetrics 14:335–371
Bushong SC (1993) Radiologic science for technologists, 5th edn. Mosby, St Louis
Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174
Davison AC (2003) Statistical models. Cambridge University Press, London
Dobson AJ (2002) An introduction to generalized linear models, 2nd edn. Chapman & Hall/CRC, London
Feychting M, Ahlbom A (1993) Magnetic fields and cancer of children residing near high-voltage power lines. Am J Epidemiol 138:467–481
Gelfand A, Smith A (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Gelman A (2002) Prior distribution. In: Encyclopedia of environmetrics, vol 3. Wiley, Chichester, pp 1634–1637
Gelman A, Carlin JB, Stern HS, Rubin DB (2003) Bayesian data analysis, 2nd edn. Chapman & Hall/CRC, London
Gillman M, Hails R (1997) An introduction to ecological modelling: putting practice into theory. Blackwell, Oxford
Gurrin LC, Kurinczuk JJ, Burton PR (2000) Bayesian statistics in medical research: an intuitive alternative to conventional data analysis. J Eval Clin Pract 6(2):193–204
Hastings W (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109
Hood E (2003) Life near the fast lane: an increased risk of birth problems—science selections. Environ Health Perspect 111:207–216
International Commission on Radiological Protection (1991) Recommendations of the International Commission on Radiological Protection, vol 60. ICRP publications
Lindsey J (1997) Applying generalized linear models. Springer, Heidelberg
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman & Hall, London
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092
Olsen JH, Nielsen A, Schulgen C (1993) Residence near high voltage facilities and risk of cancer in children. Br Med J 307:891–895
Robert CP (2001) The Bayesian choice: from decision-theoretic foundations to computational implementation, 2nd edn. Springer, Heidelberg
Spiegelhalter DJ, Myles P, Jones DR, Abraham KR (1999) An introduction to Bayesian methods in health technology assessment. Br Med J 319:508–512
Spiegelhalter DJ, Thomas A, Best N, Lunn D (2003) WinBUGS version 1.4 user manual. http://www.mrc-bsu.cam.ac.uk/bug
Tan SB (2001) Introduction to Bayesian methods for medical research. Ann Acad Med Singap 30:444–446
Theriault G, Yi LC (1997) Risk of leukemia among residents close to high voltage transmission electric lines. Occup Environ Med 54:625–628
Thomas A, O'Hara RB, Ligges U, Sturtz S (2006) Making BUGS open. R News 6:12–17
Wertheimer N, Leeper E (1979) Electrical wiring configuration and childhood cancer. Am J Epidemiol 109(3):273–284
Wilhelm M, Ritz B (2003) Residential proximity to traffic and adverse birth outcomes in Los Angeles County, California, 1994–1996—children's health. Environ Health Perspect 111:207–216
Wilhelm M, Ritz B (2005) Local variations in CO and particulate air pollution and adverse birth outcomes in Los Angeles County, California, USA. Environ Health Perspect. doi:10.1289/ehp.7751