Infectious disease dynamics: a statistical perspective (...dept.stat.lsa.umich.edu/~ionides/talks/idd.pdf)

  • Published on
    06-Apr-2018


  • Ed Ionides Infectious disease dynamics: a statistical perspective 1

    Infectious disease dynamics: a statistical perspective

    Epidemiology of Infectious Diseases Seminar

    Harvard University

    May 11, 2009

    Ed Ionides

    The University of Michigan, Department of Statistics

  • Ed Ionides Infectious disease dynamics: a statistical perspective 2

    Collaborators

    Anindya Bhadra (University of Michigan, Statistics)

    Menno Bouma (London School of Hygiene and Tropical Medicine)

    Carles Breto (Universidad Carlos III de Madrid, Statistics)

    Daihai He (McMaster University, Mathematics & Statistics)

    Aaron King (Univ. of Michigan, Ecology & Evolutionary Biology)

    Karina Laneri (Univ. of Michigan, Ecology & Evolutionary Biology)

    Mercedes Pascual (Univ. of Michigan, Ecology & Evolutionary Biology)

  • Ed Ionides Infectious disease dynamics: a statistical perspective 3

    Infectious disease transmission: the public health challenge

    Why do we seek to quantify and understand disease dynamics?

    Prevention and control of emerging infectious diseases (SARS, HIV/AIDS, H5N1 avian influenza, H1N1 swine influenza)

    Understanding the development and spread of drug-resistant strains (malaria, tuberculosis, MRSA)

  • Ed Ionides Infectious disease dynamics: a statistical perspective 4

    Disease dynamics: epidemiology or ecology, or both?

    Environmental host/pathogen dynamics are close to predator/prey relationships, which are a central topic of ecology.

    Analysis of diseases as ecosystems complements more traditional epidemiology (risk factors etc).

    Ecologists typically seek to avoid extinctions, whereas epidemiologists typically seek the reverse.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 5

    Infectious disease transmission: the statistical challenge

    Time series data of sufficient quantity and quality to support investigations of disease dynamics are increasingly available.

    Statistical methods are required for partially observed, nonstationary, nonlinear, vector-valued stochastic dynamic systems.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 6

    Population models are typically partially observed Markov processes (alias: state space models)

    $x_t$ is an unobserved vector-valued stochastic process (discrete or continuous time).

    It is assumed to be Markovian, i.e., the past and future evolution of the system are independent given its current state.

    This is reasonable if all important dynamic processes are modeled as part of the system.

    $y_t$ is a vector of the available observations (discrete time). These are assumed to be conditionally independent given $x_t$ (a standard measurement model, which can be relaxed).

    $\theta$ is a vector of unknown parameters.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 7

    Plug-and-play inference for partially observed Markov processes

    Statistical methods for partially observed Markov processes (POMPs) are plug-and-play if they require simulation from the dynamic model but not explicit likelihood ratios.

    Bayesian plug-and-play:

    1. Artificial parameter evolution (Liu and West, 2001)

    2. Approximate Bayesian computation (sequential Monte Carlo without likelihoods, Sisson et al, PNAS, 2007)

    Non-Bayesian plug-and-play:

    3. Simulation-based prediction rules (Kendall et al, Ecology, 1999)

    4. Maximum likelihood via iterated filtering (Ionides et al, PNAS, 2006)

    Plug-and-play is a VERY USEFUL PROPERTY for investigating

    scientific models.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 8

    Plug-and-play in other settings

    Optimization. Methods requiring only evaluation of the objective function to be optimized are sometimes called gradient-free. This is the same concept as plug-and-play: the code to evaluate the objective function can be plugged into the optimizer.

    Complex systems. Methods to study the behavior of large simulation models that only employ the underlying code as a black box to generate simulations have been called equation-free (Kevrekidis et al., 2003, 2004).

    This is the same concept as plug-and-play, but we prefer our label! A typical goal is to determine the relationship between macroscopic phenomena (e.g. phase transitions) and microscopic properties (e.g. molecular interactions).

  • Ed Ionides Infectious disease dynamics: a statistical perspective 9

    The cost of plug-and-play

    Approximate Bayesian methods and simulated moment methods lead to a loss of statistical efficiency.

    In contrast, iterated filtering enables (almost) exact likelihood-based inference.

    Improvements in numerical efficiency may be possible when analytic properties are available (at the expense of plug-and-play). But many interesting dynamic models are analytically intractable; for example, it is standard to investigate systems of ordinary differential equations numerically.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 10

    Summary of plug-and-play inference via iterated filtering

    Filtering is the extensively studied problem of calculating the conditional distribution of the unobserved state vector $x_t$ given the observations up to that time, $y_1, y_2, \dots, y_t$.

    Iterated filtering is a recently developed algorithm which uses a sequence of solutions to the filtering problem to maximize the likelihood function over unknown model parameters. (Ionides, Breto & King. PNAS, 2006)

    If the filter is plug-and-play (e.g. using standard sequential Monte Carlo methods) this is inherited by iterated filtering.
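
A minimal sketch of what a plug-and-play filter can look like: a bootstrap particle filter that estimates the log likelihood using only forward simulation of the dynamics plus a measurement density. The AR(1) toy model, the function names `simulate_step`, `measurement_density` and `particle_filter_loglik`, and all numerical values are assumptions made for illustration; they are not taken from the talk.

```python
import math
import random

def simulate_step(x, theta, rng):
    """Hypothetical toy dynamics: x_t = theta * x_{t-1} + N(0, 1). Simulation only."""
    return theta * x + rng.gauss(0.0, 1.0)

def measurement_density(y, x, obs_sd=1.0):
    """Gaussian measurement density f(y | x), an assumed observation model."""
    z = (y - x) / obs_sd
    return math.exp(-0.5 * z * z) / (obs_sd * math.sqrt(2.0 * math.pi))

def particle_filter_loglik(ys, theta, n_particles=500, seed=1):
    """Bootstrap particle filter: Monte Carlo estimate of log f(y_1:T | theta)."""
    rng = random.Random(seed)
    particles = [0.0] * n_particles
    loglik = 0.0
    for y in ys:
        # Prediction: push every particle through the simulator (the plug-and-play step).
        particles = [simulate_step(x, theta, rng) for x in particles]
        # Weighting: evaluate the measurement density at each particle.
        weights = [measurement_density(y, x) for x in particles]
        mean_w = sum(weights) / n_particles
        if mean_w == 0.0:
            return float("-inf")
        loglik += math.log(mean_w)
        # Resampling: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik

# Simulate data at theta = 0.8, then compare the estimated likelihood at two values.
data_rng = random.Random(42)
x, ys = 0.0, []
for _ in range(100):
    x = simulate_step(x, 0.8, data_rng)
    ys.append(x + data_rng.gauss(0.0, 1.0))

ll_true = particle_filter_loglik(ys, 0.8)
ll_wrong = particle_filter_loglik(ys, -0.8)
```

The filter never evaluates a transition density, so swapping in a different simulator only requires replacing `simulate_step`.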

  • Ed Ionides Infectious disease dynamics: a statistical perspective 11

    Key idea of iterated filtering

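    Bayesian inference for time-varying parameters becomes a solvable filtering problem. Set $\theta = \theta_t$ to be a random walk with

    $$E[\theta_t \mid \theta_{t-1}] = \theta_{t-1}, \qquad \mathrm{Var}(\theta_t \mid \theta_{t-1}) = \sigma^2.$$

    The limit $\sigma \to 0$ can be used to maximize the likelihood for fixed parameters.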
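
The key idea can be illustrated, under assumptions, by running a particle filter on a state augmented with a random-walk parameter and recording the filtered parameter means. The toy AR(1) model, the function name `filtered_parameter_means`, and the values of the random-walk standard deviation and starting point are all hypothetical. This sketch shows a single filtering pass only, not the full iterated filtering algorithm.

```python
import math
import random

def filtered_parameter_means(ys, theta0, sigma, n_particles=1000, seed=7):
    """Particle filter on the augmented state (theta_t, x_t).

    Assumed toy model: x_t = theta_t * x_{t-1} + N(0, 1), y_t = x_t + N(0, 1).
    Returns Monte Carlo estimates of E[theta_t | y_1:t] for t = 1, ..., T.
    """
    rng = random.Random(seed)
    thetas = [theta0] * n_particles
    xs = [0.0] * n_particles
    means = []
    for y in ys:
        # Random walk: E[theta_t | theta_{t-1}] = theta_{t-1}, Var = sigma^2.
        thetas = [th + rng.gauss(0.0, sigma) for th in thetas]
        # Simulate the latent state forward using each particle's own parameter.
        xs = [th * x + rng.gauss(0.0, 1.0) for th, x in zip(thetas, xs)]
        # Gaussian measurement weights (the normalizing constant cancels).
        weights = [math.exp(-0.5 * (y - x) * (y - x)) for x in xs]
        total = sum(weights)
        means.append(sum(w * th for w, th in zip(weights, thetas)) / total)
        # Resample (theta, x) pairs jointly in proportion to the weights.
        idx = rng.choices(range(n_particles), weights=weights, k=n_particles)
        thetas = [thetas[i] for i in idx]
        xs = [xs[i] for i in idx]
    return means

# Data simulated at theta = 0.8; the filter is started away from it, at 0.2.
data_rng = random.Random(5)
x, ys = 0.0, []
for _ in range(300):
    x = 0.8 * x + data_rng.gauss(0.0, 1.0)
    ys.append(x + data_rng.gauss(0.0, 1.0))

means = filtered_parameter_means(ys, theta0=0.2, sigma=0.05)
late_average = sum(means[-100:]) / 100
```

Shrinking `sigma` across repeated passes, each restarted from an updated estimate, is the cooling idea behind iterated filtering.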

  • Ed Ionides Infectious disease dynamics: a statistical perspective 12

    Theorem 1. (Ionides, Breto & King, PNAS, 2006)

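    Suppose $\theta_0$, $C$ and $y_{1:T}$ are fixed and define

    $$\hat\theta_t = \hat\theta_t(\sigma) = E[\theta_t \mid y_{1:t}], \qquad V_t = V_t(\sigma) = \mathrm{Var}(\theta_t \mid y_{1:t-1}).$$

    Assuming sufficient regularity conditions for a Taylor series expansion,

    $$\lim_{\sigma \to 0} \sum_{t=1}^{T} V_t^{-1} (\hat\theta_t - \hat\theta_{t-1}) = \left. \frac{\partial}{\partial\theta} \log f(y_{1:T} \mid \theta, \sigma = 0) \right|_{\theta = \theta_0}.$$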

    The limit of an appropriately weighted average of local filtered

    parameter estimates is the derivative of the log likelihood.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 13

    Theorem 2. (Ionides, Breto & King, PNAS, 2006)

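    Set $\hat\theta^{(n+1)} = \hat\theta^{(n)} + \sigma_n^2 M (\nabla\ell(\hat\theta^{(n)}) + \eta_n)$, where $M$ is a positive definite symmetric matrix. Suppose the following:

    1. $\ell(\theta)$ is twice continuously differentiable and uniformly convex.

    2. $\lim_{n \to \infty} \sigma_n^2 n^{1-\alpha} > 0$ for some $\alpha \in (0, 1)$.

    3. $\{\eta_n\}$ has $E[\eta_n] = o(1)$, $\mathrm{Var}(\sigma_n^2 \eta_n) = o(1)$, $\mathrm{Cov}(\eta_m, \eta_n) = 0$ for $m \neq n$.

    If there is a $\hat\theta$ with $\nabla\ell(\hat\theta) = 0$ then $\hat\theta^{(n)}$ converges in probability to $\hat\theta$.

    With appropriate assumptions, iterated filtering does converge to a local maximum if cooled sufficiently slowly.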
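
The recursion in Theorem 2 is a noisy gradient iteration with a slowly shrinking step size. The sketch below runs it in one dimension on an assumed quadratic log likelihood with its stationary point at 3. The step size `n ** (alpha - 1)` with `alpha = 0.3`, the scalar `m`, the independent Gaussian noise and the function name `noisy_gradient_iteration` are illustrative choices, not values from the talk.

```python
import random

def noisy_gradient_iteration(grad, theta0, n_iter=5000, alpha=0.3, m=1.0,
                             noise_sd=1.0, seed=3):
    """Iterate theta <- theta + sigma_n^2 * m * (grad(theta) + eta_n)."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(1, n_iter + 1):
        # Cooling schedule (assumed): sigma_n^2 = n^(alpha - 1) decays slower than 1/n.
        sigma2_n = n ** (alpha - 1.0)
        # Independent, mean-zero noise in the gradient estimate.
        eta_n = rng.gauss(0.0, noise_sd)
        theta = theta + sigma2_n * m * (grad(theta) + eta_n)
    return theta

# Hypothetical log likelihood l(theta) = -(theta - 3)^2 / 2, so grad l = -(theta - 3).
theta_hat = noisy_gradient_iteration(lambda th: -(th - 3.0), theta0=0.0)
```

Because the step sizes shrink, the noise contribution averages out while the iterate still travels far enough to reach the stationary point.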

  • Ed Ionides Infectious disease dynamics: a statistical perspective 14

    Example: cholera (bacterial diarrhea caused by Vibrio cholerae)

  • Ed Ionides Infectious disease dynamics: a statistical perspective 15

  • Ed Ionides Infectious disease dynamics: a statistical perspective 16

    [Figure: flow diagram of the cholera compartment model linking compartments P, S, I, Y, M and R1, ..., Rk.]

  • Ed Ionides Infectious disease dynamics: a statistical perspective 17

    parameter                              symbol

    force of infection                     $\lambda(t)$
    probability of severe infection        $c$
    recovery rate                          $\gamma$
    disease death rate                     $m$
    mean long-term immune period           $1/\epsilon$
    CV of long-term immune period          $1/\sqrt{k}$
    mean short-term immune period          $1/\rho$

  • Ed Ionides Infectious disease dynamics: a statistical perspective 18

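    Cholera model: nonlinear SDE driven by Gaussian noise $\xi(t)$

    $$\begin{aligned}
    \frac{d}{dt}S(t) &= k\epsilon R_k + \rho Y + \frac{d}{dt}P(t) + \delta P(t) - (\lambda(t) + \delta) S \\
    \frac{d}{dt}I(t) &= c\,\lambda(t) S - (m + \gamma + \delta) I(t) \\
    \frac{d}{dt}Y(t) &= (1 - c)\,\lambda(t) S - (\rho + \delta) Y \\
    \frac{d}{dt}R_1(t) &= \gamma I - (k\epsilon + \delta) R_1 \\
    &\;\;\vdots \\
    \frac{d}{dt}R_k(t) &= k\epsilon R_{k-1} - (k\epsilon + \delta) R_k
    \end{aligned}$$

    Stochastic force of infection, with periodic cubic spline $\beta_{\mathrm{seas}}(t)$:

    $$\lambda(t) = \left( e^{\beta_{\mathrm{trend}} t} \beta_{\mathrm{seas}}(t) + \xi(t) \right) \frac{I(t)}{P(t)} + \omega$$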

  • Ed Ionides Infectious disease dynamics: a statistical perspective 19

    Duration of immunity

    Profile likelihood of immune period for historical time series data in Dacca, Bangladesh. Other districts give similar results.

    Conclusion: asymptomatic infections have short-term immunity with epidemiological consequences

    (King, Ionides, Pascual & Bouma, Nature, 2008).

  • Ed Ionides Infectious disease dynamics: a statistical perspective 20

    Monthly reported cholera mortality for Dacca, 1890-1940

    [Figure: time series plot. x-axis: Year (1890 to 1940); y-axis: monthly mortality (0 to 5000).]

  • Ed Ionides Infectious disease dynamics: a statistical perspective 21

    Historic Bengal (now Bangladesh and the Indian state of West Bengal)

  • Ed Ionides Infectious disease dynamics: a statistical perspective 22

    Parameter estimates vary smoothly across space

    Example: estimated effect of environmental cholera (larger

    environmental reservoirs are shown as darker orange).

  • Ed Ionides Infectious disease dynamics: a statistical perspective 23

    Stochastic differential equations (SDEs) vs. Markov chains

    SDEs are a simple way to add stochasticity to widely used ordinary differential equation models for population dynamics.

    When some species have low abundance (e.g. fade-outs and re-introductions of diseases within a population) discreteness can become important.

    Discrete population Markov process models are called Markov chains. Time may be continuous or discrete: for disease transmission, continuous time is usually appropriate.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 24

    Over-dispersion in Markov chain models of populations

    Remarkably, in the vast literatures on continuous-time individual-based Markov chains for population dynamics (e.g.

    applied to ecology and chemical reactions) no-one has

    previously proposed models capable of over-dispersion.

    It turns out that the usual assumption that no events occur

    simultaneously creates fundamental limitations in the statistical

    properties of the resulting class of models.

    Over-dispersion is the rule, not the exception, in data. Perhaps this discrepancy went un-noticed before statistical techniques

    became available to fit these models to data.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 25

    Implicit models for plug-and-play inference

    Adding white noise to the transition rates of existing Markov chain population models would be a way to introduce an infinitesimal variance parameter, by analogy with the theory of SDEs.

    We do this by defining our model as a limit of discrete-time models. We introduce the term implicit for such a model.

    This reverses the usual approach of checking that a numerical scheme (i.e. a discretization) converges to the desired model.

    Implicit models are convenient for numerical solution, by definition, and therefore fit in well with plug-and-play methodology.

    Details are in Breto, He, Ionides & King, "Time series analysis via mechanistic models", Annals of Applied Statistics, 2009.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 26

    A general mechanistic model

    A compartment model groups a population into $c$ compartments. $X(t) = (X_1(t), \dots, X_c(t))$ can be written in terms of the flows $N_{ij}(t)$ from $i$ to $j$, via a conservation of mass identity:

    $$X_i(t) = X_i(0) + \sum_{j \neq i} N_{ji}(t) - \sum_{j \neq i} N_{ij}(t).$$

    Each flow $N_{ij}$ is associated with a rate function $\mu_{ij} = \mu_{ij}(t, X(t))$.

    Here, $X_i(t)$ is non-negative integer valued. $X(t)$ models a population divided into $c$ groups; $\mu_{ij}$ is the rate at which each individual in compartment $i$ moves to $j$.

    This makes the compartment model closed. Immigration, birth and death can be included via source and sink compartments.
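
As a small worked example of the conservation of mass identity, the sketch below recovers compartment counts from initial counts and cumulative flows. The three-compartment layout, the numbers and the function name `compartment_counts` are hypothetical.

```python
def compartment_counts(x0, flows):
    """Apply X_i(t) = X_i(0) + sum_j N_ji(t) - sum_j N_ij(t).

    x0[i] is the initial count in compartment i; flows[i][j] is the
    cumulative flow N_ij(t) from compartment i to compartment j.
    """
    c = len(x0)
    counts = []
    for i in range(c):
        inflow = sum(flows[j][i] for j in range(c) if j != i)
        outflow = sum(flows[i][j] for j in range(c) if j != i)
        counts.append(x0[i] + inflow - outflow)
    return counts

# Hypothetical three-compartment example: 10 moves from 0 to 1, 4 moves from 1 to 2.
x0 = [99, 1, 0]
flows = [[0, 10, 0],
         [0, 0, 4],
         [0, 0, 0]]
counts = compartment_counts(x0, flows)  # [89, 7, 4]
```

The total stays at 100 because every unit leaving one compartment enters another, which is what makes the model closed.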

  • Ed Ionides Infectious disease dynamics: a statistical perspective 27

    Noise for rates of Markov chain compartment models

    For each rate $\mu_{ij}$ between pairs of compartments $(i, j)$ we specify an integrated noise process $\Gamma_{ij}(t)$ giving rise to a noise process $\xi_{ij}(t) = \frac{d}{dt}\Gamma_{ij}(t)$.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 28

    Properties of the integrated noise processes $\Gamma_{ij}(t)$

    P1 Independent increments.

    P2 Stationary increments.

    P3 Non-negative increments. Therefore, $\xi_{ij}(t) = \frac{d}{dt}\Gamma_{ij}(t)$ is non-negative white noise.

    P4 Unbiased multiplicative noise: $E[\Gamma_{ij}(t)] = t$.

    P5 Partial independence: $\Gamma_{ij}$ independent of $\Gamma_{ik}$ for $j \neq k$.

    P6 Full independence: all noise processes independent.

    P7 Gamma noise: marginally, $\Gamma_{ij}(t)$ is a gamma process.
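
One way to generate increments satisfying P3, P4 and P7 is sketched below, assuming a gamma process whose increment over a step of width `delta` has mean `delta` and variance `sigma2 * delta`. This parametrization and the names `gamma_increment`, `delta` and `sigma2` are assumptions for illustration.

```python
import random

def gamma_increment(delta, sigma2, rng):
    """Draw one gamma-process increment over a time step of width delta.

    Shape delta / sigma2 and scale sigma2 give mean delta and variance
    sigma2 * delta (an assumed parametrization).
    """
    return rng.gammavariate(delta / sigma2, sigma2)

rng = random.Random(11)
delta, sigma2 = 0.1, 0.05
increments = [gamma_increment(delta, sigma2, rng) for _ in range(20000)]

sample_mean = sum(increments) / len(increments)
sample_var = sum((g - sample_mean) ** 2 for g in increments) / len(increments)
```

Independent draws for successive steps give independent, stationary increments (P1, P2). Every draw is non-negative (P3), and the sample mean sits close to `delta` (P4).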

  • Ed Ionides Infectious disease dynamics: a statistical perspective 29

    A general over-dispersed compartment model

    This is an implicit model: we use an Euler approximation to define the

    process:

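    1. Divide the interval $[0, T]$ into $N$ intervals of width $\delta = T/N$

    2. Set initial value $X(0)$

    3. FOR $n = 0$ to $N - 1$

    4. Generate noise increments $\{\Delta\Gamma_{ij} = \Gamma_{ij}(n\delta + \delta) - \Gamma_{ij}(n\delta)\}$

    5. Generate process increments $(\Delta N_{i1}, \dots, \Delta N_{i,i-1}, \Delta N_{i,i+1}, \dots, \Delta N_{ic}, R_i) \sim \mathrm{Multinomial}(X_i(n\delta), p_{i1}, \dots, p_{i,i-1}, p_{i,i+1}, \dots, p_{ic}, 1 - \sum_{k \neq i} p_{ik})$, where $p_{ij} = p_{ij}(\{\mu_{ij}(n\delta, X(n\delta))\}, \{\Delta\Gamma_{ij}\})$ is given in (1) below

    6. Set $X_i(n\delta + \delta) = R_i + \sum_{j \neq i} \Delta N_{ji}$

    7. END FOR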
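
The Euler scheme above can be sketched in a few lines, under assumptions. The code uses independent gamma noise increments with mean `delta` and variance `sigma2 * delta`, the exit probabilities of equation (1), and a multinomial draw done one individual at a time. The SIR-style rate function, the parameter values and the names `exit_probabilities`, `simulate` and `sir_rates` are hypothetical and chosen only to make the example runnable.

```python
import math
import random

def exit_probabilities(rates, dgammas):
    """Equation (1): rates[j] is mu_ij, dgammas[j] is the noise increment for i -> j."""
    total = sum(m * g for m, g in zip(rates, dgammas))
    if total <= 0.0:
        return [0.0 for _ in rates]
    leave = 1.0 - math.exp(-total)
    return [leave * m * g / total for m, g in zip(rates, dgammas)]

def simulate(x0, rate_fn, t_end, n_steps, sigma2, seed=0):
    """Euler scheme for the over-dispersed compartment model (a sketch)."""
    rng = random.Random(seed)
    c = len(x0)
    delta = t_end / n_steps                  # step 1: interval width
    x = list(x0)                             # step 2: initial value
    path = [list(x)]
    for n in range(n_steps):                 # step 3
        mu = rate_fn(n * delta, x)           # c x c matrix of per-capita rates
        new_x = [0] * c
        for i in range(c):
            others = [j for j in range(c) if j != i]
            # Step 4: gamma noise increments (assumed parametrization).
            dgammas = [rng.gammavariate(delta / sigma2, sigma2) for _ in others]
            probs = exit_probabilities([mu[i][j] for j in others], dgammas)
            stay = max(0.0, 1.0 - sum(probs))
            # Step 5: multinomial draw, one destination per individual in i.
            destinations = rng.choices(others + [i], weights=probs + [stay], k=x[i])
            # Step 6: individuals who stay (R_i) plus arrivals from other compartments.
            for j in destinations:
                new_x[j] += 1
        x = new_x
        path.append(list(x))
    return path

def sir_rates(t, x, beta=1.5, gamma=0.5):
    """Hypothetical SIR rates: S -> I at beta * I / P, I -> R at gamma."""
    s, i, r = x
    pop = s + i + r
    return [[0.0, beta * i / pop, 0.0],
            [0.0, 0.0, gamma],
            [0.0, 0.0, 0.0]]

path = simulate([95, 5, 0], sir_rates, t_end=10.0, n_steps=100, sigma2=0.1)
```

Only the rate function changes from model to model, which is why this construction fits plug-and-play inference.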

  • Ed Ionides Infectious disease dynamics: a statistical perspective 30

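    The limiting Markov chain is specified as follows:

    $$P[\Delta N_{ij} = n_{ij}, \text{ for all } 1 \le i \le c,\ 1 \le j \le c,\ i \neq j \mid X(t) = (x_1, \dots, x_c)]$$

    $$= E\left[ \prod_{i=1}^{c} \left\{ \binom{x_i}{n_{i1} \, \dots \, n_{i,i-1} \, n_{i,i+1} \, \dots \, n_{ic} \, r_i} \Big(1 - \sum_{k \neq i} p_{ik}\Big)^{r_i} \prod_{j \neq i} p_{ij}^{n_{ij}} \right\} \right] + o(\delta)$$

    where $r_i = x_i - \sum_{k \neq i} n_{ik}$, $\binom{n}{n_1 \, \dots \, n_c}$ is a multinomial coefficient and

    $$p_{ij} = p_{ij}(\{\mu_{ij}(t, x)\}, \{\Delta\Gamma_{ij}(t)\}) = \Big(1 - \exp\Big\{-\sum_k \mu_{ik} \Delta\Gamma_{ik}\Big\}\Big) \frac{\mu_{ij} \Delta\Gamma_{ij}}{\sum_k \mu_{ik} \Delta\Gamma_{ik}}, \qquad (1)$$

    with $\mu_{ij} = \mu_{ij}(t, x)$.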
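
As a sanity check on formula (1), consider the noise-free case. Write $\mu_{ik}$ for the rates and $\Delta\Gamma_{ik}$ for the noise increments over a step of width $\delta$, and suppose every increment equals its mean, $\Delta\Gamma_{ik} = \delta$ (an assumption made only for this check). Since $1 - e^{-a\delta} = a\delta + o(\delta)$, the transition probability reduces to the familiar form for a Markov chain with rates $\mu_{ij}$:

```latex
p_{ij}
  = \Big(1 - \exp\Big\{-\delta \sum_k \mu_{ik}\Big\}\Big)
    \frac{\mu_{ij}\,\delta}{\delta \sum_k \mu_{ik}}
  = \Big(\delta \sum_k \mu_{ik} + o(\delta)\Big)
    \frac{\mu_{ij}}{\sum_k \mu_{ik}}
  = \mu_{ij}\,\delta + o(\delta).
```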

  • Ed Ionides Infectious disease dynamics: a statistical perspective 31

    Theorem 1 (Breto, He, Ionides & King, 2008). Supposing assumptions

    (P1-P5) about the noise process, this limit does indeed specify a Markov

    chain.

    Proof. An explicit construction involving exponential transition clocks for

    each individual, based on the method of Sellke (1983). Such methods are

    standard for networks of interacting Poisson processes (i.e., our

    compartment model with no noise). Care is required here due to the

    introduction of noise.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 32

    Theorem 1, formal statement. Suppose (P1-P5) and that $\mu_{ij}(t, x)$ is uniformly continuous as a function of $t$. Let $C(\zeta, 0)$ be the compartment containing individual $\zeta$ at time $t = 0$. Set $\tau_{\zeta,0} = 0$, and generate independent Exponential(1) random variables $M_{\zeta,0,j}$ for each $\zeta$ and $j \neq C(\zeta, 0)$. For $m \ge 1$, recursively set

    $$\tau_{\zeta,m,j} = \inf\left\{ t : \int_{\tau_{\zeta,m-1}}^{t} \mu_{C(\zeta,m-1),j}(s, X(s)) \, d\Gamma_{C(\zeta,m-1),j}(s) > M_{\zeta,m-1,j} \right\}.$$

    At time $\tau_{\zeta,m} = \min_j \tau_{\zeta,m,j}$, set $C(\zeta, m) = \arg\min_j \tau_{\zeta,m,j}$ and for each $j \neq C(\zeta, m)$ generate an independent Exponential(1) random variable $M_{\zeta,m,j}$. The increments

    $$dN_{ij}(t) = \sum_{\zeta,m} I\{C(\zeta, m-1) = i,\ C(\zeta, m) = j,\ \tau_{\zeta,m} = t\}$$

    specify a Markov chain $X(t)$ whose infinitesimal transition probabilities are given by the limit of the numerical algorithm as $\delta \to 0$.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 33

    Theorem 2 (Breto, He, Ionides & King, 2008). For the case of

    independent gamma noise, an analytic formula is available for the

    infinitesimal probabilities and infinitesimal moments of this chain.

  • Ed Ionides Infectious disease dynamics: a statistical perspective 34

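    Theorem 2, formal statement. Supposing (P1-P7), the infinitesimal transition probabilities are

    $$P[\Delta N_{ij} = n_{ij}, \text{ for all } i \neq j \mid X(t) = (x_1, \dots, x_c)] = \prod_i \prod_{j \neq i} \pi(n_{ij}, x_i, \mu_{ij}, \sigma_{ij}) + o(\delta)$$

    where

    $$\pi(n, x, \mu, \sigma) = 1_{\{n = 0\}} + \delta \binom{x}{n} \sum_{k=0}^{n} \binom{n}{k} \ldots$$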
