Informing disease control strategies using stochastic models


Gastrointestinal Illness

• #2 cause of death in children worldwide

• Largely preventable

• Transmission pathways
– Contaminated water
– Lack of sanitation facilities
– Poor hygiene

Question:

• Suppose you are interested in reducing the burden of G.I. illness in children in a country that currently has a high burden of G.I. disease.

• Also suppose that resources are limited

• What intervention(s) should you implement?
– Review recent literature

Previous Research

• Fewtrell & Colford: meta-analyses of RCTs to reduce diarrhea, 2004

(Summary of results from developing countries)

Intervention    Studies  Estimate  95% C.I.
Hand Washing    5        0.56      (0.33, 0.93)
Sanitation      2        0.68      (0.53, 0.87)
Water Supply    4        1.03      (0.73, 1.46)
Water Quality   15       0.69      (0.53, 0.89)
Multiple        5        0.67      (0.59, 0.76)
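As a reading aid, the pooled estimates can be turned into percent reductions. This is a minimal sketch that assumes each estimate is a relative risk (the slide only labels the column "Estimate"), so the implied reduction is 1 minus the estimate. The names `estimates` and `percent_reduction` are illustrative.

```python
# Pooled estimates copied from the table above.
# Assumption: each estimate is a relative risk (RR), so the
# percent reduction in illness is (1 - RR) * 100.
estimates = {
    "Hand Washing": 0.56,
    "Sanitation": 0.68,
    "Water Supply": 1.03,
    "Water Quality": 0.69,
    "Multiple": 0.67,
}


def percent_reduction(rr):
    """Percent reduction implied by a relative risk, to one decimal."""
    return round((1.0 - rr) * 100.0, 1)


reductions = {name: percent_reduction(rr) for name, rr in estimates.items()}

for name, value in reductions.items():
    print(f"{name:14s} {value:6.1f}%")
```

A negative value (Water Supply) means the point estimate lies slightly above 1, i.e. no estimated reduction.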

What explains this variability?

Effect estimates vary substantially, from a 0% reduction up to an 85% reduction in G.I. illness

Some Considerations:

• Randomization and blinding procedures (internal validity)

• Generalizability (external validity)

• Selection bias (participants are different than non-participants)

• Publication bias (positive findings are more likely to be published)

Randomization and Blinding:

• Randomization: ensures that comparison groups differ only by chance

• Double blinding: ensures that neither the participant nor the investigator knows which treatment group the participant is in, which prevents investigator and participant bias

• Together, these make RCTs the gold standard of epidemiologic studies

Randomization and Blinding:

• Both are often missing in GI interventions

• Fewtrell & Colford
– 52 distinct studies
– 16/52 (31%) employed randomization
– 3/52 (6%) blinded participants to their exposure status

Another Consideration…

[Figure: Prevalence of Diarrhea in Previous 2 Weeks by Sanitation Coverage. x-axis: Percent of Population with no Sanitation Services (0 to 80); y-axis: Prevalence of Diarrhea in Households with Piped Water and Flush Toilets (0 to 30)]

We Can Use Models

• Statistical models?

The data are not really independent. YIKES!

We need a paddle

Why Mathematical Models?

• Controlled

• Inexpensive

• Able to handle complex interactions

• Generate and test hypotheses

• Provide explicit description of the system under study (as opposed to statistical models)

A note on models:

• A primary goal of using models is not necessarily to come up with accurate predictions about the process under study, but to observe and describe the fundamental principles and relationships that emerge

• Rule of thumb: don’t make it more complex than it needs to be!

Some model assumptions

• Population is static: no one enters, no one leaves or dies

• Individuals can only become infected once

• People are equally likely to contact one another – there are no “networks” or cliques

Village Model

[Diagram: a grid of households, each with four susceptible (S) members]

• 2500 households
• 4 people per household
• All are initially susceptible
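The starting state of the village can be held in three counters per household. The sketch below assumes NumPy and uses illustrative array names `S`, `I`, and `R`; only the household count and size come from the slide.

```python
import numpy as np

N_HOUSEHOLDS = 2500      # from the slide
HOUSEHOLD_SIZE = 4       # from the slide

# One entry per household; everyone starts susceptible.
S = np.full(N_HOUSEHOLDS, HOUSEHOLD_SIZE, dtype=int)
I = np.zeros(N_HOUSEHOLDS, dtype=int)
R = np.zeros(N_HOUSEHOLDS, dtype=int)

total_population = int(S.sum() + I.sum() + R.sum())
print(total_population)  # 10000
```

Because the population is static, S + I + R stays equal to 4 in every household for the whole run, which makes a convenient sanity check.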

- People can become infected by being exposed to an infected person in their household
- $\beta_h$ governs this route of infection

[Diagram: a household with one infected (I) and three susceptible (S) members]

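- People can become infected by being exposed to an infected person from another household in their community
- $\beta_c$ governs this route of infection

[Diagram: two households; one person in one household is infected (I) and everyone else is susceptible (S)]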

- People can become infected by being exposed to pathogens from the environment (does not include drinking water)
- $\beta_e$ governs this route of infection

[Diagram: a household of four susceptible (S) members exposed to pathogens from the environment]

- People can become infected by consuming pathogens in drinking water
- $\beta_{dw}$ governs this route of infection

[Diagram: a household of four susceptible (S) members exposed to pathogens from drinking water]

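- Infected individuals shed pathogens into the drinking water supply at a constant rate
- The shedding rate, denoted $\alpha$ here, governs the rate of shedding
- The total number of pathogens shed is directly dependent on the total number of infected individuals, $I_{total}$

[Diagram: a household with two infected (I) and two susceptible (S) members]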

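- After time, infected people recover and are no longer susceptible to infection
- The recovery rate, denoted $\gamma$ here, governs the time to recovery

[Diagram: a household at the time of infection, T0, with one infected (I) member; the same household at the time of recovery, T1, with that member recovered (R)]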

- Pathogens in drinking water and the environment die off at a constant rate
- The die-off rate, denoted $\mu$ here, governs pathogen die off

[Diagram: pathogen levels at Time = T1 and at Time = T2]

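Summary of routes of infection for any given individual

- Environment (not drinking water): $\beta_e$
- Household member: $\beta_h$
- People from other households: $\beta_c$
- Pathogens in source of drinking water: $\beta_{dw}$
- Pathogens enter the drinking water through shedding, in proportion to $I_{total}$, and leave it through die off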

Simulation steps

• Determine hazards for each event

• Determine time to next event

• Determine type of event

• Determine which household is affected

• Update I, S, and R values for each household

Model Events

• Five events can occur:

– Infection via drinking water

– Infection via a household member

– Infection via another member of the village

– Infection via the environment

– Recovery

• The number of Susceptibles (S), Infecteds (I), and Recovereds (R) for each household is updated after each event

Hazards

• Hazards for each event are calculated from the transmission parameters $\beta_c$, $\beta_h$, $\beta_e$, $\beta_{dw}$, the shedding, recovery, and die-off rates, and the current values of I and S

• Hazards are calculated for each household, thus, each household has 5 hazards associated with it

Hazard Formulas

• Hazard for infection via drinking water:

$\lambda_{dw} = \beta_{dw} \, S_{hh}$

• Hazard for infection via household contact:

$\lambda_{h} = \beta_{h} \, S_{hh} \, I_{hh}$

• Hazard for infection via community contact:

$\lambda_{c} = \beta_{c} \, S_{hh} \, I_{total}$

Drinking Water Hazard

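$\lambda_{dw}(t) = r \, S \, W(t)$

$\dfrac{dW(t)}{dt} = \alpha I_t - \mu W(t)$

Here $W(t)$ is the amount of pathogen in the drinking water, $\alpha$ denotes the shedding rate, and $\mu$ denotes the die-off rate.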

Hazard Formulas

• Hazard for infection via the environment:

$\lambda_{e} = \beta_{e} \, S_{hh}$

• Hazard for recovery (moving from I to R):

$\lambda_{r} = \gamma \, I_{hh}$, where $\gamma$ denotes the recovery rate
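The five hazard formulas can be evaluated for every household at once. The sketch below is one way to do it, not the authors' code. The Greek symbols did not survive in the slide text, so the parameter names `beta_dw`, `beta_h`, `beta_c`, `beta_e`, and `gamma` are assumptions, and the numeric values are made up for illustration.

```python
import numpy as np


def household_hazards(S, I, beta_dw, beta_h, beta_c, beta_e, gamma):
    """Return an (n_households, 5) array of hazards.

    Columns: drinking water, household, community, environment, recovery.
    """
    S = np.asarray(S, dtype=float)
    I = np.asarray(I, dtype=float)
    I_total = I.sum()
    return np.column_stack([
        beta_dw * S,            # infection via drinking water
        beta_h * S * I,         # infection via household contact
        beta_c * S * I_total,   # infection via community contact
        beta_e * S,             # infection via the environment
        gamma * I,              # recovery (I to R)
    ])


# Three toy households and made-up parameter values.
S = [4, 3, 4]
I = [0, 1, 0]
hazards = household_hazards(S, I, beta_dw=0.01, beta_h=0.2,
                            beta_c=0.001, beta_e=0.005, gamma=0.1)
print(hazards)
```

Each row is one household and each column one event type, so every household carries five hazards and the total hazard is the sum of the whole array.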

Time to Infection and Hazard

• Recall that for an exponential distribution with mean $1/\lambda$:

$P(T < t) = 1 - e^{-\lambda t}$

• This is a probability, it must be between 0 and 1

Time of Next Event

– We know that $p = 1 - e^{-\lambda t}$ is between 0 and 1

– Solving for t: $t = -\log(1 - p)/\lambda$

– Thus, given a uniform random number and substituting it in for 1 - p, we can randomly generate an event time that will come from an exponential distribution with mean $1/\lambda$

Example

• Suppose $\lambda = 2$

• We generate 1000 random numbers between 0 and 1 and plug them into the formula:

$t = -\log(1 - p)/2$

• The resulting distribution of t looks like the following:

Mean(t) = $1/\lambda$ = 1/2
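The example can be reproduced in a few lines. This sketch uses Python's standard `random` module with an arbitrarily chosen seed; the function name `sample_event_time` is illustrative. The sample mean should land close to the theoretical mean of 1/2, though the exact value depends on the seed.

```python
import math
import random


def sample_event_time(rate, rng):
    """Inverse-transform draw: t = -log(1 - p) / rate, with p ~ U(0, 1)."""
    p = rng.random()            # uniform on [0, 1), so 1 - p is never 0
    return -math.log(1.0 - p) / rate


rng = random.Random(12345)      # fixed seed, chosen arbitrarily
rate = 2.0
draws = [sample_event_time(rate, rng) for _ in range(1000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.3f}, theoretical mean = {1 / rate:.3f}")
```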

Time of Next Event

• In our case the total hazard is the sum of all $\lambda$'s, thus:

$\lambda_t = \sum_i \lambda_i$

• Time to next event is then: $T_{next} = -\log(U(0,1)) / \lambda_t$

• The average time will be $1/\lambda_t$

Which event occurred? Where?

• We still need to determine which event will happen at Tnext and where it will happen (which household)

• Random numbers are employed to make these decisions

[Diagram: the total hazard drawn as one bar divided into segments, household by household: dw1, r1, e1, h1, c1 (total for Household #1), then dw2, r2, e2, h2, c2, then dw3, and so on]

• Remember that the total hazard is divided up into many smaller hazards for each event in each household

Random selection

• Randomly selecting a number between 0 and $\lambda_t$ determines which event happens and in which household

• In the illustrated example, an individual in household 2 is infected by someone else in the community
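One way to implement this selection is to lay all the hazards end to end, take a cumulative sum, and find where the random point falls. The sketch below assumes NumPy. The column order (drinking water, household, community, environment, recovery) and the hazard values are illustrative choices, not taken from the slides.

```python
import numpy as np

EVENT_NAMES = ("drinking water", "household", "community", "environment", "recovery")


def select_event(hazards, u):
    """Map u in [0, 1) to (household index, event index).

    hazards has shape (n_households, n_events).
    """
    hazards = np.asarray(hazards, dtype=float)
    flat = hazards.ravel()                    # household 1's hazards, then household 2's, ...
    cumulative = np.cumsum(flat)
    target = u * cumulative[-1]               # a point between 0 and the total hazard
    k = int(np.searchsorted(cumulative, target, side="right"))
    if k >= flat.size:                        # rare floating-point overshoot
        k = int(np.flatnonzero(flat)[-1])     # fall back to the last nonzero hazard
    household, event = divmod(k, hazards.shape[1])
    return household, event


# Two toy households with made-up hazards.
hazards = [
    [0.1, 0.0, 0.2, 0.1, 0.1],
    [0.1, 0.0, 0.4, 0.1, 0.0],
]
household, event = select_event(hazards, u=0.7)
print(household + 1, EVENT_NAMES[event])      # 2 community
```

Segments with zero hazard have zero width in the cumulative sum, so events that cannot happen (for example, recovery in a household with no infecteds) are not selected.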

Bookkeeping

• After an event is determined, the appropriate household is updated so it has the correct number of I’s, S’s, and R’s

• The process is then repeated:
– New hazards are calculated
– Tnext is determined
– An event is selected
– A household population is updated

When does it end?

• After each time step the total time, T, is updated:

T = T + Tnext

• When T becomes greater than a predetermined value, the simulation stops.
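Putting the steps together gives an event-driven loop. The following is a simplified sketch under stated assumptions, not the authors' implementation: it uses the constant-rate drinking-water hazard (transmission parameter times S per household) rather than tracking pathogen shedding and die-off, a small village so that it runs quickly, and made-up parameter values. All function and parameter names are illustrative.

```python
import numpy as np


def simulate(n_households=50, household_size=4, t_end=30.0,
             beta_dw=0.001, beta_h=0.05, beta_c=0.0005, beta_e=0.001,
             gamma=0.2, seed=0):
    """Event-driven simulation of household S/I/R counts.

    Simplification: the drinking-water hazard is beta_dw * S_hh;
    pathogen levels in the water are not tracked.
    """
    rng = np.random.default_rng(seed)
    S = np.full(n_households, household_size, dtype=int)
    I = np.zeros(n_households, dtype=int)
    R = np.zeros(n_households, dtype=int)
    T = 0.0
    n_events = 0
    while True:
        # 1. Hazards for each event in each household.
        flat = np.column_stack([
            beta_dw * S,            # drinking water
            beta_h * S * I,         # household member
            beta_c * S * I.sum(),   # another member of the village
            beta_e * S,             # environment
            gamma * I,              # recovery
        ]).astype(float).ravel()
        cumulative = np.cumsum(flat)
        total = cumulative[-1]
        if total <= 0:
            break                   # nothing left that can happen
        # 2. Time to next event.
        t_next = -np.log(1.0 - rng.random()) / total
        if T + t_next > t_end:
            break                   # next event falls past the stopping time
        T += t_next
        # 3-4. Event type and household, from one uniform draw.
        k = int(np.searchsorted(cumulative, rng.random() * total, side="right"))
        if k >= flat.size:          # rare floating-point overshoot
            k = int(np.flatnonzero(flat)[-1])
        hh, event = divmod(k, 5)
        # 5. Bookkeeping.
        if event == 4:              # recovery: I -> R
            I[hh] -= 1
            R[hh] += 1
        else:                       # any infection route: S -> I
            S[hh] -= 1
            I[hh] += 1
        n_events += 1
    return S, I, R, T, n_events


S, I, R, T, n_events = simulate()
print(f"T = {T:.2f}, events = {n_events}, ever infected = {int(I.sum() + R.sum())}")
```

Here the run stops just before the first event that would carry T past the stopping time, which is one reasonable reading of the stopping rule above.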

So What?

• Example questions this model can help to answer:
– How do sanitation, hygiene, and water quality impact disease transmission?
– Under what conditions is diarrheal disease endemic vs epidemic?
– What amount of disease is attributable to drinking water?

How much disease is attributable to contaminated water?

• Extremely difficult to answer this question with observational data or randomized controlled trials (experiments)

• Bias due to confounding (observational data)

• Bias due to lack of blinding and randomization procedures (RCT/experiments)

The ‘Perfect Study’

• We’d like to have two observations from everyone

– Disease status with clean water

– Disease status without clean water

• In reality, we can only observe one outcome

• Counterfactual outcomes are the hypothetical outcomes we don’t observe

An Example(?)

Can’t observe both – but with models, WE CAN

Example: Water Quality Intervention

• We run the model two times

– First run: don’t allow people to become infected via drinking water. This is like “filtering” their water so that no exposure via drinking water is possible

– Second run: normal run. We allow exposure via drinking water

– In both runs we keep $\beta_h$ and $\beta_c$ relatively small

Example

• Total cases with active filter = 412
• Total cases with placebo filter = 1651
• Total population = 10000
• Under these conditions (low $\beta_h$, $\beta_c$),

(1651 – 412) / 1651 = 0.75

• 75 percent of cases could have been prevented if all drinking water had been filtered

Example

• Suppose we repeat this in a population where $\beta_h$ and $\beta_c$ are higher

• Person-to-person transmission will be more of a factor

Example

• Total cases with active filter = 5619
• Total cases with placebo filter = 7866
• Total population = 10000
• Under these conditions (high $\beta_h$, $\beta_c$),

(7866 – 5619) / 7866 = 0.29

• 29 percent of cases could have been prevented if all drinking water had been filtered
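The two calculations follow the same formula: the difference in cases between the two runs, divided by the cases in the normal (placebo filter) run. A short sketch with the case counts from the two examples; the function name is illustrative.

```python
def attributable_fraction(cases_without_filter, cases_with_filter):
    """Fraction of cases prevented by removing drinking-water exposure."""
    return (cases_without_filter - cases_with_filter) / cases_without_filter


low_ptp = attributable_fraction(1651, 412)     # low person-to-person transmission
high_ptp = attributable_fraction(7866, 5619)   # high person-to-person transmission
print(f"low:  {low_ptp:.2f}")   # 0.75
print(f"high: {high_ptp:.2f}")  # 0.29
```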

Impact

• This example illustrates that the success of water quality interventions is dependent on the level of person-to-person (PTP) transmission occurring in a population

– Water quality interventions will likely have more impact when PTP levels are low

– They may not be the best choice in areas where PTP is high

But wait – we can do better

• Investigate many different parameter sets

• Use a ‘realistic’ population (household size not constant)

• What generalizations can be made?

• Can community transmission and household level transmission explain null effects in water quality interventions?

Transmission Pathways

Population

• Population mirrored that of villages in coastal Ecuador

• Median household size = 4
– Min: 1
– Max: 19

• Number of households = 2498

• Total population = 11260

[Figure: Distribution of Household Size. x-axis: Household Size (0 to 25); y-axis: Frequency (0 to 450)]

Parameter Values

Simulations

• Simulations were run for all combinations of parameter sets

• 5 x 8 x 8 x 2 = 640 parameter sets

• 10 simulations were run for each set

• Total simulations = 640 x 10 = 6400
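The full factorial design can be enumerated with `itertools.product`. Only the level counts (5, 8, 8, 2) and the 10 replicates come from the slide. The sketch assumes the factor of 5 is the five shedding levels shown in the result panels later in the deck; the other factor names and values are hypothetical placeholders.

```python
from itertools import product

# Assumption: the factor of 5 corresponds to the five shedding levels
# shown in the result panels. Everything else is a placeholder.
shedding_levels = [0.0, 0.5, 1.0, 1.5, 2.0]   # 5 levels
household_levels = list(range(8))             # 8 hypothetical levels
community_levels = list(range(8))             # 8 hypothetical levels
other_levels = [0, 1]                         # 2 hypothetical levels
REPLICATES = 10

parameter_sets = list(product(shedding_levels, household_levels,
                              community_levels, other_levels))
runs = [(params, rep) for params in parameter_sets for rep in range(REPLICATES)]
print(len(parameter_sets), len(runs))         # 640 6400
```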

[Figures: five panels, each showing % disease attributable to water across levels of Poor Hygiene and Poor Sanitation, one panel per shedding level]

• No shedding of pathogens (contamination) into the water (shedding rate = 0)
• Some contamination (shedding rate = 0.5)
• Moderate contamination (shedding rate = 1.0)
• High contamination (shedding rate = 1.5)
• Very high contamination (shedding rate = 2.0)

Conclusions

• Person-to-person transmission may explain variability in water quality interventions

• When both HH and Community transmission levels are high, water interventions may not be the best choice

• When either is high, water interventions have the potential to significantly reduce disease

Conclusions

• Rate of pathogen shedding influences effectiveness of intervention

• Public health efforts to reduce enteric diseases should focus on critical transmission pathways

• If more than one exists, multiple interventions may be necessary

Acknowledgments

• Co-authors
– Joseph Eisenberg
– Travis Porco

• Contributors
– Bryan Lewis

Drinking Water Hazard

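$\lambda_{dw}(t) = r \, S \, W(t)$

$\dfrac{dW(t)}{dt} = \alpha I_t - \mu W(t)$

$W(t) = W(0)\, e^{-\mu t} + \dfrac{\alpha I_t}{\mu}\left(1 - e^{-\mu t}\right)$

Here $\alpha$ denotes the shedding rate and $\mu$ the die-off rate; $I_t$ is held constant between events.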

Survival function

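$F(t) = 1 - e^{-\int_0^t \lambda_{dw}(\tau)\, d\tau} = 1 - e^{-\int_0^t r \, S \, W(\tau)\, d\tau}$

$F(t) = 1 - \exp\!\left(-r S \left[\dfrac{\alpha I_t}{\mu}\, t + \dfrac{1}{\mu}\left(W(0) - \dfrac{\alpha I_t}{\mu}\right)\left(1 - e^{-\mu t}\right)\right]\right)$

Here $\alpha$ denotes the shedding rate and $\mu$ the die-off rate, with S and $I_t$ held constant between events.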

Drinking Water events

• In step 3, a uniform random number, U, is selected and compared to the value F(t), where t is the time of the next non-drinking water event. F(t) is the probability that a drinking water event occurs by time t. Thus, if U < F(t), a drinking water event occurs; otherwise, one of the four non-drinking water events occurs at time t. If a drinking water event occurs, its time t' is found by setting F(t') = U and solving for t' with a Matlab interpolating algorithm.
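The decision rule can be sketched as follows. This is an illustration under assumptions, not the original Matlab code: it uses the closed form of F(t) obtained by integrating the pathogen equation with S and I held fixed, it swaps in plain bisection for the interpolation step, and the symbol names `alpha` (shedding rate) and `mu` (die-off rate) and all numeric values are made up.

```python
import math


def F(t, r, S, I, W0, alpha, mu):
    """Probability that a drinking-water event occurs by time t,
    holding S and I fixed: 1 - exp(-r * S * integral of W from 0 to t)."""
    steady = alpha * I / mu                       # long-run pathogen level
    integral_W = steady * t + (W0 - steady) * (1.0 - math.exp(-mu * t)) / mu
    return 1.0 - math.exp(-r * S * integral_W)


def drinking_water_event_time(u, t_other, tol=1e-10, **params):
    """Return the drinking-water event time if u < F(t_other), else None.

    Bisection stands in for the interpolation step; F is increasing in t.
    """
    if u >= F(t_other, **params):
        return None                               # a non-drinking-water event occurs at t_other
    lo, hi = 0.0, t_other
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, **params) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


params = dict(r=0.01, S=4, I=3, W0=5.0, alpha=1.0, mu=0.5)   # made-up values
t_dw = drinking_water_event_time(u=0.1, t_other=2.0, **params)
print(t_dw, F(t_dw, **params))
```

Because the solved time satisfies F(t') = U with U < F(t), it always falls before the next non-drinking-water event, which is what lets the drinking-water event pre-empt it.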

Motivation

• VanDerslice and Briscoe showed that:

– The effect of drinking water quality on diarrheal disease varies based on household and community sanitation levels:

• Water quality interventions have the most impact in places with good sanitation

• They have less impact in places with poor sanitation

• Can this be recreated with a disease transmission model?
