Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Comparing Compartment and Agent-basedModels
Shannon GallagherJSM Baltimore, MDAugust 2, 2017
Thesis work with:William F. Eddy (Chair)Joel GreenhouseHoward SeltmanCosma ShaliziSamuel L. Ventura
Goal: Combine two goodmodels into a better one
1
Studying infectious disease is important
2
Compartment vs. Agent-basedModels
Compartment models (CMs) describe how individuals evolve over time
Assumptions (Anderson and May 1992) :
1. Homogeneity of individuals
2. Law of mass actionI(t+ 1) ∝ I(t)
4
Compartment models (CMs) describe how individuals evolve over time
Assumptions (Anderson and May 1992) :
1. Homogeneity of individuals
2. Law of mass actionI(t+ 1) ∝ I(t)
4
Agent-basedmodels (AMs) simulate the spread of disease
Assumptions (Helbing 2002):
1. Heterogeneity of agents
2. Model adequately reflects reality
5
Agent-basedmodels (AMs) simulate the spread of disease
Assumptions (Helbing 2002):
1. Heterogeneity of agents2. Model adequately reflects reality
5
CMs and AMs: a side by side comparison
CMs
∙ Equation-based∙ Computationally fast∙ Homogeneous individuals∙ No individual properties
AMs
∙ Simulation-based∙ Computationally slow∙ Heterogeneous individuals∙ Individual properties
6
Combining the two together
(Bobashev 2007, Banos 2015, Wallentin 2017)
∙ ad hoc approaches
∙ perspective from non-statisticians
Goal: Create a statistically justified hybrid model
7
Combining the two together
(Bobashev 2007, Banos 2015, Wallentin 2017)
∙ ad hoc approaches
∙ perspective from non-statisticians
Goal: Create a statistically justified hybrid model
7
CurrentWork
There are twomain avenues of improvement
1. Quantifying how similar CMs and AMs are
2. Speeding up AM run-time
9
The SIRmodel: a detailed look
(Kermack and McKendrick 1927) dSdt = −βSI
NdIdt = βSI
N − γIdRdt = γI
∙ β – rate of infection∙ γ – rate of recovery∙ N – total population size
10
The SIRmodel: a detailed look
(Kermack and McKendrick 1927) ∆S∆t = −βSI
N∆I∆t = βSI
N − γI∆R∆t = γI
∙ β – rate of infection∙ γ – rate of recovery∙ N – total population size
11
Our stochastic CM approach
S(t+ 1) = S(t)− stR(t+ 1) = R(t) + rtI(t+ 1) = N− S(t+ 1)− R(t+ 1),
with
st+1 ∼ Binomial(S(t), βI(t)N
)rt+1 ∼ Binomial
(I(t), γ
).
12
Our stochastic AM approach
For an agent xn(t), n = 1, 2, . . . ,N, the forward operator for t > 0 is
xn(t+ 1) =
xn(t) + Bernoulli
(βI(t)N
)if xn(t) = 1
xn(t) + Bernoulli (γ) if xn(t) = 2xn(t) otherwise
.
where xn(t) = k, k ∈ {1, 2, 3} corresponds to state S, I, and R, respectively
Let the aggregate total in each compartment be
Xk(t) =N∑
n=1I{xn(t) = k}
13
Themeans overlap
0
25
50
75
100
0 25 50 75 100Time
% o
f Pop
ulat
ion
Type
S−CM
I−CM
R−CM
S−AM
I−AM
R−AM
1000 agents; 5000 runs; β = 0.10; γ = 0.03Mean Proportion of Compartment Values
14
The distributions look the same
15
These approaches are equivalent
Theorem
Let the CM and AM be as previously described. Then for all t ∈ {1, 2, . . . , T},
S(t) d= XS(t) (1)
I(t) d= XI(t)
R(t) d= XR(t).
16
These approaches are equivalent
Theorem
Let the CM and AM be as previously described. Then for all t ∈ {1, 2, . . . , T},
S(t) d= XS(t) (1)
I(t) d= XI(t)
R(t) d= XR(t).
16
We can compare CM/AM pairs and AM/AM pairs by fitting the underlying model
0.028
0.029
0.030
0.031
0.096 0.098 0.100 0.102β
γ
Simulation Type
CM
AM
1000 agents; 5000 runs; β = 0.10; γ = 0.03Fitted SIR parameters distribution
17
AMs are appealing because they can be runmultiple times
∙ Simulate an epidemic en masse!
∙ A run - same initial parameters, different random numbers
∙ Runs (L) are independent of one another =⇒ parallelization
∙ Roughly, the variance of compartments ↓ when N, L ↑
Goal: Improve computation time without sacrificing statistical details
18
AMs are appealing because they can be runmultiple times
∙ Simulate an epidemic en masse!
∙ A run - same initial parameters, different random numbers
∙ Runs (L) are independent of one another =⇒ parallelization
∙ Roughly, the variance of compartments ↓ when N, L ↑
Goal: Improve computation time without sacrificing statistical details
18
There is a tradeoff between the number of agents and number of runs
6
8
10
12
14
0 25 50 75 100Time
V(S
1)V
(S2)
5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Susceptibles
6
8
10
12
14
0 25 50 75 100Time
V(I
1)V
(I2)
5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Infected
6
8
10
12
14
0 25 50 75 100Time
V(R
1)V
(R2)
5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Recovered
19
The calculations show that the variance scales
∙ Note that for a given β and γ, if S1(0)N1
= S2(0)N2
=⇒ S1(t)N1
= S2(t)N2
∙ V[S(t+ 1)
]= S(t)(1− pt)pt + (1− pt)
2V[S(t)
]∙ V[S2(t)] = N2
N1V[S1(t)]
V[
1L1∑
runs ℓS1(t)N1
]V[
1L2∑
runs ℓS2(t)N2
] =L2N2
2L1N2
1· V[S1(t)]V[S2(t)]
=L2N2L1N1
.
We can replace agents with runs!
20
The calculations show that the variance scales
∙ Note that for a given β and γ, if S1(0)N1
= S2(0)N2
=⇒ S1(t)N1
= S2(t)N2
∙ V[S(t+ 1)
]= S(t)(1− pt)pt + (1− pt)
2V[S(t)
]
∙ V[S2(t)] = N2N1V[S1(t)]
V[
1L1∑
runs ℓS1(t)N1
]V[
1L2∑
runs ℓS2(t)N2
] =L2N2
2L1N2
1· V[S1(t)]V[S2(t)]
=L2N2L1N1
.
We can replace agents with runs!
20
The calculations show that the variance scales
∙ Note that for a given β and γ, if S1(0)N1
= S2(0)N2
=⇒ S1(t)N1
= S2(t)N2
∙ V[S(t+ 1)
]= S(t)(1− pt)pt + (1− pt)
2V[S(t)
]∙ V[S2(t)] = N2
N1V[S1(t)]
V[
1L1∑
runs ℓS1(t)N1
]V[
1L2∑
runs ℓS2(t)N2
] =L2N2
2L1N2
1· V[S1(t)]V[S2(t)]
=L2N2L1N1
.
We can replace agents with runs!
20
The calculations show that the variance scales
∙ Note that for a given β and γ, if S1(0)N1
= S2(0)N2
=⇒ S1(t)N1
= S2(t)N2
∙ V[S(t+ 1)
]= S(t)(1− pt)pt + (1− pt)
2V[S(t)
]∙ V[S2(t)] = N2
N1V[S1(t)]
V[
1L1∑
runs ℓS1(t)N1
]V[
1L2∑
runs ℓS2(t)N2
] =L2N2
2L1N2
1· V[S1(t)]V[S2(t)]
=L2N2L1N1
.
We can replace agents with runs!
20
The calculations show that the variance scales
∙ Note that for a given β and γ, if S1(0)N1
= S2(0)N2
=⇒ S1(t)N1
= S2(t)N2
∙ V[S(t+ 1)
]= S(t)(1− pt)pt + (1− pt)
2V[S(t)
]∙ V[S2(t)] = N2
N1V[S1(t)]
V[
1L1∑
runs ℓS1(t)N1
]V[
1L2∑
runs ℓS2(t)N2
] =L2N2
2L1N2
1· V[S1(t)]V[S2(t)]
=L2N2L1N1
.
We can replace agents with runs!20
Through paralellization, we can get a speed-up without losing statistical information
0e+00
2e−04
4e−04
6e−04
0 25 50 75 100TimeV
aria
nce
of ∑ l
S(t)
(NL
)
Simulation
100 agents, 4 cores
400 agents, 1 core
S(t) − % susceptible averaged over # of runs
Variance of S(t)
0e+00
2e−04
4e−04
6e−04
0 25 50 75 100TimeV
aria
nce
of ∑ l
I(t)
(NL
)
Simulation
100 agents, 4 cores
400 agents, 1 core
I(t) − % infected averaged over # of runs
Variance of I(t)
0e+00
2e−04
4e−04
6e−04
0 25 50 75 100TimeV
aria
nce
of ∑ l
R(t
)(N
L)
Simulation
100 agents, 4 cores
400 agents, 1 core
R(t) − % recovered averaged over # of runs
Variance of R(t)
Simulation 1 (100 agents, 4 cores, 100 times): 3:30 minutesSimulation 2 (400 agents, 1 core, 100 times): 4:05 minutes
21
Future work
There is more work to be done: short-term
∙ Implementation of current methods in FRED∙ FRED - an open source, supported, flexible AM∙ Incorporate different levels of homogeneity
1. Independent agents2. Agents go to one other activity (school, work, neighborhood)3. Multiple activities
∙ Compare CM and AM parameters empirically
∙ Empirically determine when different regions can be combined
23
Thank you!
Questions?
24