Comparing Compartment and Agent-based Modelssgallagh/papers/jsm-2017-slides.pdf · Let the aggregate total in each compartment be ^X k(t) = ∑N n=1 Ifxn(t) = kg 13. Themeansoverlap

Comparing Compartment and Agent-basedModels

Shannon GallagherJSM Baltimore, MDAugust 2, 2017

Thesis work with:William F. Eddy (Chair)Joel GreenhouseHoward SeltmanCosma ShaliziSamuel L. Ventura

Goal: Combine two goodmodels into a better one

1

Studying infectious disease is important

2

Compartment vs. Agent-basedModels

Compartment models (CMs) describe how individuals evolve over time

Assumptions (Anderson and May 1992) :

1. Homogeneity of individuals

2. Law of mass actionI(t+ 1) ∝ I(t)

4

Compartment models (CMs) describe how individuals evolve over time

Assumptions (Anderson and May 1992) :

1. Homogeneity of individuals

2. Law of mass actionI(t+ 1) ∝ I(t)

4

Agent-basedmodels (AMs) simulate the spread of disease

Assumptions (Helbing 2002):

1. Heterogeneity of agents

2. Model adequately reflects reality

5

Agent-basedmodels (AMs) simulate the spread of disease

Assumptions (Helbing 2002):

1. Heterogeneity of agents2. Model adequately reflects reality

5

CMs and AMs: a side by side comparison

CMs

∙ Equation-based∙ Computationally fast∙ Homogeneous individuals∙ No individual properties

AMs

∙ Simulation-based∙ Computationally slow∙ Heterogeneous individuals∙ Individual properties

6

Combining the two together

(Bobashev 2007, Banos 2015, Wallentin 2017)

∙ ad hoc approaches

∙ perspective from non-statisticians

Goal: Create a statistically justified hybrid model

7

Combining the two together

(Bobashev 2007, Banos 2015, Wallentin 2017)

∙ ad hoc approaches

∙ perspective from non-statisticians

Goal: Create a statistically justified hybrid model

7

CurrentWork

There are twomain avenues of improvement

1. Quantifying how similar CMs and AMs are

2. Speeding up AM run-time

9

The SIRmodel: a detailed look

(Kermack and McKendrick 1927) dSdt = −βSI

NdIdt = βSI

N − γIdRdt = γI

∙ β – rate of infection∙ γ – rate of recovery∙ N – total population size

10

The SIRmodel: a detailed look

(Kermack and McKendrick 1927) ∆S∆t = −βSI

N∆I∆t = βSI

N − γI∆R∆t = γI

∙ β – rate of infection∙ γ – rate of recovery∙ N – total population size

11

Our stochastic CM approach

S(t+ 1) = S(t)− stR(t+ 1) = R(t) + rtI(t+ 1) = N− S(t+ 1)− R(t+ 1),

with

st+1 ∼ Binomial(S(t), βI(t)N

)rt+1 ∼ Binomial

(I(t), γ

).

12

Our stochastic AM approach

For an agent xn(t), n = 1, 2, . . . ,N, the forward operator for t > 0 is

xn(t+ 1) =

xn(t) + Bernoulli

(βI(t)N

)if xn(t) = 1

xn(t) + Bernoulli (γ) if xn(t) = 2xn(t) otherwise

.

where xn(t) = k, k ∈ {1, 2, 3} corresponds to state S, I, and R, respectively

Let the aggregate total in each compartment be

Xk(t) =N∑

n=1I{xn(t) = k}

13

Themeans overlap

0

25

50

75

100

0 25 50 75 100Time

% o

f Pop

ulat

ion

Type

S−CM

I−CM

R−CM

S−AM

I−AM

R−AM

1000 agents; 5000 runs; β = 0.10; γ = 0.03Mean Proportion of Compartment Values

14

The distributions look the same

15

These approaches are equivalent

Theorem

Let the CM and AM be as previously described. Then for all t ∈ {1, 2, . . . , T},

S(t) d= XS(t) (1)

I(t) d= XI(t)

R(t) d= XR(t).

16

These approaches are equivalent

Theorem

Let the CM and AM be as previously described. Then for all t ∈ {1, 2, . . . , T},

S(t) d= XS(t) (1)

I(t) d= XI(t)

R(t) d= XR(t).

16

We can compare CM/AM pairs and AM/AM pairs by fitting the underlying model

0.028

0.029

0.030

0.031

0.096 0.098 0.100 0.102β

γ

Simulation Type

CM

AM

1000 agents; 5000 runs; β = 0.10; γ = 0.03Fitted SIR parameters distribution

17

AMs are appealing because they can be runmultiple times

∙ Simulate an epidemic en masse!

∙ A run - same initial parameters, different random numbers

∙ Runs (L) are independent of one another =⇒ parallelization

∙ Roughly, the variance of compartments ↓ when N, L ↑

Goal: Improve computation time without sacrificing statistical details

18

AMs are appealing because they can be runmultiple times

∙ Simulate an epidemic en masse!

∙ A run - same initial parameters, different random numbers

∙ Runs (L) are independent of one another =⇒ parallelization

∙ Roughly, the variance of compartments ↓ when N, L ↑

Goal: Improve computation time without sacrificing statistical details

18

There is a tradeoff between the number of agents and number of runs

6

8

10

12

14

0 25 50 75 100Time

V(S

1)V

(S2)

5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Susceptibles

6

8

10

12

14

0 25 50 75 100Time

V(I

1)V

(I2)

5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Infected

6

8

10

12

14

0 25 50 75 100Time

V(R

1)V

(R2)

5000 runs; β = 0.10; γ=0.03; Model 1−1000 agents, Model 2−100 agentsRatio of Variance of # Recovered

19

The calculations show that the variance scales

∙ Note that for a given β and γ, if S1(0)N1

= S2(0)N2

=⇒ S1(t)N1

= S2(t)N2

∙ V[S(t+ 1)

]= S(t)(1− pt)pt + (1− pt)

2V[S(t)

]∙ V[S2(t)] = N2

N1V[S1(t)]

V[

1L1∑

runs ℓS1(t)N1

]V[

1L2∑

runs ℓS2(t)N2

] =L2N2

2L1N2

1· V[S1(t)]V[S2(t)]

=L2N2L1N1

.

We can replace agents with runs!

20



= S2(0)N2

=⇒ S1(t)N1

= S2(t)N2

∙ V[S(t+ 1)

]= S(t)(1− pt)pt + (1− pt)

2V[S(t)

]

∙ V[S2(t)] = N2N1V[S1(t)]

V[

1L1∑

runs ℓS1(t)N1

]V[

1L2∑

runs ℓS2(t)N2

] =L2N2

2L1N2

1· V[S1(t)]V[S2(t)]

=L2N2L1N1

.


20



= S2(0)N2

=⇒ S1(t)N1

= S2(t)N2

∙ V[S(t+ 1)

]= S(t)(1− pt)pt + (1− pt)

2V[S(t)

]∙ V[S2(t)] = N2

N1V[S1(t)]

V[

1L1∑

runs ℓS1(t)N1

]V[

1L2∑

runs ℓS2(t)N2

] =L2N2

2L1N2

1· V[S1(t)]V[S2(t)]

=L2N2L1N1

.


20



= S2(0)N2

=⇒ S1(t)N1

= S2(t)N2

∙ V[S(t+ 1)

]= S(t)(1− pt)pt + (1− pt)

2V[S(t)

]∙ V[S2(t)] = N2

N1V[S1(t)]

V[

1L1∑

runs ℓS1(t)N1

]V[

1L2∑

runs ℓS2(t)N2

] =L2N2

2L1N2

1· V[S1(t)]V[S2(t)]

=L2N2L1N1

.


20



= S2(0)N2

=⇒ S1(t)N1

= S2(t)N2

∙ V[S(t+ 1)

]= S(t)(1− pt)pt + (1− pt)

2V[S(t)

]∙ V[S2(t)] = N2

N1V[S1(t)]

V[

1L1∑

runs ℓS1(t)N1

]V[

1L2∑

runs ℓS2(t)N2

] =L2N2

2L1N2

1· V[S1(t)]V[S2(t)]

=L2N2L1N1

.

We can replace agents with runs!20

Through paralellization, we can get a speed-up without losing statistical information

0e+00

2e−04

4e−04

6e−04

0 25 50 75 100TimeV

aria

nce

of ∑ l

S(t)

(NL

)

Simulation

100 agents, 4 cores

400 agents, 1 core

S(t) − % susceptible averaged over # of runs

Variance of S(t)

0e+00

2e−04

4e−04

6e−04

0 25 50 75 100TimeV

aria

nce

of ∑ l

I(t)

(NL

)

Simulation

100 agents, 4 cores

400 agents, 1 core

I(t) − % infected averaged over # of runs

Variance of I(t)

0e+00

2e−04

4e−04

6e−04

0 25 50 75 100TimeV

aria

nce

of ∑ l

R(t

)(N

L)

Simulation

100 agents, 4 cores

400 agents, 1 core

R(t) − % recovered averaged over # of runs

Variance of R(t)

Simulation 1 (100 agents, 4 cores, 100 times): 3:30 minutesSimulation 2 (400 agents, 1 core, 100 times): 4:05 minutes

21

Future work

There is more work to be done: short-term

∙ Implementation of current methods in FRED∙ FRED - an open source, supported, flexible AM∙ Incorporate different levels of homogeneity

1. Independent agents2. Agents go to one other activity (school, work, neighborhood)3. Multiple activities

∙ Compare CM and AM parameters empirically

∙ Empirically determine when different regions can be combined

23

Thank you!

Questions?

24

Documents

Comparing Compartment and Agent-based Modelssgallagh/papers/jsm-2017-slides.pdf · Let the aggregate total in each compartment be ^X k(t) = ∑N n=1 Ifxn(t) = kg 13. Themeansoverlap