Model-Data Interface Parameter estimation and statistical inference


Page 1: Model-Data Interface

Model-Data Interface: Parameter estimation and statistical inference

Page 2: Model-Data Interface

Parameter estimation

• We've seen that the basic reproductive ratio, R0, is a very important quantity

• How do we calculate it?

• In general, we might not know (many) model parameters. How do we achieve parameter estimation from epidemiological data?

• Review some simple methods

Page 3: Model-Data Interface

1a. Final outbreak size

• From lecture 1, we recall that at the end of the epidemic:
  § $S(\infty) = 1 - R(\infty) = S(0)\, e^{-R(\infty) R_0}$

• So, if we know population size (N), initial susceptibles (to get S(0)), and total number infected (to get R(∞)), we can calculate R0. With S(0) ≈ 1:

$R_0 = -\dfrac{\log(1 - R(\infty))}{R(\infty)}$

Note: Ma & Earn (2006) showed this formula is valid even when numerous assumptions underlying simple SIR are relaxed

Page 4: Model-Data Interface

1. Final outbreak size

• Worked example:

Influenza epidemic in a British boarding school in 1978

N = 764, X(0) = 763, Z(∞) ~ 512

R0 ~ 1.65
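The worked example can be checked numerically. The following is a minimal sketch, assuming the final-size relation S(∞) = 1 − R(∞) = S(0) e^(−R(∞) R0) quoted earlier in the lecture; the function name `r0_from_final_size` is illustrative and not from the slides.

```python
import math

def r0_from_final_size(r_inf, s0=1.0):
    """Solve 1 - R(inf) = S(0) * exp(-R(inf) * R0) for R0.

    r_inf: final fraction of the population infected, R(inf)
    s0:    initial susceptible fraction, S(0); 1.0 is the usual approximation
    """
    if not 0.0 < r_inf < 1.0:
        raise ValueError("r_inf must lie strictly between 0 and 1")
    return -math.log((1.0 - r_inf) / s0) / r_inf

# Boarding school values quoted on the slide
N, X0, Z_inf = 764, 763, 512

r0_simple = r0_from_final_size(Z_inf / N)              # assumes S(0) = 1
r0_with_s0 = r0_from_final_size(Z_inf / N, s0=X0 / N)  # uses S(0) = 763/764

print(f"R0 with S(0)=1: {r0_simple:.3f}")
print(f"R0 with S(0)=X0/N: {r0_with_s0:.3f}")
```

Both variants land close to the slide's R0 ~ 1.65, which shows that the S(0) ≈ 1 approximation matters little here.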

Page 5: Model-Data Interface

1b. Final outbreak size

• Becker showed that with more information, we can also estimate R0 from [equation not recovered from slide] (~1.66)

• Again, we need to know population size (N), initial susceptibles (X0), total number infected (C)

• Usefully, standard error for this formula has also been derived

Page 6: Model-Data Interface

Small aside: mean age at infection

• An epidemiologically interesting quantity is mean age at infection. How do we calculate it in simple models?

• From first principles, it's the mean time spent in the susceptible class

• At equilibrium, this is given by 1/(βI*), which leads to

$A = \dfrac{1}{\mu (R_0 - 1)}$

  § This can be written as R0 − 1 ≈ L/A (L = life expectancy)

  § Historically, this equation's been an important link between epidemiological estimates of A and deriving estimates of R0
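The step from 1/(βI*) to the expression for A can be spelled out. This is a sketch only, assuming the standard SIR model with births and deaths at per-capita rate µ and state variables expressed as proportions; the slide itself does not show the intermediate steps.

```latex
% Assumed model (not shown on the slide): SIR with demography, proportions
%   dS/dt = \mu - \beta S I - \mu S, \qquad dI/dt = \beta S I - (\gamma + \mu) I
% with R_0 = \beta / (\gamma + \mu).
\begin{align*}
  S^* &= \frac{1}{R_0},
  \qquad
  I^* = \frac{\mu}{\beta}\,(R_0 - 1), \\
  A &= \frac{1}{\beta I^*} = \frac{1}{\mu\,(R_0 - 1)}, \\
  L &= \frac{1}{\mu}
  \;\Longrightarrow\;
  R_0 - 1 = \frac{L}{A}.
\end{align*}
```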

Page 7: Model-Data Interface

2. Independent data

• For S(E)IR model, we can calculate average length of time it takes for an individual to acquire infection (assuming born susceptible)

• Expression for Mean Age at Infection is $A \approx \dfrac{1}{\mu (R_0 - 1)}$

R0 is approximately mean life expectancy (L) divided by mean age at infection (A)

Page 8: Model-Data Interface

Measles Age-Stratified Seroprevalence

[Figure: age-stratified measles seroprevalence, with regions labelled "Maternally-derived antibodies" and "Infection-derived immunity"]

Mean age at infection (A) is ~4.5 years. Assume L ~ 75, so R0 ~ 16.6
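The arithmetic behind the measles estimate is short enough to check directly. This is a minimal sketch; the function name and the `add_one` switch are illustrative, and the switch only exposes the difference between the slide's shortcut R0 ≈ L/A and the earlier relation R0 − 1 ≈ L/A.

```python
def r0_from_mean_age(life_expectancy, mean_age_at_infection, add_one=False):
    """Estimate R0 from life expectancy L and mean age at infection A.

    add_one=False uses the shortcut R0 ~ L/A;
    add_one=True uses R0 ~ 1 + L/A (from R0 - 1 ~ L/A).
    """
    ratio = life_expectancy / mean_age_at_infection
    return ratio + 1.0 if add_one else ratio

# Values quoted on the slide: L ~ 75 years, A ~ 4.5 years
r0_shortcut = r0_from_mean_age(75.0, 4.5)
r0_plus_one = r0_from_mean_age(75.0, 4.5, add_one=True)

print(f"L/A     = {r0_shortcut:.2f}")
print(f"1 + L/A = {r0_plus_one:.2f}")
```

For a highly transmissible infection such as measles the two versions differ by only a few percent, which is why the shortcut is often used.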

Page 9: Model-Data Interface

Historical significance

Anderson & May (1982; Science)

Page 10: Model-Data Interface

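3. Epidemic Take-off

A slightly more common approach is to study the epidemic takeoff

Recall from linear stability analysis that, early in the outbreak, $I(t) \approx I(0)\, e^{(R_0 - 1)\gamma t}$

Take logarithms: $\log I(t) \approx \log I(0) + (R_0 - 1)\gamma t$

So, the slope of a regression of log I(t) on t estimates (R0 − 1)γ, which gives R0 once γ is known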

Page 11: Model-Data Interface

3. Epidemic take-off

• Back to schoolboys

Looks like classic exponential take-off

Page 12: Model-Data Interface

Epidemic take-off

Λ = 1.0859

So, R0 = 1.0859 × 2.5 + 1 = 3.7

(2.5 days is our value for the 'flu incubation period)
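The take-off calculation can be sketched as a log-linear regression. The counts below are synthetic, generated from an exact exponential with the slide's Λ = 1.0859; they are not the boarding school data, which are not reproduced in these slides. The helper names are illustrative, and the 2.5-day period is used as 1/γ, as in the slide's arithmetic.

```python
import math

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def r0_from_growth_rate(growth_rate, period):
    """R0 = Lambda * (1/gamma) + 1, with period standing in for 1/gamma."""
    return growth_rate * period + 1.0

# Synthetic early-outbreak counts (assumption: pure exponential growth)
days = [0, 1, 2, 3, 4]
counts = [3.0 * math.exp(1.0859 * t) for t in days]

# Regress log(counts) on time; the slope estimates Lambda
slope = ols_slope(days, [math.log(c) for c in counts])
r0 = r0_from_growth_rate(slope, period=2.5)

print(f"Lambda = {slope:.4f}, R0 = {r0:.2f}")
```

With real data only the early, roughly straight part of the log-scale curve should be used, since growth slows as susceptibles are depleted.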

Page 13: Model-Data Interface

Vynnycky et al. (2007)

Page 14: Model-Data Interface

Vynnycky et al. (2007)

Page 15: Model-Data Interface

Variants on this theme

• Recall $I(t) \approx I(0)\, e^{(R_0 - 1)\gamma t}$

• Let Td be 'doubling time' of outbreak

• Then, $R_0 = \dfrac{\log(2)}{T_d\, \gamma} + 1$
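The doubling-time variant follows from setting I(Td) = 2 I(0) in the exponential growth law, so that Td = log(2)/Λ. A small sketch, with illustrative function names, using the Λ and 2.5-day values quoted earlier in the lecture:

```python
import math

def doubling_time(growth_rate):
    """Time for an exponentially growing outbreak to double: Td = log(2) / Lambda."""
    return math.log(2) / growth_rate

def r0_from_doubling_time(td, gamma):
    """R0 = log(2) / (Td * gamma) + 1."""
    return math.log(2) / (td * gamma) + 1.0

growth_rate = 1.0859   # Lambda from the take-off slide (per day)
gamma = 1.0 / 2.5      # assumes the 2.5-day period is 1/gamma

td = doubling_time(growth_rate)
r0 = r0_from_doubling_time(td, gamma)

print(f"Td = {td:.2f} days, R0 = {r0:.2f}")
```

The result agrees with the growth-rate calculation because the two formulas are algebraically the same.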

Page 16: Model-Data Interface

4. Likelihood & inference

• We focus on random process that (putatively) generated data

• A model is explicit, mathematical description of this random process

• "The likelihood" is probability that data were produced given model and its parameters:

L(model | data) = Pr(data | model)

• Likelihood quantifies (in some sense optimally) model goodness of fit

Page 17: Model-Data Interface

4. Likelihood & estimation

• Assume we have data, D, and model output, M (both are vectors containing state variables). Model predictions generated using set of parameters, θ

• Transmission dynamics subject to
  – "process noise": heterogeneity among individuals, random differences in timing of discrete events (environmental and demographic stochasticity)
  – "observation noise": random errors made in measurement process itself

Page 18: Model-Data Interface

4. Likelihood & estimation

• If we ignore process noise, then model is deterministic and all variability attributed to measurement error

• Observation errors assumed to be sequentially independent

• Maximizing likelihood in this context is called 'trajectory matching'

Page 19: Model-Data Interface

4. Likelihood & estimation

• Data, D
• Model output, M
• Parameters, θ

• If we assume measurement errors are normally distributed, with mean µ and variance σ², then the likelihood is a product of normal densities, one for each data point

Page 20: Model-Data Interface

4. Likelihood & estimation

• Data, D
• Model output, M
• Parameters, θ

• Often easier to deal with log-likelihoods, which turn the product over data points into a sum
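For reference, the likelihood and log-likelihood under these assumptions can be written out. This is a sketch, assuming independent errors $D_i - M_i(\theta)$ with mean zero (µ = 0) and common variance $\sigma^2$, and $n$ data points; the slide's own equations were not recovered.

```latex
\begin{align*}
  L(\theta)
    &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left( -\frac{\bigl(D_i - M_i(\theta)\bigr)^2}{2\sigma^2} \right), \\
  \log L(\theta)
    &= -\frac{n}{2}\,\log\!\left(2\pi\sigma^2\right)
       - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(D_i - M_i(\theta)\bigr)^2 .
\end{align*}
```

Only the last sum depends on θ, so maximizing the log-likelihood over θ is the same as minimizing the sum of squared deviations.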

Page 21: Model-Data Interface

4. Likelihood & estimation

• Under such conditions, Maximum Likelihood Estimate, MLE, is simply parameter set with smallest deviation from data

• Equivalent to using least square errors, to decide on goodness of fit

  – Least Squares Statistic = SSE = Σ(Di − Mi)²

• Then, minimise SSE to arrive at MLE

Page 22: Model-Data Interface

Trajectory matching

[Figure: model trajectory against data; x-axis: Time (days), y-axis: Infecteds; plot title: β=0.006; γ=0.7719; SSE=384519.15]

β = 4.58, γ = 0.7719, SSE = 384519

Page 23: Model-Data Interface

Trajectory matching

[Figure: model trajectory against data; x-axis: Time (days), y-axis: Infecteds; plot title: β=0.004; γ=0.4719; SSE=195130.1341]

β = 3.05, γ = 0.47, SSE = 195130

Page 24: Model-Data Interface

Model estimation: Influenza outbreak

[Figure: Cases against Day of Outbreak (days 1 to 14)]

• Systematically vary β and γ, calculate SSE

• Parameter combination with lowest SSE is 'best fit'

[Figure: surface plots of Sum of Squared Errors (SSE) and Log(SSE) against Transmission rate (β) and Recovery rate (γ)]
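The grid-search procedure described on this slide can be sketched in a few lines. Everything here is illustrative: the Euler integrator, the step size, the grid ranges and the function names are assumptions, and the "data" are synthetic, generated from the model itself with β = 1.96 and γ = 0.47, because the boarding school counts are not reproduced in these slides.

```python
def simulate_sir(beta, gamma, n_pop=764.0, y0=1.0, days=14, dt=0.05):
    """Euler-integrate a closed SIR model; return infecteds Y at the end of each day."""
    x, y = n_pop - y0, y0
    steps_per_day = round(1.0 / dt)
    daily_infecteds = []
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * x * y / n_pop * dt
            recoveries = gamma * y * dt
            x -= new_infections
            y += new_infections - recoveries
        daily_infecteds.append(y)
    return daily_infecteds

def sse(data, model):
    """Sum of squared errors between data and model output."""
    return sum((d - m) ** 2 for d, m in zip(data, model))

def grid_search(data, betas, gammas):
    """Return (lowest SSE, beta, gamma) over all grid combinations."""
    best = None
    for beta in betas:
        for gamma in gammas:
            err = sse(data, simulate_sir(beta, gamma))
            if best is None or err < best[0]:
                best = (err, beta, gamma)
    return best

# Synthetic data from known parameters (assumption, not the real outbreak data)
data = simulate_sir(1.96, 0.47)

betas = [round(1.0 + 0.08 * i, 2) for i in range(26)]   # 1.00 to 3.00
gammas = [round(0.2 + 0.03 * i, 2) for i in range(21)]  # 0.20 to 0.80

best_sse, best_beta, best_gamma = grid_search(data, betas, gammas)
print(f"best fit: beta={best_beta}, gamma={best_gamma}, SSE={best_sse:.3g}")
```

Because the synthetic data are noise-free and the true values lie on the grid, the search recovers them exactly. With real data the minimum generally falls between grid points, which is one reason to refine the grid or move to an optimizer.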

Page 25: Model-Data Interface

Model estimation: Influenza outbreak

[Figure: SSE over Recovery rate (γ) and Transmission rate (β), with SSE profiles along each axis; minimum marked at β = 1.96, γ = 0.47]

Best fit parameter values:
1. β = 1.96 (per day)
2. 1/γ = 2.1 days
3. R0 ~ 4.15

Generally, may have more parameters to fit, so grid search not efficient

Nonlinear optimization algorithms (e.g. Nelder-Mead) would be used

Page 26: Model-Data Interface

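4. Likelihood & estimation

• How do we relate SSE to logLik?

$\log L = -\dfrac{n}{2}\,\log(2\pi\sigma^2) - \dfrac{1}{2\sigma^2}\sum_i (D_i - M_i)^2$

where $\sum_i (D_i - M_i)^2$ = SSE, $\sigma^2$ = SSE/n, and n = length of data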
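As a check on this relation, the log-likelihood can be computed from the SSE reported elsewhere in the lecture. This sketch assumes normal errors with σ² replaced by its estimate SSE/n; the SSE of 4951.6403 is the value shown in the best-fit plot title, n = 14 is the number of data points mentioned in the closing slides, and the function name is illustrative.

```python
import math

def gaussian_loglik_from_sse(sse, n):
    """Log-likelihood under normal errors, with sigma^2 set to SSE / n."""
    sigma2 = sse / n
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - sse / (2.0 * sigma2)

loglik = gaussian_loglik_from_sse(4951.6403, 14)
print(f"logLik = {loglik:.2f}")
```

The result is within rounding of the logLik of −60.95 quoted for the fitted SIR model.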

Page 27: Model-Data Interface

Model estimation: Influenza outbreak

[Figure: Log(SSE) surface over Transmission rate (β) and Recovery rate (γ); labels: SSE, LogLik]

Page 28: Model-Data Interface

Model estimation: Influenza outbreak

Maximum Likelihood Estimates:
1. β = 1.96 (per day)
2. 1/γ = 2.1 days
3. R0 ~ 4.15

Recall 2 log-likelihood units indicate significant difference

Can use likelihood profiles to put confidence intervals on estimates

β = 1.96 (1.90, 2.04)
γ = 0.47 (0.43, 0.50)

Page 29: Model-Data Interface

Model comparison

• How to compare models with different number of estimated parameters?

• Commonly use Akaike's Information Criterion

• AIC = 2p − 2 logLik, where p is number of estimated parameters for model

• Rule of thumb: if AIC difference < 2, models indistinguishable

          SIR                  Model 2
β         1.96 (1.90, 2.04)
γ         0.47 (0.43, 0.50)
logLik    -60.95
AIC       125.9
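The AIC entry in the table can be reproduced from the logLik with p = 2 (β and γ). In the sketch below the second model is purely hypothetical: its logLik and parameter count are made-up numbers used only to show the rule of thumb, since the slide gives no values for Model 2.

```python
def aic(loglik, n_params):
    """Akaike's Information Criterion: AIC = 2p - 2 logLik."""
    return 2 * n_params - 2 * loglik

def indistinguishable(aic_a, aic_b, threshold=2.0):
    """Rule of thumb from the slide: AIC difference below 2 means no clear winner."""
    return abs(aic_a - aic_b) < threshold

aic_sir = aic(-60.95, n_params=2)

# Hypothetical competitor with one extra parameter (illustrative numbers only)
aic_hypothetical = aic(-60.5, n_params=3)

print(f"AIC (SIR) = {aic_sir:.1f}")
print(f"AIC (hypothetical) = {aic_hypothetical:.1f}")
print("indistinguishable:", indistinguishable(aic_sir, aic_hypothetical))
```

In this made-up comparison the extra parameter raises the log-likelihood by less than the penalty it incurs, so the simpler model is not beaten.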

Page 30: Model-Data Interface

Likelihood estimation

β = 1.96, γ = 0.47, Loglik = −60.95

[Figure: fitted model trajectory against data; x-axis: Time (days), y-axis: Infecteds; plot title: β=0.004; γ=0.4719; SSE=4951.6403]

Page 31: Model-Data Interface

Likelihood surface

[Figure: SSE surface (scale ×10^6) over log10(Recovery rate (γ)) and log10(Transmission rate (β))]

When likelihood surface is somewhat complex, success of estimation using iterative optimization algorithms, whether gradient-based or derivative-free (e.g. Nelder-Mead), will depend on providing a good initial guess

Page 32: Model-Data Interface

Caveat

• In boarding school example, data represent number of boys sick ~ Y(t)

• Typically, data are 'incidence' (newly detected or reported infections)

• Don't correspond to any model variables
• May need to 'construct' new information:
  – dC/dt = γY (diagnosis at end of infectiousness)
  – dC/dt = βXY/N

• Set C(t + Δt) = 0 after each observation, where Δt is sampling interval of data, so that C counts only the new cases in each interval
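The accumulator idea can be sketched by adding a case counter C to the model and resetting it after every observation. This is a minimal illustration with an Euler integrator; the function name, step size, and the use of the fitted β = 1.96 and γ = 0.47 are assumptions made for the example.

```python
def simulate_incidence(beta, gamma, at_infection=False, n_pop=764.0, y0=1.0,
                       n_obs=14, obs_interval=1.0, dt=0.01):
    """Return per-interval incidence from an SIR model with an accumulator C.

    at_infection=False: dC/dt = gamma * Y   (cases counted at end of infectiousness)
    at_infection=True:  dC/dt = beta*X*Y/N  (cases counted when infected)
    """
    x, y = n_pop - y0, y0
    c = 0.0
    steps = round(obs_interval / dt)
    cases = []
    for _ in range(n_obs):
        for _ in range(steps):
            new_infections = beta * x * y / n_pop * dt
            recoveries = gamma * y * dt
            x -= new_infections
            y += new_infections - recoveries
            c += new_infections if at_infection else recoveries
        cases.append(c)  # observation: new cases during this interval
        c = 0.0          # reset the accumulator for the next interval
    return cases, x, y

cases_rec, x_end, y_end = simulate_incidence(1.96, 0.47)
cases_inf, x_end2, y_end2 = simulate_incidence(1.96, 0.47, at_infection=True)

print([round(c, 1) for c in cases_rec])
```

The simulated incidence series, rather than Y(t), is then what gets compared with reported case counts when computing the SSE or the likelihood.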

Page 33: Model-Data Interface

Lecture Summary…

• R0 can be estimated from epidemiological data in a variety of ways
  – Final epidemic size
  – Mean age at infection
  – Outbreak exponential growth rate
  – Curve fitting

• In principle, variety of unknown parameters may be estimated from data

Page 34: Model-Data Interface

Further,...

1. Include uncertainty in initial conditions
• We took I(0) = 1. Instead could estimate I(0) together with β and γ (now have 1 fewer data points)

2. Explicit observation model
• Implicitly assumed measurement errors normally distributed with fixed variance, but can relax this assumption

3. What is appropriate model?
• SEIR model? (latent period before becoming infectious)
• SEICR model? ("confinement to bed")
• Time varying parameters? (e.g. action taken to control spread)

Page 35: Model-Data Interface

Further,...

4. Assumed model deterministic: how do we fit a stochastic model?
• Use a 'particle filter' to calculate likelihood

5. Can we simultaneously estimate numerous parameters?
• More complex models have more parameters… estimate all from 14 data points? ⇒ identifiability

6. More complex models are more flexible, so tend to fit better
• How do we determine if increased fit justifies increased complexity? ⇒ information criteria