Disease signatures – a simple combinatorial-type
exploitation of them for our own evil purposes
Prof. Nina H. Fefferman
Visiting DIMACS from :
Tufts Univ. School of Medicine, Dept. Public Health and Family Medicine
Plan for today:
1) Looking very quickly at traditional SIR models
2) Communication problems
3) Tweaking parameter definitions
4) Using these definitions to clear up communication
5) Building disease signatures
6) Decomposing reported disease into component signature curves
7) Checking this method against reality
8) Where this method can take us from here…
A quick look at SIR models
I(t) = number of infected
S(t) = number of susceptibles
R(t) = number of recovered
in the population at time t
And if we want spatial spread :
Keep R, but I(t, x, y) and S(t, x, y) become functions of position (x, y), and a is replaced by an expression involving two other constants related to the rate at which the infection diffuses through space
Pictures of equations stolen from : http://maven.smith.edu/~callahan/ili/pde.html
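For anyone who wants to poke at this, here is a minimal sketch of the non-spatial SIR model in Python. It assumes the common textbook form dS/dt = -a·S·I, dI/dt = a·S·I - b·I, dR/dt = b·I, with a as a transmission constant and b as a recovery constant. That assignment of roles to a and b is an assumption, and every number below is made up for illustration.

```python
def simulate_sir(s0, i0, r0, a, b, days, dt=0.1):
    """Integrate the SIR equations with simple forward Euler steps.

    Assumed form (an illustration, not taken from the slides):
        dS/dt = -a*S*I,  dI/dt = a*S*I - b*I,  dR/dt = b*I
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = a * s * i * dt   # susceptibles who become infected
        new_recoveries = b * i * dt       # infected who recover
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Hypothetical values, chosen only so that an epidemic curve shows up.
history = simulate_sir(s0=990, i0=10, r0=0, a=0.0005, b=0.1, days=100)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"peak of about {peak_infected:.0f} infected near day {peak_time:.0f}")
```

Everything the model predicts hangs on the two constants a and b, which is exactly why it matters whether anyone can measure them.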
Go ask HHS or NIH or CDC for a and b for the next flu season so our models can predict it.
Good luck.
Leads us to : Communication Problems
Parameters/Variables used by epidemiologists are warm and fuzzy and not rigorously defined
So modelers made up their own (you just saw them), but these aren’t things doctors/public health people can really measure, so we can’t get accurate parameter values
Example: MANY people are worried about outbreaks
There is no good definition of what constitutes an outbreak
BIG problem (mostly just ignored)
Modelers use the concept of R0, the reproductive number of the disease (in the differential equation model, it’s the ratio of S to a/b)
For modelers, an outbreak is when the average number of new infections caused by contact with a current infection is greater than 1
Really, if we think about it, public health people want ‘outbreak’ to refer to “times when we need to pay attention to disease spread for some reason”
How can we say this mathematically?
Communication Problems cont.
R0 gives us a rigorous definition of something good, but not of what we really need ‘outbreak’ to mean
Infectivity : Probability of becoming infectious after becoming exposed
Attack rate : Probability of developing disease after becoming exposed
Pathogenicity : Probability of developing disease after becoming infected
Virulence : Probability of dying after becoming ill
Immunogenicity : Attack rate for re-exposure
What can public health people/ doctors measure (at least sometimes)?
Communication Problems cont.
So :
• E(X,T) = Probability of exposure in population X at time T
• I = Probability of infection from exposure
• S_T = Probability that infection at time 0 leads to manifestation of symptoms at time T (a distribution function which does not need to sum to one if not all of the infected develop symptoms)
• C_T = Probability that infection takes T days to become contagious
• M_T = Probability that the time from the onset of symptoms to death from the disease is T days
• N_T = Size of the population possibly exposed to infection on day T (this will be our disease signature curve)
• I_T = Probability of infection from current exposure, given previous infection T days ago
Tweaking Parameter Definitions
Really, these are all functions of time, but my journal referees got upset with functions, so most are now subscripts
Clearing up communication
With those we can build :
Pathogenicity : The probability of developing disease after becoming infected
= $\sum_{T=0}^{n} S_T$, for n the maximum recovery time
Virulence : The probability of dying after becoming ill
= $\sum_{T=0}^{n} M_T$, for n the maximum recovery time
Infectivity : The probability of becoming infectious after becoming exposed
= $I \cdot \sum_{T=0}^{n} C_T$, for n the end of the window for the disease
And : Attack rate : The probability of developing disease after becoming exposed
= $I \cdot \sum_{T=0}^{n} S_T$, for n the end of the window for disease expression
But now we notice that, from our original list, Immunogenicity is not a truly meaningful idea, so we define instead:
Pseudo-immunogenicity : Probability of infection from current exposure, given previous infection T days ago = I_T
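To see how the health-side terms fall out of the measurable pieces, here is a small sketch in Python. The per-day values are invented for a made-up disease with n = 4, and the variable names are just illustrative stand-ins for I and the per-day distributions for symptoms, contagion and death.

```python
def total_probability(distribution):
    """Sum a per-day distribution over T = 0..n."""
    return sum(distribution)


# Hypothetical per-day values (T = 0..4); none of these are measured.
p_infection = 0.5                           # I: P(infection | exposure)
symptom_onset = [0.0, 0.1, 0.3, 0.2, 0.1]   # S_T for T = 0..4
contagion = [0.0, 0.2, 0.4, 0.2, 0.1]       # C_T for T = 0..4
death = [0.0, 0.0, 0.01, 0.02, 0.01]        # M_T for T = 0..4

pathogenicity = total_probability(symptom_onset)
virulence = total_probability(death)
infectivity = p_infection * total_probability(contagion)
attack_rate = p_infection * total_probability(symptom_onset)

print(pathogenicity, virulence, infectivity, attack_rate)
```

Written this way, the attack rate is just I times the pathogenicity, which is the kind of relationship both sides can check against their own numbers.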
Clearing up communication cont.
We won’t be using all of these today, but they’re still useful to have if you ever need to talk to health people
Now both the math and health people have the same picture!
Clearing up communication cont.
But this is only one town
The SIR models could handle spatial spread with PDEs…
Uses a slightly different notation
Clearing up communication cont.
With multiple locations and central reporting :
Notice : different occurrences don’t have to be separated only spatially or temporally
Can be different demographic populations, or anything that allows narrower, more accurate estimations of exposure or susceptibility
Let’s call these narrower things subpopulations
Clearing up communication cont.
For a given subpopulation, we can compute a ‘disease signature curve’ representing the number of cases predicted over time from a single instance of exposure
Notice : these signature curves depend on subpopulation-specific etiology, including the shape of the distribution for some parameters, not just averages
Building Disease Signatures
So, using our definitions and our flow chart:
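As a rough illustration only, here is one way a signature curve could be computed in Python. It assumes the expected number of new symptomatic cases T days after a single exposure event is (people possibly exposed) × (probability of exposure) × (probability of infection) × (probability that symptoms appear on day T). That product form, the function name, and all the numbers are assumptions made for this sketch.

```python
def signature_curve(n_exposed, p_exposure, p_infection, symptom_onset):
    """Expected new symptomatic cases on each day after one exposure event.

    Assumes cases on day T = N * E * I * S_T, which is an illustrative
    guess at how the pieces combine.
    """
    expected_infected = n_exposed * p_exposure * p_infection
    return [expected_infected * s_t for s_t in symptom_onset]


# Two hypothetical subpopulations with differently shaped onset distributions.
children = signature_curve(1000, 0.2, 0.5, [0.0, 0.1, 0.4, 0.3, 0.1])
adults = signature_curve(1000, 0.2, 0.25, [0.0, 0.0, 0.1, 0.2, 0.2, 0.1])

print([round(x, 1) for x in children])
print([round(x, 1) for x in adults])
```

The two curves differ in shape and timing, not just in total size, which is the property the decomposition relies on.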
Decomposing curves into signatures
So, if we have a total reported disease curve, we can iteratively define
(Notice populations exposed on different days are disjoint sets due to the definitions)
Now we can think of a single reported curve C_T as the composition of these curves
Decomposing curves into signatures cont.
Since we are interested in exploiting the heterogeneity of etiological response within a diverse population, we can specify these curves by subpopulation Y:
Yielding the total disease incidence curve:
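The composition step can be sketched in Python as stacking shifted copies of each subpopulation's signature curve. This assumes the total is a plain sum over subpopulations and exposure days; the data layout, names, and numbers are hypothetical.

```python
def total_incidence(signatures, exposures, n_days):
    """Superimpose shifted signature curves into one reported curve.

    signatures: {subpopulation: cases per day after one exposure event}
    exposures:  {subpopulation: {exposure day: number of exposure events}}
    """
    total = [0.0] * n_days
    for subpop, events in exposures.items():
        curve = signatures[subpop]
        for start_day, count in events.items():
            for offset, cases in enumerate(curve):
                day = start_day + offset
                if day < n_days:          # ignore cases past the reporting window
                    total[day] += count * cases
    return total


# Hypothetical signature curves and exposure events.
signatures = {"children": [0, 10, 40, 30, 10], "adults": [0, 0, 5, 10, 10, 5]}
exposures = {"children": {0: 1, 3: 1}, "adults": {1: 2}}
total = total_incidence(signatures, exposures, n_days=10)
print(total)  # [0.0, 10.0, 40.0, 40.0, 40.0, 60.0, 40.0, 10.0, 0.0, 0.0]
```

Going forwards like this is the easy direction. The interesting question is the reverse one: given only the total, which exposures could have produced it?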
Decomposing curves into signatures cont.
And we can even exploit immune memory by further dividing subpopulations into classes of those with similar immune protection from previous infection
With I_T = Probability of infection given previous infection T days ago
And T* = the last day of most recent prior infection
Giving us
Now we can use high school math to find combinations of signature curves that make up the total reported cases curve!
How many different combinations of coins can make $1.50…
Similarly, we can ask how many combinations of ‘signature curves’ can go into a ‘Total Reported Cases’ curve:
[Figure: coins (25¢, 10¢, 5¢) alongside sub-populations]
Important because public health people may trust it
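Here is the coin analogy as a Python sketch. It assumes only 5¢, 10¢ and 25¢ coins are allowed, and it treats signature curves as whole-number case counts so that a brute-force search can look for exact matches. The town names, curves, and the `max_copies` cap are all hypothetical.

```python
from itertools import product


def count_coin_combinations(total_cents, coins=(5, 10, 25)):
    """Count unordered combinations of the given coins that sum to the total."""
    ways = [1] + [0] * total_cents
    for coin in coins:
        for amount in range(coin, total_cents + 1):
            ways[amount] += ways[amount - coin]
    return ways[total_cents]


def curve_combinations(total, signatures, max_copies):
    """Find whole-number multiples of signature curves that sum to the total.

    Every signature curve must have the same length as the total curve.
    """
    names = list(signatures)
    matches = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        combined = [
            sum(k * signatures[name][day] for k, name in zip(counts, names))
            for day in range(len(total))
        ]
        if combined == total:
            matches.append(dict(zip(names, counts)))
    return matches


print(count_coin_combinations(150))  # 58

# Hypothetical signature curves; town_C happens to equal town_A + town_B.
signatures = {
    "town_A": [1, 2, 1, 0],
    "town_B": [0, 1, 2, 1],
    "town_C": [1, 3, 3, 1],
}
for match in curve_combinations([2, 6, 6, 2], signatures, max_copies=3):
    print(match)
```

In this toy case three different combinations reproduce the same total, which is why the number of possible combinations is worth asking about at all. Real curves would be noisy and fractional, so an exact-match search like this is only a cartoon of the idea.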
Decomposing curves into signatures cont.
Now let’s come back to the idea of an outbreak:
Remember, we wanted ‘outbreak’ to mean “times when we need to pay attention to disease spread for some reason”
Suppose that the only combination of disease signature curves that fits has EVERY subpopulation just beginning to show symptoms from a disease. That means that soon many, many more people will be sick, so we should probably pay attention to that
OR
Maybe the only combination of signature curves indicates that only one location has been exposed. We might want to use that to find out what the source of exposure was, or quarantine the area
No matter how we choose to define it (will be arbitrary), this method can tell us WHY we should care now
Decomposing curves into signatures cont.
Let’s take a look at an example of how this can work
To begin with, let’s look at something very simple :
Giardiasis: a waterborne infection causing diarrheal disease in humans with extremely low levels of secondary transmission (makes life simpler)
There was an actual “outbreak” in MA in 1995
Decomposing curves into signatures cont.
Reported incidence for MA (all of it)
HIPAA requires aggregation of data released to the public and to most researchers without special access
Decomposing curves into signatures cont.
To use this method, we need some measured parameter values
I’m cheating a little because I’m assuming values for I, but we could in theory measure this
Decomposing curves into signatures cont.
We know that most of the reporting came from 3 urban centers:
Decomposing curves into signatures cont.
Then we can decompose by demographic subgroup for each town:
Decomposing curves into signatures cont.
That was a really simple disease without any secondary transmission
So what happens if there is secondary spread?
It gets MUCH more complicated…
First of all, the probability of exposure in each subpopulation can start to depend on the levels of infection in each other subpopulation
Now we start getting into the social network stuff
An aside
Social Networks : Oy vey
Since this is a talk and not a course, I can’t leave this as an exercise to the reader, but I can use the ‘we only have a little over an hour’ excuse to hand-wave some of the modeling details on this. I’m going to talk about the concepts
If you are interested in the details, well, that’s why I’m going to be around for the year
Again, rather than using mass averages, let’s still keep the idea of a disease signature
So exposure isn’t a simple underlying rate - it’s based on contacting an infected individual
We can think of individuals in each subgroup as having certain probabilities of interacting with others, possibly in other subgroups
(People in the room who think of social interactions as edges in a graph, this is almost the same: it’s like weighted edges in a complete graph)
Also, membership in particular subgroups can change over time (e.g. children becoming adults)
(In this case, both vertex states and edge weights can be thought of as vertex-state dependent progressions)
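To make the weighted-complete-graph picture a little more concrete, here is a hand-wavy Python sketch of exposure that depends on infection levels in other subgroups. It assumes each edge weight is the probability that a member of one subgroup interacts with someone from another during a time step, and that interactions are independent. Both assumptions, and all the numbers, are for illustration only.

```python
def exposure_probabilities(contact_weights, infected_fraction):
    """Probability of at least one contact with an infected individual.

    contact_weights[g][h]: assumed probability that a member of subgroup g
        interacts with someone from subgroup h in one time step
        (a weighted edge in a complete graph, self-loops included).
    infected_fraction[h]: fraction of subgroup h currently infected.
    """
    probabilities = {}
    for group, weights in contact_weights.items():
        p_no_infected_contact = 1.0
        for other, weight in weights.items():
            # Independence across subgroups is an assumption of this sketch.
            p_no_infected_contact *= 1.0 - weight * infected_fraction[other]
        probabilities[group] = 1.0 - p_no_infected_contact
    return probabilities


# Hypothetical mixing rates and infection levels.
contact_weights = {
    "children": {"children": 0.8, "adults": 0.5},
    "adults": {"children": 0.5, "adults": 0.4},
}
infected_fraction = {"children": 0.1, "adults": 0.0}
exposure = exposure_probabilities(contact_weights, infected_fraction)
print(exposure)
```

Even with no infected adults, adults pick up a nonzero exposure probability through their contact with children, which is the coupling that makes secondary spread so much messier.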
This all gets complicated enough that it’s nerve-wracking not to check model outcomes against some form of reality
Need to :
1. measure all model parameters
2. create disease outbreaks
3. check predicted spread against what actually happens
(I tried to get Thus Spake Zarathustra to play now, but I couldn’t make it work)
My beautiful termites
Checking Reality
On Thursday, at the DIMACS Mixer, I’ll be talking to you about ‘Why Termites’
For now, just go with it
Checking Reality cont.
Spores land on termite
Allogroomed off
Temporary immunity
Burrow through cuticle
Death
The particular details:
Not a termite
Zootermopsis angusticollis
Metarhizium anisopliae
So we built some CA simulation models
Including age-based differences in :
1. direction of wandering through nest
2. interaction rates
3. exposure rates
4. susceptibility to infection from exposure
5. mortality from infection
6. efficacy/duration of induced immunity (via social vaccination)
As the model ran, individuals aged and behaved accordingly
Checking Reality cont.
And…
Thank god, all the work so far has shown that the models predict spread accurately
Whew!
We’re even getting some interesting new directions
Regardless of why specific outputs happen
Now that we know the model can work, we can work backwards
Fit model outcome to observed data and look at which sets of parameter values and behavioral mixing rates produce them
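Working backwards can be sketched as a plain grid search in Python. This toy version fits a single hypothetical parameter (the probability of infection) to a made-up observed curve by least squares. The real problem would involve many parameters and mixing rates at once, and nothing here is measured data.

```python
def predicted_curve(n_exposed, p_infection, symptom_onset):
    """Expected symptomatic cases per day under one candidate parameter value."""
    return [n_exposed * p_infection * s_t for s_t in symptom_onset]


def squared_error(predicted, observed):
    """Sum of squared day-by-day differences between two curves."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def rank_candidates(observed, n_exposed, symptom_onset, candidates):
    """Score every candidate value and return (error, value) pairs, best first."""
    scored = [
        (squared_error(predicted_curve(n_exposed, p, symptom_onset), observed), p)
        for p in candidates
    ]
    return sorted(scored)


# Hypothetical observed counts and onset distribution.
observed = [0, 4, 13, 7, 4]
symptom_onset = [0.0, 0.1, 0.3, 0.2, 0.1]
candidates = [k / 20 for k in range(1, 20)]   # 0.05, 0.10, ..., 0.95

ranking = rank_candidates(observed, 200, symptom_onset, candidates)
best_error, best_p = ranking[0]
print(f"best fit: p_infection = {best_p:.2f} (squared error {best_error:.1f})")
```

Keeping the whole ranking, rather than only the winner, matters here: several parameter sets may fit almost equally well, and which ones do is itself informative.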
This might provide an odd way of understanding human social networks, especially since they can so dramatically affect model output
Maybe this last part is a pipe-dream.
Who knows, but it’s so crazy it just might work…
Thanks for asking me to speak to you
I hope you’ve had fun
Some of what I’ve talked about has been accomplished in collaboration with Elena Naumova, James Traniello and Rebeca Rosengaus
My thanks to the NIH for funding support for this research