Mathematical models of signalling and gene regulatory systems are abstractions of much more complicated processes. Even as more and larger data sets become available, we will not be able to dispense entirely with mechanistic models of real-world processes; nor should we. However, developing informative and realistic models of such systems typically requires suitable statistical inference methods, domain expertise and a modicum of luck. Except for cases where physical principles provide sufficient guidance, it will also generally be possible to come up with a large number of potential models that are compatible with a given biological system and any finite amount of data generated from experiments on that system. Here I will discuss how we can systematically evaluate potentially vast sets of mechanistic candidate models in light of experimental and prior knowledge about biological systems. This enables us to evaluate quantitatively the dependence of model inferences and predictions on the assumed model structures. Failure to consider the impact of structural uncertainty introduces biases into the analysis and potentially gives rise to misleading conclusions.
Gaining Confidence in Signalling and Regulatory Networks
Michael P.H. Stumpf
Theoretical Systems Biology, Imperial College London
15/09/2014
Gaining Confidence in Signalling and Regulatory Networks Michael P.H. Stumpf 1 of 17
Modelling Choices and Opportunities
Data
ODEs, PDEs, SDEs, SSA
Figure 10: A directed network with a feed-forward motif highlighted in colour. In the case of a transcription regulation network the arrows indicate the direction of transcriptional control.
Appendix: Motifs and Networks
A network or a graph is a mathematical object consisting of a set V of vertices, and a set E of edges
connecting vertices. If the edges have arrows or directions then the network is called a directed network.
Many different types of data and relationships may be represented in this way. Examples of undirected
networks are networks of pairs of proteins which are known to interact. The networks considered in this
paper have as their vertices genes which regulate or are regulated by the product of other genes. The edges
then represent the relationship of control, and therefore the network is a directed network, with the
direction of the edges indicating the direction of control. Figure 10 is an example of a directed network.
Motifs, introduced in [1], are small sub-networks or patterns of vertices and edges which occur within the
network. A motif is considered to be interesting if it occurs unusually frequently in the network. In Figure
10 a (feed-forward) motif is highlighted in colour.
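As a minimal sketch of what finding a motif involves, the snippet below counts feed-forward motifs in a directed network stored as a list of edges. The function name `count_feed_forward` and the example edges are hypothetical and are not the network of Figure 10. Deciding whether a count is "unusually frequent" would additionally need a comparison against randomised networks, which is not shown here.

```python
from itertools import permutations

def count_feed_forward(edges):
    """Count feed-forward motifs: ordered triples (a, b, c) with a->b, b->c and a->c."""
    edge_set = set(edges)
    vertices = {v for edge in edge_set for v in edge}
    return sum(
        1
        for a, b, c in permutations(vertices, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )

# Hypothetical example network (illustrative only).
example_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
ffl_count = count_feed_forward(example_edges)
print(ffl_count)
```

Here X regulates Z both directly and via Y, which is the only feed-forward motif in the example.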
Supplementary Material: Ordinary Differential Equations describing model
The following is the system of coupled ordinary differential equations which models the coherent bifan
network, for which full cooperativity occurs. In this simple case, the kinetic parameters used for all four
genes are identical. Note in particular the coupling terms between the DNA elements, DZ and DW, and the
regulatory proteins PX and PY. The functions InX and InY are modelled as offset Heaviside (step)
functions, e.g.:
InX = 100 Θ(3600 − t)
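The offset step input can be sketched in a few lines. The function name `in_x`, its keyword arguments and the value taken exactly at the switch time are assumptions made for illustration; only the amplitude 100 and the offset 3600 come from the text.

```python
import numpy as np

def in_x(t, amplitude=100.0, t_off=3600.0):
    """Offset step input: `amplitude` while t <= t_off, zero afterwards."""
    # np.heaviside(x, h0) returns h0 at x == 0; using 1 at the switch
    # time itself is an assumption, the text does not specify it.
    return amplitude * np.heaviside(t_off - np.asarray(t, dtype=float), 1.0)

values = in_x([0.0, 3599.0, 3600.0, 3601.0])
print(values)
```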
Petri Nets, Boolean Nets, Bayesian Nets
Graphical Models, Relevance Networks
How Do we Gauge Models?
Essentially, all models are wrong, but some are useful.
George E.P. Box
Graphical Models
• In graphical models we assign a probability for each edge to be present or not.
• Thus we infer a probability distribution over networks.
• This depends on the data and the manner in which we calculate the edge probabilities.
Thorne et al., Mol BioSyst (2013).
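To make the step from edge probabilities to a distribution over networks concrete, here is a small sketch. It assumes that edges are present independently of one another, which is a simplification introduced here and not necessarily how the cited work proceeds. The edge probabilities are hypothetical values.

```python
from itertools import product

# Hypothetical posterior edge probabilities (illustrative values only).
edge_prob = {("A", "B"): 0.9, ("B", "C"): 0.6, ("A", "C"): 0.2}

def network_probability(present_edges, edge_prob):
    """Probability of one network, assuming edges are independent."""
    p = 1.0
    for edge, prob in edge_prob.items():
        p *= prob if edge in present_edges else 1.0 - prob
    return p

# Enumerate every network on these candidate edges and its probability.
edges = list(edge_prob)
networks = {}
for mask in product([False, True], repeat=len(edges)):
    present = frozenset(e for e, keep in zip(edges, mask) if keep)
    networks[present] = network_probability(present, edge_prob)

most_probable = max(networks, key=networks.get)
print(sorted(most_probable), networks[most_probable])
```

With three candidate edges there are eight possible networks; enumeration like this stops being feasible quickly as the number of edges grows.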
Proteasome Kinetics
E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P
v = \frac{V_{\max} [S]}{[S] + K_m}
[Portraits: Maud Menten and Leonor Michaelis]
For any mechanistic (biophysical) model any statement is conditional on the assumed model structure.
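The Michaelis–Menten rate law named on this slide can be written as a short function. This is a sketch only: the function name and the parameter values are assumptions chosen for illustration and are not taken from the slides.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Reaction velocity v = v_max * [S] / ([S] + K_m)."""
    return v_max * s / (s + k_m)

# Illustrative parameter values (assumptions).
v_half = michaelis_menten_rate(s=2.0, v_max=10.0, k_m=2.0)     # [S] equal to K_m
v_high = michaelis_menten_rate(s=2000.0, v_max=10.0, k_m=2.0)  # saturating [S]
print(v_half, v_high)
```

Any rate, fitted parameter or prediction obtained this way holds only under the assumed reaction scheme, which is the point the slide makes.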
Useful Models
• If some models are useful in a given context there will probably be quite a few that are useful.
• Vice versa, if a given model is an abstraction of something more complex, then we need to see how robust our analysis is to the underlying assumptions.
Statisticians, like artists, should never fall in love with their models.
George E.P. Box
Model Selection
Kirk et al., Curr.Opin.Biotech., 2013, 24:767-774.
Models and Reality
(Hidden) Assumptions
• Models are oversimplified by design and necessity.
• The underlying assumptions may influence, even bias, what we learn about reality.
• Here we consider how model assumptions affect our analyses.
True Models
[Figure: concentration against time]
\dot{x}(t) = f(x(t), t; \theta)
Too Many Models
[Figure: candidate network structures on three species (nodes 1, 2 and 3); panels A and B]
A) No. of combinations for a coupled ODE system:
B) No. of combinations if species considered independently:
Babtie et al., (2014)
Should we Trust the Data or the Model?
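[Figure: x(t) and y(t) against t, with the observations x_obs]
\frac{dx}{dt} = f(x, y; \theta_x)
\frac{dy}{dt} = g(x, y; \theta_y)
If the model is correct and the data are noiseless then
\dot{x}_{\mathrm{obs}} = \frac{dx}{dt} = f(x, y; \theta_x)
We replace y by y_{\mathrm{obs}} and then consider
\frac{dx}{dt} = f(x, y_{\mathrm{obs}}; \theta_x)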
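The substitution of observed values for y decouples the equation for x, so its parameters can be fitted without solving the full system. The sketch below illustrates this on a hypothetical two-species linear model with noiseless data. The model, the parameter values and the use of finite differences for the gradient are all assumptions made for illustration; finite differences stand in for the derivative of a fitted smooth curve.

```python
import numpy as np

# Hypothetical two-species linear model (an assumption for illustration):
#   dx/dt = f(x, y; theta_x) = -a * x + b * y
#   dy/dt = g(x, y; theta_y) = -c * y
a_true, b_true, c_true = 1.0, 0.5, 0.3
x0, y0 = 1.0, 2.0

# Noiseless "observations" from the analytic solution of the model.
t = np.linspace(0.0, 10.0, 201)
y_obs = y0 * np.exp(-c_true * t)
coeff = b_true * y0 / (a_true - c_true)
x_obs = (x0 - coeff) * np.exp(-a_true * t) + coeff * np.exp(-c_true * t)

# Gradient estimated directly from the data by finite differences.
dxdt_obs = np.gradient(x_obs, t, edge_order=2)

# Replace y by y_obs: f is linear in theta_x = (a, b), so matching
# f(x_obs, y_obs; theta_x) to dxdt_obs is an ordinary least-squares problem.
design = np.column_stack([-x_obs, y_obs])
(a_hat, b_hat), *_ = np.linalg.lstsq(design, dxdt_obs, rcond=None)
print(a_hat, b_hat)
```

Because each species is fitted on its own, the cost grows with the number of species and not with the number of coupled model combinations.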
Gradient Matching
Gaussian Process Regression
We use Gaussian processes (GPs) because:
• They offer flexible descriptions of functions.
• The derivative of a GP is again a GP.
• We can fit them to time-course data.
Kirk & Stumpf, Bioinformatics (2009); Silk et al., Nature Communications (2011)
See also Poster #9, Ann Babtie
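The properties listed above can be illustrated with a bare-bones GP regression in NumPy. This is a sketch under stated assumptions: a zero-mean GP with a squared-exponential kernel, fixed hyperparameters (`length`, `variance`, `noise`) and synthetic sine data, none of which come from the slides or the cited papers.

```python
import numpy as np

def rbf(t1, t2, length, variance):
    """Squared-exponential covariance k(t1, t2)."""
    diff = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (diff / length) ** 2)

def gp_mean_and_derivative(t_train, y_train, t_test,
                           length=1.0, variance=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP and of its time derivative at t_test."""
    k_tt = rbf(t_train, t_train, length, variance) + noise * np.eye(len(t_train))
    alpha = np.linalg.solve(k_tt, y_train)
    k_st = rbf(t_test, t_train, length, variance)
    mean = k_st @ alpha
    # d/dt* k(t*, t) = -(t* - t) / length**2 * k(t*, t). Differentiation is
    # linear, so the derivative of the GP is a GP with this cross-covariance.
    diff = t_test[:, None] - t_train[None, :]
    dmean = (-diff / length**2 * k_st) @ alpha
    return mean, dmean

# Synthetic time course (assumption): samples of sin(t).
t_train = np.linspace(0.0, 6.0, 25)
y_train = np.sin(t_train)
t_test = np.array([1.0, 2.0, 3.0])
mean, dmean = gp_mean_and_derivative(t_train, y_train, t_test)
print(mean, dmean)
```

The derivative estimate `dmean` is what gradient matching compares against the right-hand side of a candidate ODE.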
Method Illustration
[Figure: three candidate network structures on five species (nodes 1 to 5): Model A, Model B, Model C]
Too Many Good Models?
[Figure: Model A, Model B and Model C (networks on five species). Panels A to C: concentration against time for each species and initial condition; model rank (condition 2) against model rank (condition 1)]
Model Fit and Posterior Estimates
Optimal Models
[Figure: posterior Pr(θ|D) against θ; trajectories y(t) against t for the parameters θ1, θ1+δ, θ2 and θ2+δ]
\Pr(M \mid D) \propto \int_{\Omega} \Pr(D \mid \theta)\, \pi(\theta \mid M)\, d\theta \times \pi(M)
Barnes et al., Interface Focus (2011).
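The model posterior above can be approximated by Monte Carlo: draw parameters from the prior π(θ|M), average the likelihood, and multiply by the model prior π(M). The sketch below does this for two hypothetical models that differ only in the width of their parameter prior. The data, the Gaussian likelihood and all names are assumptions made for illustration and do not reproduce the method of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_likelihood(data, theta, sigma=1.0):
    """Gaussian log-likelihood log Pr(D | theta) for each theta in an array."""
    resid = data[None, :] - theta[:, None]
    return (-0.5 * np.sum((resid / sigma) ** 2, axis=1)
            - data.size * np.log(sigma * np.sqrt(2.0 * np.pi)))

def evidence(data, prior_sd, n_samples=200_000):
    """Monte Carlo estimate of the integral of Pr(D|theta) pi(theta|M) dtheta."""
    theta = rng.normal(0.0, prior_sd, size=n_samples)  # draws from pi(theta | M)
    return np.mean(np.exp(log_likelihood(data, theta)))

# Hypothetical data and two hypothetical models differing only in prior width.
data = np.array([0.8, 1.1, 1.3])
prior_sds = {"M1": 1.0, "M2": 10.0}
model_prior = {"M1": 0.5, "M2": 0.5}  # pi(M)

unnormalised = {m: evidence(data, sd) * model_prior[m] for m, sd in prior_sds.items()}
total = sum(unnormalised.values())
posterior = {m: value / total for m, value in unnormalised.items()}
print(posterior)
```

The model with the broad prior spreads its probability over many parameter values that fit poorly, so it receives less posterior support even though its best fit is just as good.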
How Do We Gain Confidence in Models
Graphical Models
• Statistical confidence is assessed automatically; robustness is enforced in Bayesian frameworks.
• Other network methods (e.g. correlation-based networks) are harder to assess.
Mechanistic Models
• Parametric sensitivity/robustness analysis (PSA) provides some measure of overfitting or appropriateness of the model.
• Topological sensitivity analysis (TSA) allows us to assess which model features are strongly supported.
• Where there is a strong biophysical rationale, models can be set up and compared directly; ideally followed by PSA and TSA.
• However, experimental design will typically influence which model is seen as preferable and avoiding this takes effort.
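One simple ingredient of a parametric sensitivity analysis is a local sensitivity coefficient: how strongly a model output responds to a small relative change in one parameter. The sketch below estimates this by central finite differences for a Michaelis–Menten rate. The helper `relative_sensitivity` and the parameter values are hypothetical, and the PSA referred to on the slide may well be broader than this local measure.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Reaction velocity v = v_max * [S] / ([S] + K_m)."""
    return v_max * s / (s + k_m)

def relative_sensitivity(model, params, name, rel_step=1e-4, **inputs):
    """Central-difference estimate of d ln(output) / d ln(parameter)."""
    up = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down = dict(params)
    down[name] = params[name] * (1 - rel_step)
    base = model(**inputs, **params)
    return (model(**inputs, **up) - model(**inputs, **down)) / (2 * rel_step * base)

# Illustrative parameter values (assumptions).
params = {"v_max": 10.0, "k_m": 2.0}
sens = {name: relative_sensitivity(michaelis_menten_rate, params, name, s=2.0)
        for name in params}
print(sens)
```

At [S] equal to K_m the rate scales one-to-one with v_max and changes by about half a percent, in the opposite direction, for each percent change in K_m.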
Models of Proteasome Action
Proteasome Kinetics
E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P
v = \frac{V_{\max} [S]}{[S] + K_m}
[Portraits: Maud Menten and Leonor Michaelis]
Liepe et al., Biomolecules (2014).
Proteasomal degradation is much more tightly and actively controlled than can be accounted for in Michaelis–Menten kinetics.
Improving Models of Proteasome Dynamics
Liepe et al., PLoS Comp Biol (2013); Silk et al., PLoS Comp Biol (2014).
None of these models can fit the time-resolved data.
Liepe et al., (2014).
Toni et al., J. Roy. Soc. Interface (2009); Toni & Stumpf, Bioinformatics (2010); Liepe et al., Nature Protocols (2014).
Acknowledgements