Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Stochastic Simulation of Chemical ReactionsComputational Models for Complex Systems
Paolo Milazzo
Dipartimento di Informatica, Universita di Pisahttp://pages.di.unipi.it/milazzo
milazzo di.unipi.it
Laurea Magistrale in InformaticaA.Y. 2018/2019
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 1 / 28
Introduction
We have seen that the dynamics of chemical reactions can be studied byanalyzing the associated system of ODEs obtained through the applicationof the law of mass action
ODEs are continuous (both in variables values and time) and deterministic
Sometimes chemical reactions exhibit random behaviors
This led to the definition of stochastic simulation algorithms forchemical reactions
See also:
Stochastic Simulation of Chemical Kinetics by Daniel T. GillespieFreely accessible if you are within the UniPi subnet
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 2 / 28
Randomness in chemical reactions (1)
The occurrence of a chemical reaction in a chemical solution is actuallydifficult to be predicted in advance.
the movement of molecules in the chemical solution is related withthe interaction (elastic collisions) of the molecules with the fluidmedium containing them
so, if we ignore such low level interaction, we cannot predict exactlywhen a specific combination of reactants will meet (and react) in thechemical solution
even in the case of a reaction with a single reactant (e.g. a
dissociation Ak→ B + C ) it is not possibile to predict exactly when
the reactant will “decide” to react
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 3 / 28
Randomness in chemical reactions (2)
In the case of high concentrations of reactants, the law of large numbersallow us to ignore random aspects of chemical reactions
This justifies the use of ODEs
But when numbers are small (one or a few molecules)
random aspects become crucial
and also the use of descrete variables becomes necessary
Example, think about Ak→ B + C and assume that only one molecule A is
present in the solution.
ODEs would describe the concentration of A to pass slowly (andcontinuously) from 1 to 0
in the real system the number of instances of A would pass from 1 to0 with a discrete (instantaneous) change
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 4 / 28
Gillespie’s Stochastic Approach
Gillespie’s Stochastic Simulation Algorithm (SSA) is an exact procedurefor simulating the time evolution of a chemical reacting system by takingproper account of the randomness inherent in such a system.
Given a set of reactions R = {R1, . . . ,Rn}, the SSA:
assumes a stochastic reaction constant cµ for each chemical reactionRµ ∈ Rsuch that cµdt is the probability that a particular combination ofreactants of Rµ react in an infinitesimal time interval dt
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 5 / 28
Gillespie’s Stochastic Approach
The constant cµ is used to compute the propensity (or stochastic rate) ofRµ to occur in the whole chemical solution, denoted aµ, as follows:
aµ = hµcµ
where hµ is the number of distinct molecular reactant combinations.
Let Rµ be
`1S1 + . . .+ `ρSρc−→ `′1P1 + . . .+ `′γPγ
In accordance with standard combinatorics, the number of distint reactantcombinations of Rµ in a solution with Xi molecules of Si , with 1 ≤ i ≤ ρ,is given by
hµ =
ρ∏i=1
(Xi
`i
)
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 6 / 28
Gillespie’s Stochastic ApproachExample:solution with X1 molecules S1 and X2 molecules S2
(remark: number of molecules, not concentrations...)
reaction R1 : S1 + S2 → 2S1
h1 =(X1
1
)(X21
)= X1X2
a1 = X1X2c1
reaction R2 : 2S1 → S1 + S2
h2 =(X1
2
)= X1(X1−1)
2
a2 = X1(X1−1)2 c2
Note that propensity aµ is similar, for suitable kinetic constants, to themass action rates (note: here unitary volume is assumed):
For R1 with k1 = c1, the law of mass action gives k1[S1][S2] ≈ a1
For R2 with k2 = c2/2, the law of mass action gives k2[S1]2 ≈ a2
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 7 / 28
Gillespie’s Stochastic Approach
Propensity aµ is used in Gillespie’s approach as the parameter of anexponential probability distribution modelling the time between subsequentoccurrences of reaction Rµ.
Exponential distrubution is a continuous probability distribution (on[0,∞]) describing the timing between events in a Poisson process, namelya process in which events occur continuously and independently at aconstant average rate (taken as parameter).
The probability density function f and the cumulative distribution functionF of an exponential distribution with parameter λ are as follows:
f (x) =
{λe−λx x ≥ 0
0 x < 0F (x) =
{1− e−λx x ≥ 0
0 x < 0
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 8 / 28
Gillespie’s Stochastic Approach
Probability density function Cumulative distribution function
The mean of an exponentially distributed variable with parameter λ is 1λ .
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 9 / 28
Gillespie’s Stochastic Approach
Two important properties of the exponential distribution hold:
The exponential distribution is memoryless:
P(X > t + s | X > s) = P(X > t)
This allows a simulation algorithm in which the exponentialdistribution is used to forget about the history of the simulation
Let X1, . . . ,Xn be independent exponentially distributed randomvariables with parameters λ1, . . . , λn. Then
X = min(X1, . . . ,Xn) is also exponentially distributed
with parameter λ = λ1 + . . .+ λn. This allows a simulation algorithmto use a unique exponential distribution for the whole set of reactionsto be simulated
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 10 / 28
Gillespie’s Stochastic Simulation Algorithm (SSA)
Given:
a set of molecular species {S1, . . . ,Sn}initial numbers of molecules of each species {X1, . . .Xn} with Xi ∈ IN
a set of chemical reactions {R1, . . .RM}Gillespie’s algorithm computes a possible evolution of the system
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 11 / 28
Gillespie’s Stochastic Simulation Algorithm (SSA)
Gillespie’s algorithm:
The state of the simulation:
is a vector representing the multiset of molecules in the chemicalsolution (initially [X1, . . . ,Xn])
a real variable t representing the simulation time (initially t = 0)
The algorithm iterates the following steps until t reaches a final value tstop.
1 The time t + τ at which the next reaction will occur is randomlychosen with τ exponentially distributed with parameter a0 =
∑Mν=1 aν ;
2 The reaction Rµ that has to occur at time t + τ is randomly chosenwith probability
aµ∑Mν=1 aν
(that isaµa0
).
At each step t is incremented by τ and the multiset representing thechemical solution is updated by subtracting reactants and adding products.
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 12 / 28
Gillespie’s Stochastic Simulation Algorithm (SSA)Example: Let’s consider the following reactions:
R1 : A2→ B + C R2 : B + C
0.1→ A
and the following initial quantities: A0 = 10,B0 = 5,C0 = 20 representedas [10, 5, 20] in vector notation.
The initial state is X = [10, 5, 20], t = 0.
The first iteration of Gillespie’s algorithm is as follows:
compute propensities:I a1 = 10 · 2 = 20 a2 = 5 · 20 · 0.1 = 10 a0 = a1 + a2 = 30
generate τ exponentially distributed with parameter a0 = 30I the average τ is 1/a0 = 1/30 ∼ 0.0333
choose the reaction that has to occur: R1 with probabilitya1/a0 = 10/30 and R2 with probability a2/a0 = 20/30
If R2 is chosen, the state of the simulation becomes X = [11, 4, 19], t = τPaolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 13 / 28
Gillespie’s algorithm: implementation detailsImplementation detail: Generation of τ (exponentially distributed)
A random number with any probability distribution can be computedfrom a random number with uniform distribution by applying theinversion sampling method
The idea is to use the inverse of the cumulative distribution function
Given the cumulative distribution function F of a probabilitydistribution dist and a uniformly distributed random variable U, thevariable X = F−1(U) is a random variable with distribution dist
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 14 / 28
Gillespie’s algorithm: implementation details
In the case of the exponential distribution, the cumulative distributionfunction (for x ≥ 0) is F (x) = 1− e−λx
Let’s invert the function:
F (x) = 1− e−λx =⇒ 1− F (x) = e−λx =⇒ ln(1− F (x)) = −λx
=⇒ − 1
λln(1− F (x)) = x =⇒ 1
λln(
1
1− F (x)) = x
So, we obtain F−1(Y ) = 1λ ln( 1
1−Y ). Since Y is uniformly distributed, also1− Y is uniformly distributed. This allows us to simplify the definition ofF−1 as follows: F−1(Y ) = 1
λ ln( 1Y )
τ exponentially distributed with parameter a0 can be computed asτ = 1
a0ln( 1
Y ) with Y obtained from a standard random numbergenerator
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 15 / 28
Gillespie’s algorithm: implementation detailsImplementation detail: Choice of reaction Rµ (with probability
aµa0
)
Idea:1 generate a random number N uniformly distributed in [0, a0), that is
N = n · a0 with n ∈ [0, 1) obtained from a standard number generator2 Start summing a1, a2, . . .3 as soon as the sum becomes greather than N, the number of
completed iterations gives you µ
Formally:
µ is the smallest integer k satisfying∑k
i=1 ai > na0 with n uniformlydistributed in [0, 1)
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 16 / 28
ODEs vs SSALet us compare the deterministic and stochastic approach with someexamples of (bio)chemical reactions:
First example: Enzymatic activity: E + S0.3
10.0ES
0.01→ E + P
Starting with: 100E and 100S .
ODEs SSA
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 17 / 28
ODEs vs SSASecond example: Lotka/Volterra reactions:
Y110→ 2Y1
Y1 + Y20.01→ 2Y2
Y210→ Z
Starting with 1000Y1 and 1000Y2
ODEsPaolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 18 / 28
ODEs vs SSASecond example: Lotka/Volterra reactions:
Y110→ 2Y1
Y1 + Y20.01→ 2Y2
Y210→ Z
Starting with 1000Y1 and 900Y2 (slight perturbation).
ODEsPaolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 19 / 28
ODEs vs SSASecond example: Lotka/Volterra reactions:
Y110→ 2Y1
Y1 + Y20.01→ 2Y2
Y210→ Z
Starting with 1000Y1 and 1000Y2
SSAPaolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 20 / 28
ODEs vs SSAThird example: Negative feedback loop:
G110−→ G1 + P1 P1 + G2
102P1G2 P1
1−→
G210000−−−→ G2 + P2 P2 + G3
0.120
P2G3 P2100−−→
G310−→ G3 + P3 P3 + G1
1020
P3G1 P31−→
Starting with Gi = 1 and Pi = 0
ODEs SSAPaolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 21 / 28
Computational cost of Gillespie’s algorithm
The computational cost of this detailed stochastic simulation algorithmmay become extremely high for large models
the key issue is that the time elapsing between two reactions canbecome extremely small
The algorithm becomes very inefficient when:
there are large number of molecules
kinetic constant are high
that is, when reaction rates become increase...
Computational cost is the main disadvantage of stochastic simulation withrespect to ODEs
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 22 / 28
Computational cost of Gillespie’s algorithm
Several variants of Gillespie’s algorithm aimed at reducing thecomputational cost have been proposed.
Exact approaches (improving the computation of each step):
Gibson and Bruck proposed the use of some efficient data structures(indexed binary tree priority queues) to improve the choice of thereaction Rµ to occur at each step
Cao et al. and McCollum et al. proposed dynamical orderingstrategies for reaction propensities a1, a2, . . . in order toprobabilistically reduce the time needed to choose Rµ at each step
See lecture notes for more details...
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 23 / 28
Computational cost of Gillespie’s algorithmSeveral variants of Gillespie’s algorithm aimed at reducing thecomputational cost have been proposed.
Approximate approaches (reducing the number of steps):
Gillespie proposed the τ -leaping method: the idea is to allow severalreactions to take place in a single (longer) time step, under thecondition that reaction rates do not change too much during thattime.
Gillespie et al. proposed the slow-scale Stochastic SimulationAlgorithm (ssSSA) which separates fast reactions from slow reactions.At each step fast reactions are dealt with by assuming that they reacha dynamic equilibrium (so, only their steady state is computed). Slowreaction are simulated one by one as in the standard SSA.
Hybrid simulation is a technique which combines ODEs withstochastic simulation: ODEs are applied to molecules occurring in bignumbers, stochastic simulation to molecules occurring in smallnumbers
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 24 / 28
Implementations
Tools and libraries for the numerical simulation of chemical reactionsystems often implement also (some variants of) Gillespie’s algorithm:
COPASI (http://copasi.org/)Professional (free) tool
libRoadRunner (http://libroadrunner.org/)Library for C++/Python
Dizzy (available on the course web page)Multiplatform simulation tool. Unmantained, but very simple...
Many other tools...See sbml.org/SBML_Software_Guide/SBML_Software_Summary
for a list
In addition:
StochKit (https://sourceforge.net/projects/stochkit/) is astate-of-the-art C++ library (with a Python binding) for thestochastic simulation of chemical reactions
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 25 / 28
Implementations
In Dizzy, stochastic simulation algorithms (Gillespie SSA, Gibson&Bruck,and τ -leaping) can be selected from the simulation window
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 26 / 28
Lessons learnt
Summing up:
Stochastic (and discrete) simulation methods are more accurate thanODEs for the analysis of reaction systems, in particular whenmolecules are present in small quantities
Gillespie’s algorithm is THE stochastic simulation algorithm
Gillespie’s algorithm perform one step for each occurrence of areaction in the chemical solution
I performance problems...
Approximate variants of Gillespie’s algorithm improve performances(to some extent) with still a good accuracy
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 27 / 28
Exercises
Implement Gillespie’s algorithm in your favorite programminglanguage
I inspect its execution with a profiler... where does it spend the majorityof its execution time?
I test some implementation strategies to improve the computation ofeach single step (e.g. McCollum dynamical ordering strategy, or yourown strategy). How much do they improve performances?
Paolo Milazzo (Universita di Pisa) CMCS - Stochastic Simulation A.Y. 2018/2019 28 / 28