25
François Fages CPCV, March 2004 Constraint-based Model Checking of Hybrid Systems: A First Experiment in Systems Biology François Fages, INRIA Rocquencourt http://contraintes.inria.fr/ Joint work with and Nathalie Chabrier-Rivier Sylvain Soliman In collaboration with ARC CPBIO http://contraintes.inria.fr/cpbio Alexander Bockmayr, Vincent Danos, Vincent Schächter et al.

François Fages CPCV, March 2004 Constraint-based Model Checking of Hybrid Systems: A First Experiment in Systems Biology François Fages, INRIA Rocquencourt

  • View
    214

  • Download
    1

Embed Size (px)

Citation preview

François Fages CPCV, March 2004

Constraint-based Model Checking of Hybrid Systems:

A First Experiment in Systems Biology

François Fages, INRIA Rocquencourt http://contraintes.inria.fr/

Joint work with andNathalie Chabrier-Rivier Sylvain Soliman

In collaboration with ARC CPBIO http://contraintes.inria.fr/cpbioAlexander Bockmayr, Vincent Danos, Vincent Schächter et al.

François Fages CPCV, March 2004

Current revolution in Biology

• Elucidation of high-level biological processes

in terms of their biochemical basis at the molecular level.

• Mass production of genomic and post-genomic data:

ARN expression, protein synthesis, protein-protein interactions,…

• Need for a strong parallel effort on the formal representation of biological processes: Systems Biology.

• Need for formal tools for modeling and reasoning about their global behavior.

François Fages CPCV, March 2004

Formalisms for modeling biochemical systems

• Diagrammatic notation • Boolean networks [Thomas 73]• Milner’s pi–calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]• Concurrent transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter

03] Biochemical abstract machine BIOCHAM [Chabrier-Fages-Soliman 03] Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]

• Differential equations • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]• Hybrid concurrent constraint languages [Bockmayr-Courtois 01]

François Fages CPCV, March 2004

Our goal

Beyond simulation, provide formal tools for querying, validating and completing biological models.

Our proposal:

• Use of temporal logic CTL as a query language for models of biological processes;

• Use of concurrent transition systems for their modeling;

• Use of symbolic and constraint-based model checkers for automatically evaluating CTL queries in qualitative and quantitative models.

• Use of inductive logic programming for learning models

In course, learn and teach bits of biology with logic programs.

François Fages CPCV, March 2004

Plan of the talk

1. Introduction

2. The Biochemical Abstract Machine BIOCHAM• Simple algebra of cell compounds

• Modeling reactions with concurrent transition systems

3. Temporal logic CTL as a query language• Example of the MAPK signaling pathway

• Symbolic model-checking with NuSMV in BIOCHAM

4. Kinetics models• Constraint-based model checking with DMC

5. Conclusion and perspectives

François Fages CPCV, March 2004

2. A Simple Algebra of Cell Molecules

Small molecules: covalent bonds (outer electrons shared) 50-200 kcal/mol

• 70% water

• 1% ions

• 6% amino acids (20), nucleotides (5),

fats, sugars, ATP, ADP, …

Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol

Stability and bindings determined by the number of weak bonds: 3D shape

• 20% proteins (50-104 amino acids)

• RNA (102-104 nucleotides AGCU)

• DNA (102-106 nucleotides AGCT)

François Fages CPCV, March 2004

Formal proteins

Cyclin dependent kinase 1 Cdk1

(free, inactive)

Complex Cdk1-Cyclin B Cdk1–CycB

(low activity)

Phosphorylated form Cdk1~{thr161}-CycB

at site threonine 161

(high activity)

(BIOCHAM syntax)

François Fages CPCV, March 2004

Algebra of Cell Molecules

E ::= Name|E-E|E~{E,…,E}|(E) S ::= _|E+S

Names: molecules, proteins, #gene binding sites, abstract @processes…

- : binding operator for protein complexes, gene binding sites, …

Associative and commutative.

~{…}: modification operator for phosphorylated sites, …

Set (Associative, Commutative, Idempotent).

+ : solution operator, “soup aspect”, Assoc. Comm. Idempotent, Neutral _

No membranes, no transport formalized. Bitonal calculi [Cardelli 03].

François Fages CPCV, March 2004

Concurrent Transition Syst. of Biochemical Reactions

Enzymatic reactions:

R ::= S=>S | S=[E]=>S | S=[R]=>S | S<=>S | S<=[E]=>S

(where A<=>B stands for A=>B B=>A and A=[C]=>B for A+C=>B+C, etc.)

define a concurrent transition system over integers denoting the multiplicity of the molecules (multiset rewriting).

One can associate a finite abstract CTS over boolean state variables denoting the presence/absence of molecules

which correctly over-approximates the set of all possible behaviors

a reaction A+B=>C+D is translated with 4 rules for possible consumption:

A+BA+B+C+D A+BA+B +C+D

A+BA+B+C+D A+BA+B+C+D

François Fages CPCV, March 2004

Six Rule Schemas

Complexation: A + B => A-B Decomplexation A-B => A + B

Cdk1+CycB => Cdk1–CycB

Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A

Cdk1–CycB =[Myt1]=> Cdk1~{thr161}-CycB

Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

Synthesis: _ =[C]=> A.

_ =[#Ge2-E2f13-Dp12]=> CycA

Degradation: A =[C]=> _.

CycE =[@UbiPro]=> _ (not for CycE-Cdk2 which is stable)

François Fages CPCV, March 2004

3. Temporal Logic CTL as a Query Language

Computation Tree Logic

Time

Non-determinism E, A

F,G,U EF

EU

AG

Choice

Time

E

exists 

A

always

X

next time

EX() AX()

F

finally

EF()

AG()

AF()

liveness

G

globally

EG()

AF( )

AG()

safety

U

untilE (U ) A (U )

François Fages CPCV, March 2004

Biological Queries

About reachability:

• Given an initial state init, can the cell produce some protein P? init EF(P)

• Which are the states from which a set of products P1,. . . , Pn can be produced simultaneously? EF(P1^…^Pn)

About pathways:

• Can the cell reach a state s while passing by another state s2? init EF(s2^EFs)

• Is state s2 a necessary checkpoint for reaching state s? EF(s2U s)

• Can the cell reach a state s without violating some constraints c? init EF(c U s)

François Fages CPCV, March 2004

Biological Queries

About stability:

• Is a certain (partially described) state s a stable state? sAG(s) sAG(s) (s denotes both the state and the formula describing it).

• Is s a steady state (with possibility of escaping) ? sEG(s)

• Can the cell reach a stable state? initEF(AG(s))not a LTL formula.

• Must the cell reach a stable state? initAF(AG(s))

• What are the stable states? Not expressible in CTL [Chan 00].

• Can the system exhibit a cyclic behavior w.r.t. the presence of P ? init EG((P EF P) ^ (P EF P))

François Fages CPCV, March 2004

MAPK Signaling Pathway

RAF + RAFK <=> RAF-RAFK.RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.

RAF-RAFK => RAFK + RAF~{p1}.RAF~{p1}-RAFPH => RAF + RAFPH.MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.MEK~{p1}-MEKPH => MEK + MEKPH.MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.MAPK~{p1}-MAPKPH => MAPK + MAPKPH.MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.

François Fages CPCV, March 2004

MAPK Signaling Pathway

MEK~{p1} is a checkpoint for producing MAPK~{p1,p2}biocham: !E(!MEK~{p1} U MAPK~{p1,p2})True

The PH complexes are not compulsory for the cascadebiocham: !E(!MEK~{p1}-MEKPH U MAPK~{p1,p2})falseStep 1 rule 15 Step 2 rule 1 RAF-RAFK presentStep 3 rule 21 RAF~{p1} presentStep 4 rule 5 MEK-RAF~{p1} presentStep 5 rule 24 MEK~{p1} presentStep 6 rule 7 MEK~{p1}-RAF~{p1} presentStep 7 rule 23 MEK~{p1,p2} presentStep 8 rule 13 MAPK-MEK~{p1,p2} presentStep 9 rule 27 MAPK~{p1} presentStep 10 rule 15 MAPK~{p1}-MEK~{p1,p2} presentStep 11 rule 28 MAPK~{p1,p2} present

François Fages CPCV, March 2004

Mammalian Cell Cycle Control Map [Kohn 99]

François Fages CPCV, March 2004

Mammalian Cell Cycle Control Benchmark

700 rules, 165 proteins and genes, 500 variables, 2500 states.

BIOCHAM NuSMV model-checker time in seconds:

Initial state G2 Query: Time:

compiling 29

Reachability G1 EF CycE 2

Reachability G1 EF CycD 1.9

Reachability G1 EF PCNA-CycD 1.7

Checkpoint

for mitosis complex

EF ( Cdc25~{Nterm}

U Cdk1~{Thr161}-CycB)

2.2

Cycle EG ( (CycA EF CycA) ( CycA EF CycA))

31.8

François Fages CPCV, March 2004

4. Kinetics Models

Enzymatic reactions with rates k1 k2 k3

E+S k1 C k2 E+P

E+S k3 C

can be compiled by the law of mass action into a system of

Michaelis-Menten Ordinary Differential Equations (non-linear)

dE/dt = -k1ES+(k2+k3)C

dS/dt = -k1ES+k3C

dC/dt = k1ES-(k2+k3)C

dP/dt = k2C

François Fages CPCV, March 2004

MAPK kinetics model

François Fages CPCV, March 2004

Gene Interaction Networks

Gene interaction example [Bockmayr-Courtois 01]

Hybrid Concurrent Constraint Programming HCC [Saraswat et al.]

2 genes x and y.

Hybrid linear approximation

dx/dt = 0.01 – 0.02*x if y < 0.8

dx/dt = – 0.02*x if y ≥ 0.8

dy/dt = 0.01*x

François Fages CPCV, March 2004

Concurrent Transition System

Time discretization using Euler’s method:

y < 0.8 x’ = x + dt*(0.01-0.02*x) , y’ = y + dt*0.01*x

y ≥ 0.8 x’ = x + dt*(0.01-0.02*x) , y’ = y + dt*0.01*x

Initial condition: x=0, y=0.

CLP(R) program (dt=1)Init :- X=0, Y=0, p(X,Y).

p(X,Y):-X>=0, Y>=0, Y<0.8,

X1=X-0.02*X+0.01, Y1=Y+0.01*X, p(X1,Y1).

p(X,Y):-X>=0, Y>=0, Y>=0.8,

X1=X-0.02*X, Y1=Y+0.01*X, p(X1,Y1).

François Fages CPCV, March 2004

Proving CTL properties by computing fixpoints of CLP programs

Theorem [Delzanno Podelski 99]

EF(f)=lfp(TP{p(x):-f}),

EG(f)=gfp(TPf ).

Safety property AG(f) iff EF(f) iff initlfp(TP{f})

Liveness property AG(f1AF(f2)) iff initlfp(TPf1gfp(T P{f2} ) )

Implementation in Sicstus-Prolog CLP(R,B) [Delzanno 00]

François Fages CPCV, March 2004

Deductive Model Checker DMC: Gene Interaction

r(init, p(s_s,A,B), {A=0,B=0}).r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0.8,C=A-

0.02*A,D=B+0.01*A}).r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0,B<0.8, C=A-0.02*A+0.01,D=B+0.01*A}).| ?- prop(P,S).P = unsafe, S = p:s*(x>=0.6) | ?- ti.Property satisfied. Execution time 0.0 | ?- ls.s(0, p(s_s,A,_), {A>=0.6}, 1, (0,0)).

François Fages CPCV, March 2004

Gene interaction (continued)

| ?- prop(P,S).P = unsafe, S = p:s*(x>=0.2) ?| ?- ti. Property NOT satisfied. Execution time 1.5| ?- ls.s(0, p(s_s,A,_), {A>=0.2}, 1, (0,0)).s(1, p(s_s,A,B), {B<0.8,B>=-0.0,A>=0.19387755102040816}, 2,

(2,1)).…s(26, p(s_s,A,B), {B>=0.0,A>=0.0, B+0.1982676351105516*A<0.7741338175552753}, 27,

(2,26)). s(27, init, {}, 28, (1,27)).

François Fages CPCV, March 2004

Conclusion and Perspectives

• The biochemical abstract machine BIOCHAM provides:

a first-order-rule-based language for modeling biochemical systems

a powerful query language based on temporal logic CTL

• Implementation in Prolog + model-checker NuSMV + Constraint-based model checker DMC for Ordinary Differential Equations (Euler method)

• models of metabolic and signaling pathways, cell-cycle control,…

• Combination of boolean models with ODE models

• Proof of concept, issue of scaling-up: efficient constraints, abstractions

STREP APrIL 2: learning of reaction weights and rules. http://www.rewerse.net

EU 6th PCRD NoE REWERSE semantic web for bioinformatics http://www.rewerse.net