François Fages MPRI Bio-info 2007 Formal Biology of the Cell Inferring Reaction Rules from Temporal Properties François Fages, Constraint Programming Group, INRIA Rocquencourt mailto:[email protected] http://contraintes.inria.fr/


François Fages MPRI Bio-info 2007

Formal Biology of the Cell

Inferring Reaction Rules from Temporal Properties

François Fages, Constraint Programming Group,

INRIA Rocquencourt mailto:[email protected] http://contraintes.inria.fr/


Overview of the Lectures

1. Introduction. Formal molecules and reactions in BIOCHAM.

2. Formal biological properties in temporal logic. Symbolic model-checking.

3. Continuous dynamics. Kinetic models.

4. Learning kinetic parameter values. Constraint-based model checking.

5. Abstract Interpretation for systems biology I: hierarchy of semantics

6. Abstract Interpretation for systems biology II: types

7. Locations, transport and intercellular signalling

8. Inferring reaction rules from temporal properties

9. …

10. Protein structure prediction in constraint logic programming


A Logical Paradigm for Systems Biology

Biological model = Transition System

Biological property = Temporal Logic Formula

Biological validation = Model-checking

Initial state: experimental conditions, wild-type/mutated organisms, …

Boolean semantics (propositional CTL)

reachable(P) = EF(P)    checkpoint(s2,s) = ¬E(¬s2 U s)

stable(s) = AG(s)    steady(s) = EG(s)

oscil(P) = EG(F ¬P ∧ F P)

Differential semantics (constraint LTL)

Reach a threshold concentration: F([M]>0.2); on the derivative: F(d([M])/dt>0.2)

Reach and stay above a threshold: FG([M]>0.2)

oscil(M,n) = F(d([M])/dt>0 & F(d([M])/dt<0 & … )) n times
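The constraint-LTL formulae above can be illustrated on a numerical trace. The following is a minimal sketch, not BIOCHAM's implementation: it assumes a finite trace (so F and G only range over the recorded time points), approximates the derivative by finite differences, and uses hypothetical helper names (`atom`, `make_trace`, ...).

```python
# Finite-trace evaluation of LTL formulae over concentration constraints.
# A formula is a function (trace, i) -> bool; a trace is a list of dicts.
# Assumption: the trace is finite, so F/G quantify over recorded points only.

def atom(pred):
    return lambda trace, i: pred(trace[i])

def And(p, q):
    return lambda trace, i: p(trace, i) and q(trace, i)

def F(p):  # "finally": p holds at some position k >= i
    return lambda trace, i: any(p(trace, k) for k in range(i, len(trace)))

def G(p):  # "globally": p holds at every position k >= i
    return lambda trace, i: all(p(trace, k) for k in range(i, len(trace)))

def oscil(n):
    """F(dM>0 & F(dM<0 & F(dM>0 & ...))) with n nested sign changes."""
    rising = atom(lambda s: s["dM"] > 0)
    falling = atom(lambda s: s["dM"] < 0)
    formula = lambda trace, i: True
    for depth in reversed(range(n)):
        sign = rising if depth % 2 == 0 else falling
        formula = F(And(sign, formula))
    return formula

def make_trace(values):
    # Derivative approximated by the forward difference (0 at the last point).
    diffs = [b - a for a, b in zip(values, values[1:])] + [0.0]
    return [{"M": m, "dM": d} for m, d in zip(values, diffs)]

trace = make_trace([0.0, 0.1, 0.3, 0.25, 0.1, 0.25, 0.4])
above = atom(lambda s: s["M"] > 0.2)

reaches = F(above)(trace, 0)         # F([M] > 0.2)
stays = F(G(above))(trace, 0)        # FG([M] > 0.2)
three_phases = oscil(3)(trace, 0)    # rise, fall, rise
four_phases = oscil(4)(trace, 0)     # rise, fall, rise, fall
```

On this sample trace the concentration rises, falls and rises again, so `oscil(3)` holds while `oscil(4)` does not.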


Learning Model Revision from Temporal Properties

• Theory T: BIOCHAM model
  • molecule declarations
  • reaction rules: complexation, phosphorylation, …

• Examples φ: CTL specification of biological properties
  • Reachability
  • Checkpoints
  • Stable states
  • Oscillations

• Bias R: Rule pattern
  • Kind of rules to add or delete

Find a revision T’ of T such that T’ |= φ


Kripke Semantics of CTL*

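Kripke structure K=(S,R) where S is a set of states and R ⊆ S×S is total.

In the following, π^k denotes the suffix of the path π starting at its k-th state (π^0 = π).

s |= φ if the propositional formula φ is true in s,

s |= E ψ if there is a path π from s such that π |= ψ,

s |= A ψ if for every path π from s, π |= ψ,

π |= φ if s |= φ where s is the starting state of π,

π |= X ψ if π^1 |= ψ,

π |= ψ U ψ’ iff there exists k ≥ 0 such that π^k |= ψ’ and for all j < k, π^j |= ψ,

π |= ψ W ψ’ iff either for all j ≥ 0, π^j |= ψ, or there exists k ≥ 0 such that π^k |= ψ’ and for all j < k, π^j |= ψ,

F ψ = (true U ψ): π |= F ψ if there exists k ≥ 0 such that π^k |= ψ,

G ψ = (ψ W false): π |= G ψ if for every k ≥ 0, π^k |= ψ.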
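For the CTL fragment, these semantics can be evaluated on an explicit Kripke structure by fixpoint iteration over sets of states. The sketch below is an illustration under that assumption (explicit states, R given as a set of pairs, hypothetical function names), not the symbolic model checker used by BIOCHAM.

```python
# Explicit-state evaluation of EX, EU and EG on a Kripke structure K = (S, R).
# Formulae are represented by the set of states where they hold.

def pre_exists(R, target):
    """EX: states with at least one successor in target."""
    return {s for (s, t) in R if t in target}

def sat_EU(R, phi, psi):
    """Least fixpoint of Z = psi or (phi and EX Z): states satisfying E(phi U psi)."""
    result = set(psi)
    while True:
        new = result | (set(phi) & pre_exists(R, result))
        if new == result:
            return result
        result = new

def sat_EG(R, phi):
    """Greatest fixpoint of Z = phi and EX Z: states satisfying EG(phi)."""
    result = set(phi)
    while True:
        new = result & pre_exists(R, result)
        if new == result:
            return result
        result = new

def sat_EF(S, R, phi):
    """EF(phi) = E(true U phi)."""
    return sat_EU(R, set(S), phi)

# Small total transition relation: a cycle 0 -> 1 -> 2 -> 0 and a sink 3.
S = {0, 1, 2, 3}
R = {(0, 1), (1, 2), (2, 0), (1, 3), (3, 3)}
a = {2}   # states where proposition a holds
b = {1}   # states where proposition b holds

reach_a = sat_EF(S, R, a)                  # reachable(a) = EF(a)
steady_not_a = sat_EG(R, S - a)            # steady(!a) = EG(!a)
checkpoint_b_a = S - sat_EU(R, S - b, a)   # checkpoint(b,a) = !E(!b U a)
```

Each fixpoint needs at most |S| iterations, which is the intuition behind the polynomial-time bound for CTL model checking on explicit structures.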


Duality in CTL*

¬E φ = A ¬φ

¬X φ = X ¬φ

¬(φ U ψ) = ¬ψ W (¬φ ∧ ¬ψ)

¬F φ = G ¬φ

CTL*(X): fragment of CTL* without U, W, F, G

CTL*(U): fragment of CTL* without X

CTL: fragment of CTL* with E, A immediately before X, F, G, U, W. A CTL formula φ can be identified with the set of states where it is true: φ ~ {s ∈ S : s |= φ}

LTL: fragment of CTL* without E, A

LTL(U): fragment of LTL without X

LTL(F): fragment of LTL without X, U, W


Complexity of Model-checking and Satisfiability

Model-checking: given an explicit Kripke structure K and a formula φ, does K,s |= φ ?

Satisfiability: given a formula φ, does there exist a structure K,s such that K,s |= φ ?

               Model-checking      Satisfiability
LTL, LTL(U)    PSPACE-complete     PSPACE-complete
LTL(F)         NP-complete         NP-complete
CTL            PTIME               EXPTIME-complete
CTL*           PSPACE-complete     2-EXPTIME-complete


Simple Model of Cell Cycle Control

[Tyson et al. 91] model over 6 variables, initial state present(Cdc2).

_=>Cyclin.
Cyclin=>_.
Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.
Cyclin~{p1}=>_.
Cdc2=>Cdc2~{p1}.
Cdc2~{p1}=>Cdc2.
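To make the Boolean reading of these rules concrete, here is a hedged sketch of the induced transition system. It assumes that a rule fires when all its reactants are present, that each reactant may or may not be consumed, that terminal states loop on themselves, and that only Cdc2 is present initially; the parser and function names are hypothetical and cover only the rule shapes used above.

```python
from itertools import combinations

RULES_TEXT = """_=>Cyclin.
Cyclin=>_.
Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.
Cyclin~{p1}=>_.
Cdc2=>Cdc2~{p1}.
Cdc2~{p1}=>Cdc2."""

def parse(rule):
    """Turn 'L=>R.' or 'L=[C]=>R.' into (reactants, products) frozensets."""
    rule = rule.rstrip(".")
    if "=[" in rule:
        left, rest = rule.split("=[")
        catalysts, right = rest.split("]=>")
        cats = frozenset(catalysts.split("+"))
    else:
        left, right = rule.split("=>")
        cats = frozenset()
    side = lambda s: frozenset(m for m in s.split("+") if m != "_")
    # A catalyst is both a reactant and a product.
    return side(left) | cats, side(right) | cats

def successors(state, rules):
    out = set()
    for reactants, products in rules:
        if reactants <= state:
            optional = sorted(reactants - products)  # may or may not be consumed
            for r in range(len(optional) + 1):
                for consumed in combinations(optional, r):
                    out.add((state - frozenset(consumed)) | products)
    return out or {state}  # assumed self-loop on terminal states

def reachable_states(initial, rules):
    seen, todo = {initial}, [initial]
    while todo:
        state = todo.pop()
        for nxt in successors(state, rules):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

RULES = [parse(line) for line in RULES_TEXT.splitlines()]
states = reachable_states(frozenset({"Cdc2"}), RULES)
species_seen = set().union(*states)  # species present in some reachable state
```

A species X satisfies reachable(X) from this initial state exactly when it belongs to `species_seen`.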


(Aut. Generated) CTL Specification of the Model

biocham: add_genCTL.
reachable(Cyclin).
reachable(!(Cyclin)).
oscil(Cyclin).
reachable(Cdc2~{p1}).
reachable(!(Cdc2~{p1})).
checkpoint(Cdc2, Cdc2~{p1}).
oscil(Cdc2).
reachable(Cyclin~{p1}).
reachable(!(Cyclin~{p1})).
oscil(Cyclin~{p1}).
checkpoint(Cdc2-Cyclin~{p1}, Cyclin~{p1}).


Model Compression

biocham: reduce_model.
1: deleting Cyclin=>_
2: deleting Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}
3: deleting Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}
4: deleting Cdc2~{p1}=>Cdc2
After reduction, 6 rules remain corresponding to the bias ? => ?
Deletion(s):
Cyclin=>_.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2~{p1}=>Cdc2.


Theory Revision

biocham: delete_rules(Cdc2=>Cdc2~{p1}).
Cdc2=>Cdc2~{p1}

biocham: revise_model.
1: adding Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}
2: adding Cdc2=>_
2: backtracking on previous add -> deleting Cdc2=>_
2: adding Cdc2=[Cyclin]=>_
2: backtracking on previous add -> deleting Cdc2=[Cyclin]=>_
2: adding Cdc2=[Cdc2-Cdc2~{p1}]=>_
3: adding Cdc2=>Cdc2~{p1}
4: deleting Cdc2=[Cdc2-Cdc2~{p1}]=>_
5: deleting Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}

Modifications found:
Deletion(s):
Addition(s): Cdc2=>Cdc2~{p1}.


Search for all Solutions

biocham: learn_one_addition(elementary_interaction_rules).
Time: 5.00 s
Rules tested: 112
Good rules to be added: 2
Cdc2=>Cdc2~{p1}
Cdc2=[Cyclin]=>Cdc2~{p1}
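The search for all single additions is essentially generate-and-test: enumerate the candidate rules allowed by the bias, and keep those under which the specification holds. The sketch below is a deliberately simplified illustration, not BIOCHAM's procedure: the specification is restricted to reachability of species, reachability is computed as a closure in which reactants are never consumed, only candidates of the form A=>B are generated, and rules that cannot introduce new species (degradations, reverse rules) are omitted from the model. All names are hypothetical, and the candidate set is not the `elementary_interaction_rules` pattern of the transcript.

```python
from itertools import permutations

def rule(left, right):
    """Build (reactants, products) from '+'-separated species names."""
    split = lambda s: frozenset(m for m in s.split("+") if m)
    return split(left), split(right)

def closure(rules, initial):
    """Species that can appear, assuming reactants are never consumed."""
    present = set(initial)
    changed = True
    while changed:
        changed = False
        for reactants, products in rules:
            if reactants <= present and not products <= present:
                present |= products
                changed = True
    return present

def learn_one_addition(rules, candidates, initial, targets):
    """Return the candidates whose addition makes every target reachable."""
    return [r for r in candidates
            if targets <= closure(rules + [r], initial)]

SPECIES = ["Cyclin", "Cdc2", "Cdc2~{p1}", "Cdc2-Cyclin~{p1,p2}",
           "Cdc2-Cyclin~{p1}", "Cyclin~{p1}"]

# Cell cycle model with Cdc2=>Cdc2~{p1} deleted.
MODEL = [
    rule("", "Cyclin"),
    rule("Cyclin+Cdc2~{p1}", "Cdc2-Cyclin~{p1,p2}"),
    rule("Cdc2-Cyclin~{p1,p2}", "Cdc2-Cyclin~{p1}"),
    rule("Cdc2-Cyclin~{p1}", "Cyclin~{p1}+Cdc2"),
    rule("Cdc2~{p1}", "Cdc2"),
]

candidates = [rule(a, b) for a, b in permutations(SPECIES, 2)]
good = learn_one_addition(MODEL, candidates, {"Cdc2"}, set(SPECIES))
```

Under these simplifying assumptions, 30 candidates are tested and two restore reachability of all six species: `Cdc2=>Cdc2~{p1}` and `Cyclin=>Cdc2~{p1}`. A full CTL specification, as in the transcript, would prune the candidates differently.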


CTL Equivalence of Boolean Models

For a class C of CTL formulae, and an initial state s,

two Kripke structures K=(S,R), K’=(S,R’) are equivalent

K ~C K’ iff {φ ∈ C : K,s |= φ} = {φ ∈ C : K’,s |= φ}


Which model transformations preserve a class of CTL properties?

Model refinement or simplification preserving a CTL specification

Which model transformations can make a CTL property true?

Learning of rules to add or to delete to satisfy a CTL specification


CTL Equivalence for a Simple Enzymatic Reaction

Two Biocham models: M1 = {A+B<=>D, D=>A+C} ∪ M

M2 = {B =[A]=> C} ∪ M

D having no occurrence in M nor in the initial state s, φ is atomic.


Proposition If M2,s |= EF(φ) then M1,s |= EF(φ).

Proof In M2 the transitions A+B→A+C (resp. A+B→A+B+C) can be replaced in M1 by A+B→D→A+C (resp. A+B→B+D→B+A+C).


Proposition If M1,s |= EF(φ) then M2,s |= EF(φ) whenever A and B do not appear negatively (i.e. under an odd number of negations) in φ and D does not appear at all in φ.

Proof Let π be a path in M1 such that π^k |= φ. If the state reached at position k does not contain D, then one can easily mimic π with a path π’ in M2 that reaches the same state at some position k’ ≤ k. Otherwise, the last transition on D is either D→D+A+C, which can be replaced by D→A+C, or A+B→D, which can be erased. In both cases the path is mimicked in M2.


In the following, ψ is atomic as well.

Proposition If M2,s |= ¬E(¬φ U ψ) then M1,s |= ¬E(¬φ U ψ) whenever A and B do not appear negatively in ψ and D does not appear positively in ψ.


Proposition If M1,s |= ¬E(¬φ U ψ) then M2,s |= ¬E(¬φ U ψ) whenever A and B do not appear negatively in φ and D does not appear positively in φ.


Positive and Negative CTL Formulae

Let K = (S,R,L) and K’ = (S,R’,L) be two Kripke structures such that R ⊆ R’.

Def. An ECTL (positive) formula is a CTL formula with no occurrence of A (nor negative occurrence of E).

Ex.: reachability EF(φ), steady EG(φ)

Def. An ACTL (negative) formula is a CTL formula with no occurrence of E (nor negative occurrence of A).

Ex.: checkpoint ¬E(¬φ2 U φ), stable AG(φ)


Monotonicity of Positive ECTL Formulae

Let K = (S,R) and K’ = (S,R’) be two Kripke structures such that R ⊆ R’.

Proposition For any ECTL formula φ, if K’,s |≠ φ then K,s |≠ φ.

Proof We show that K,s |= φ implies K’,s |= φ by induction on the structure of φ.

If φ is propositional, s |= φ hence K’,s |= φ.

If φ = φ1 & φ2 (resp. φ1 | φ2) then by induction K’,s |= φ1 and (resp. or) K’,s |= φ2.

If φ = EX φ1 then K,π |= X φ1 for some path π in K, hence in K’, so K,π^1 |= φ1 and by induction K’,π^1 |= φ1, hence K’,π |= X φ1.

If φ = E(φ1 U φ2) then K,π |= φ1 U φ2 for some path π in K, hence in K’, so there exists k such that K,π^k |= φ2 and for all j<k, K,π^j |= φ1. By induction K’,π^k |= φ2 and for all j<k, K’,π^j |= φ1, hence K’,π |= φ1 U φ2.

The case φ = E(φ1 W φ2), which covers EG, is similar.


Anti-monotonicity of Negative ACTL Formulae

Let K = (S,R) and K’ = (S,R’) be two Kripke structures such that R ⊆ R’.

Proposition For any ACTL formula φ, if K,s |≠ φ then K’,s |≠ φ.

Proof If K,s |≠ φ then K,s |= ¬φ where ¬φ is an ECTL formula.

By the previous proposition, K’,s |= ¬φ hence K’,s |≠ φ.
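The monotonicity and anti-monotonicity propositions can be observed on a small example. The sketch below assumes explicit Kripke structures over the same states with R ⊆ R’, and uses hypothetical helpers `ef` and `ag`; it only checks one instance and proves nothing in general.

```python
def ef(R, target):
    """States satisfying EF(target): backward reachability to target."""
    result = set(target)
    while True:
        new = result | {s for (s, t) in R if t in result}
        if new == result:
            return result
        result = new

def ag(S, R, good):
    """States satisfying AG(good) = not EF(not good)."""
    return S - ef(R, S - good)

S = {0, 1, 2}
R = {(0, 1), (1, 1), (2, 2)}      # K
R2 = R | {(1, 2)}                 # K' has one more transition
p = {2}

ef_K, ef_K2 = ef(R, p), ef(R2, p)                    # ECTL: EF(p)
ag_K, ag_K2 = ag(S, R, S - p), ag(S, R2, S - p)      # ACTL: AG(!p)
```

Adding the transition 1→2 enlarges the set of states satisfying EF(p) and shrinks the set satisfying AG(¬p), as the two propositions predict.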


Theory Revision Algorithm

General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.

Anticipate whether one has to add or remove a rule?

• Positive ECTL formula: if false, remains false after removing a rule
  • Reachability, steady states
  • Need to add rules

• Negative ACTL formula: if false, remains false after adding a rule
  • Checkpoints
  • Need to remove a rule on the path given by the model checker

• Unclassified CTL formulae
  • oscil(a) = AG((a ⇒ EF ¬a) ∧ (¬a ⇒ EF a))
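The classification used to anticipate additions or deletions can be sketched as a polarity analysis of the formula. The code below is an illustration under stated assumptions: formulae are nested tuples with a hypothetical encoding (`"EF"`, `"AG"`, `"EU"`, `"not"`, `"imp"`, ...), and a path quantifier under an odd number of negations is counted as its dual, which is how the checkpoint ¬E(¬φ2 U φ) ends up on the ACTL side.

```python
DUAL = {"E": "A", "A": "E"}

def quantifiers(f, positive=True):
    """Set of effective path quantifiers (after pushing negations inward)."""
    op = f[0]
    if op == "atom":
        return set()
    if op == "not":
        return quantifiers(f[1], not positive)
    if op == "imp":  # f1 => f2 is (not f1) or f2
        return quantifiers(f[1], not positive) | quantifiers(f[2], positive)
    if op in ("and", "or"):
        return quantifiers(f[1], positive) | quantifiers(f[2], positive)
    # Temporal operator such as "EF", "AG", "EU": first letter is the quantifier.
    found = {op[0] if positive else DUAL[op[0]]}
    for sub in f[1:]:
        found |= quantifiers(sub, positive)
    return found

def classify(f):
    found = quantifiers(f)
    if not found:
        return "propositional"
    if found == {"E"}:
        return "ECTL"
    if found == {"A"}:
        return "ACTL"
    return "unclassified"

a, b = ("atom", "a"), ("atom", "b")
reachable = ("EF", a)
steady = ("EG", a)
stable = ("AG", a)
checkpoint = ("not", ("EU", ("not", b), a))
oscil = ("AG", ("and", ("imp", a, ("EF", ("not", a))),
                       ("imp", ("not", a), ("EF", a))))

classes = [classify(f) for f in (reachable, steady, stable, checkpoint, oscil)]
```

A revision procedure could then try rule additions first for false ECTL formulae and rule deletions for false ACTL formulae, and fall back to both for unclassified ones.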


Theory Revision Algorithm Rules

Configurations have the form <satisfied, untreated, rules>. Below, (Es,Us,As) denotes the sets of satisfied ECTL, unclassified and ACTL formulae, and (Eu,Uu,Au) the sets of untreated ones.

Initial state: <(∅,∅,∅), (E,U,A), R>

E transition: <(Es,Us,As), (Eu∪{e},Uu,Au), R> → <(Es∪{e},Us,As), (Eu,Uu,Au), R>
if R |= e

E’ transition: <(Es,Us,As), (Eu∪{e},Uu,Au), R> → <(Es∪{e},Us,As), (Eu,Uu,Au), R∪{r}>
if R |≠ e and ∀f ∈ {e} ∪ Es ∪ Us ∪ As, R∪{r} |= f

U transition: <(Es,Us,As), (∅,Uu∪{u},Au), R> → <(Es,Us∪{u},As), (∅,Uu,Au), R>
if R |= u

U’ transition: <(Es,Us,As), (∅,Uu∪{u},Au), R> → <(Es,Us∪{u},As), (∅,Uu,Au), R∪{r}>
if R |≠ u and ∀f ∈ {u} ∪ Es ∪ Us ∪ As, R∪{r} |= f

U” transition: <(Es,Us,As), (∅,Uu∪{u},Au), R∪Re> → <(Es,Us∪{u},As), (∅,Uu,Au), R>
if R∪Re |≠ u and ∀f ∈ {u} ∪ Es ∪ Us ∪ As, R |= f

A transition: <(Es,Us,As), (∅,∅,Au∪{a}), R> → <(Es,Us,As∪{a}), (∅,∅,Au), R>
if R |= a

A’ transition: <(Es∪Ep,Us∪Up,As), (∅,∅,Au∪{a}), R∪Re> → <(Es,Us,As∪{a}), (Ep,Up,Au), R>
if R∪Re |≠ a, ∀f ∈ {a} ∪ Es ∪ Us ∪ As, R |= f, and Ep ∪ Up is the set of formulae no longer satisfied after the deletion of the rules in Re.


Termination

Proposition The model revision algorithm terminates.

Proof

The termination of the algorithm is proved by considering the lexicographic ordering over the pair <a, n>, where a is the number of unsatisfied ACTL formulae and n is the number of unsatisfied ECTL and UCTL formulae.

Each transition either strictly decreases a, or leaves a unchanged and strictly decreases n.


Correctness

Proposition If the terminal configuration is of the form <(E,U,A), (∅,∅,∅), R> then the model R satisfies the initial CTL specification.

Proof

Each transition maintains only true formulae in the satisfied set, and preserves the complete CTL specification in the union of the satisfied set and the untreated set.


Incompleteness

Two reasons:

1) The satisfaction of ECTL and UCTL formulae is searched for by adding only one rule to the model (transitions E’ and U’).

2) The Kripke structure associated with a Biocham set of rules adds loops on terminal states. Hence adding or removing a rule may have the opposite effect of deleting or adding such loops.


Optimisations

Restrict the search space for rules to add by:

• Considering type information on molecular species
  • Kinase(A): B=[A]=>B~{p}. for any B
  • Phosphatase(A): B~{p}=[A]=>B. for any B
  • Kinase(A,B)
  • Phosphatase(A,B)

• Considering the influence graph between molecular species
  • Activates(A,B): _=[A]=>B. A+B’=>B. B~{p}=[A]=>B. B’=[A]=>B.
  • Inhibits(A,B): B=[A]=>_. A+B=>A-B. B=[A]=>B~{p}. B=[A]=>B’.

• Considering the topology of locations
  • Neighbor(L,L’): A:L+…=>B:L’+…
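As an illustration of how type information narrows the search, the following sketch generates candidate rules from Kinase(A) and Phosphatase(A) declarations and compares their number with an untyped enumeration. The function names are hypothetical and the rule strings simply follow the patterns listed above.

```python
def kinase_rules(a, species):
    """Kinase(A): candidate rules B=[A]=>B~{p}. for any B."""
    return [f"{b}=[{a}]=>{b}~{{p}}." for b in species]

def phosphatase_rules(a, species):
    """Phosphatase(A): candidate rules B~{p}=[A]=>B. for any B."""
    return [f"{b}~{{p}}=[{a}]=>{b}." for b in species]

def untyped_phosphorylations(species):
    """Without type information, any species may catalyse any phosphorylation."""
    return [rule for a in species for rule in kinase_rules(a, species)]

species = ["A", "B", "C"]
typed = kinase_rules("A", species)
untyped = untyped_phosphorylations(species)
```

With n species, the typed enumeration for a single known kinase has n candidates instead of n², and the binary forms Kinase(A,B) and Phosphatase(A,B) would reduce it to a single rule.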