Optimal design of experiments in systems biology. Michał Komorowski, together with Juliane Liepe, Sarah Filippi and Michael Stumpf. Institute of Fundamental Technological Research, Polish Academy of Sciences, and Imperial College London. 06 Feb 2013.

Optimal design of experiments

Page 1: Optimal design of experiments

Optimal design of experiments in systems biology

Michał Komorowski

together with Juliane Liepe, Sarah Filippi and Michael Stumpf

Institute of Fundamental Technological Research, Polish Academy of Sciences, and Imperial College London

06 Feb 2013

Michał Komorowski Experimental design 06 Feb 2013 1 / 7

Page 8: Optimal design of experiments

A good experiment

Experiment 1 Experiment 2 Experiment 3

Information flow

Michał Komorowski Experimental design Motivation 06 Feb 2013 2 / 7

Page 14: Optimal design of experiments

A good experiment - Fisher Information

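$$\mathrm{FI}(\theta) = E_\theta\left[\left(\frac{\partial \log \psi(X, \theta)}{\partial \theta}\right)^2\right]$$

where $\psi(X, \theta)$ denotes the probability density of the observations $X$ given the parameters $\theta$.

How does the behaviour change when we perturb the parameters?

$$(\theta \to \theta + d\theta) \longrightarrow \big(x(\theta) \to x(\theta + d\theta)\big)$$

$$d\theta \longrightarrow x(\theta + d\theta) - x(\theta) \approx \frac{\partial x}{\partial \theta_1}\, d\theta_1 + \frac{\partial x}{\partial \theta_2}\, d\theta_2$$

$$\sqrt{d\theta_1^2 + d\theta_2^2} \longrightarrow \sqrt{\left(\frac{\partial x}{\partial \theta_1}\, d\theta_1\right)^2 + \left(\frac{\partial x}{\partial \theta_2}\, d\theta_2\right)^2 + 2\, \frac{\partial x}{\partial \theta_1} \frac{\partial x}{\partial \theta_2}\, d\theta_1\, d\theta_2}$$

The expression under the square root is the quadratic form

$$\begin{pmatrix} d\theta_1 & d\theta_2 \end{pmatrix} \underbrace{\begin{pmatrix} \left(\dfrac{\partial x}{\partial \theta_1}\right)^2 & \dfrac{\partial x}{\partial \theta_1} \dfrac{\partial x}{\partial \theta_2} \\[2ex] \dfrac{\partial x}{\partial \theta_1} \dfrac{\partial x}{\partial \theta_2} & \left(\dfrac{\partial x}{\partial \theta_2}\right)^2 \end{pmatrix}}_{\mathrm{FI}(\theta)} \begin{pmatrix} d\theta_1 \\ d\theta_2 \end{pmatrix}$$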

Komorowski et al., PNAS (2011).
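
The sensitivity picture on this slide can be checked numerically. The sketch below is not from the talk: it assumes a hypothetical deterministic toy model x(t) = θ1·exp(−θ2·t), observed at a few time points with independent unit-variance Gaussian noise. Under that assumption the Fisher information reduces to SᵀS, where S is the matrix of sensitivities ∂x/∂θ. The function names `model`, `sensitivities` and `fisher_information` are illustrative only.

```python
import numpy as np

def model(theta, t):
    """Hypothetical toy model (an assumption): x(t) = theta1 * exp(-theta2 * t)."""
    return theta[0] * np.exp(-theta[1] * t)

def sensitivities(theta, t, eps=1e-6):
    """Central finite-difference estimate of dx/dtheta, shape (len(t), len(theta))."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((model(theta + step, t) - model(theta - step, t)) / (2 * eps))
    return np.column_stack(cols)

def fisher_information(theta, t, sigma=1.0):
    """FI = S^T S / sigma^2, valid for independent Gaussian noise with known sd sigma."""
    S = sensitivities(theta, t)
    return S.T @ S / sigma**2

theta = np.array([2.0, 0.5])
t = np.linspace(0.0, 4.0, 5)
FI = fisher_information(theta, t)

# The quadratic form d_theta^T FI d_theta should approximate the squared
# change in the model output caused by a small parameter perturbation.
d_theta = np.array([1e-3, -2e-3])
dx = model(theta + d_theta, t) - model(theta, t)
quad_form = d_theta @ FI @ d_theta
squared_change = dx @ dx
print(FI)
print(quad_form, squared_change)
```

The two printed numbers agree to first order in the perturbation, which is the sense in which FI(θ) measures how strongly the behaviour responds to a parameter change.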

Michał Komorowski Experimental design Fisher Information 06 Feb 2013 3 / 7

Page 15: Optimal design of experiments

Information in Single Cell Measurements

[Figure: time-series data and time-point data, each sampled at times t0, t1, t2, t3, t4.]

Komorowski et al., PNAS (2011).

Michał Komorowski Experimental design Fisher Information 06 Feb 2013 4 / 7

Page 16: Optimal design of experiments

Fluorescent microscopy vs flow cytometry for the p53 system

Fluorescent microscopy or flow cytometry?

Komorowski et al., PNAS (2011).

Michał Komorowski Experimental design Fisher Information 06 Feb 2013 5 / 7

Page 18: Optimal design of experiments

A good experiment - Shannon Information

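Prior $P(\theta)$, Experiment $P(X|\theta)$, Posterior $P(\theta|X)$

Initial uncertainty

$$H(\theta) = -\int \log_2\big(p(\theta)\big)\, p(\theta)\, d\theta$$

Posterior uncertainty

$$H_X(\theta|X) = -\int \log_2\big(p(\theta|X)\big)\, p(\theta|X)\, d\theta$$

Average posterior uncertainty

$$H(\theta|X) = \int H_X(\theta|X)\, p(X)\, dX$$

Mutual information

$$I(X, \theta) = H(\theta) - H(\theta|X)$$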
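
To see how these definitions rank experiments, here is a minimal sketch under strong assumptions that are not part of the talk: a scalar parameter with a Gaussian prior, and hypothetical experiments of the form X = gain·θ + Gaussian noise. For this model the posterior is Gaussian with a spread that does not depend on X, so the average over p(X) is trivial. The entropies are evaluated by numerical integration of the formulas above, and the experiment names and settings are made up.

```python
import numpy as np

def entropy_bits(pdf, grid):
    """Approximate H = -integral of log2(p) * p on a uniform grid (Riemann sum)."""
    p = pdf(grid)
    safe_p = np.where(p > 0, p, 1.0)  # avoids log2(0); those points contribute 0
    return float(np.sum(-p * np.log2(safe_p)) * (grid[1] - grid[0]))

def gaussian_pdf(mean, sd):
    return lambda x: np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def mutual_information_bits(prior_sd, gain, noise_sd, grid):
    """I(X, theta) = H(theta) - H(theta|X) for the assumed linear-Gaussian experiment."""
    post_sd = 1.0 / np.sqrt(1.0 / prior_sd**2 + gain**2 / noise_sd**2)
    h_prior = entropy_bits(gaussian_pdf(0.0, prior_sd), grid)
    # The posterior sd is the same for every X and entropy ignores the mean,
    # so H_X(theta|X) is constant and equals its average over p(X).
    h_post = entropy_bits(gaussian_pdf(0.0, post_sd), grid)
    return h_prior - h_post

grid = np.linspace(-20.0, 20.0, 200001)
# Hypothetical experiments: name -> (gain, noise sd)
experiments = {
    "experiment 1": (0.5, 1.0),
    "experiment 2": (2.0, 1.0),
    "experiment 3": (2.0, 4.0),
}
mi = {name: mutual_information_bits(1.0, gain, noise, grid)
      for name, (gain, noise) in experiments.items()}
best = max(mi, key=mi.get)
print(mi, best)
```

For realistic models the posterior depends on X and the integrals have no closed form, so the average posterior uncertainty has to be estimated by simulation; the sketch only illustrates the bookkeeping.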

Michał Komorowski Experimental design Mutual Information 06 Feb 2013 6 / 7

Page 26: Optimal design of experiments

Repressilator - selecting best available experiment

Figure 2: The repressilator model and the set of possible experiments. (A) Illustration of the wild-type repressilator model. The model consists of 3 mRNA species and their corresponding proteins (shown in the same colour). (B) The ordinary differential equations that describe the evolution of the concentration of the mRNAs and proteins over time. (C) Four potential modifications of the wild-type model. For each experimental intervention the modified parameters are listed (colours are as in A). The modifications of the wild-type model consist of decreasing one or several of the parameters of the system: in sets 2, 3 and 5, the regime of the parameter � is changed; in sets 3, 4 and 5, respectively, parameters �0, � and � are modified for only one gene, which breaks the symmetry of the system.

... data we perform parameter inference using an approximate Bayesian computation approach [29] for each experiment separately and compare the posterior distributions shown in Figure 3. We observe that using the data generated from set 1 (wild type) only 2 parameters can be inferred: h and �0. By contrast, the data generated by set 2 and set 5 allow us to estimate all 4 parameters. In addition, for each experiment we compute the reduction of uncertainty from the prior to the posterior distribution. The results are consistent with the results using mutual information and confirm that we should choose experiment 2 or 5 for parameter inference. In practice not all molecular species may be experimentally accessible and it is therefore also of interest to decide which species carries most information about the parameters. We can estimate the mutual information between the parameter and each species independently, and, for example, for experimental set 5 we observe that mRNA m1 and m2 as well as protein p1 carry equally high information; see Figure S2.

Sometimes we are interested in estimating only some of the parameters, e.g. those that have a direct physiological meaning or are under experimental control. To investigate this aspect we consider the Hes1 transcription factor that plays a number of important roles, including in the cell differentiation and segmentation of vertebrate embryos. In 2002 oscillations in the Hes1 system were observed by [30] and such oscillations might be connected with formation of spatial patterns during development. The Hes1 oscillator can be modelled by a simple three-component ODE model [31] as shown in Figure 4 ...

Liepe et al., PLOS Comp Biol.
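
The excerpt mentions computing the reduction of uncertainty from the prior to the posterior for each experiment. A rough sketch of that comparison follows. It assumes a uniform prior on a hypothetical interval and uses synthetic Gaussian draws as stand-ins for posterior samples; it does not reproduce the approximate Bayesian computation output or the estimator used in the paper. The helper `histogram_entropy_bits` is an illustrative name.

```python
import numpy as np

def histogram_entropy_bits(samples, lower, upper, bins=100):
    """Histogram estimate of the differential entropy (bits) of samples on [lower, upper]."""
    counts, edges = np.histogram(samples, bins=bins, range=(lower, upper))
    width = edges[1] - edges[0]
    probs = counts[counts > 0] / counts.sum()
    # Each occupied bin contributes -p * log2(density), with density = p / width.
    return float(-np.sum(probs * np.log2(probs / width)))

rng = np.random.default_rng(0)
lower, upper = 0.0, 10.0  # hypothetical prior range

# Synthetic stand-ins for posterior samples from two experiments (not real data).
posterior_samples = {
    "narrow posterior": rng.normal(5.0, 0.5, size=20000),
    "wide posterior": rng.normal(5.0, 1.5, size=20000),
}

prior_entropy = np.log2(upper - lower)  # entropy of a uniform prior on [lower, upper]
reduction = {
    name: prior_entropy - histogram_entropy_bits(samples, lower, upper)
    for name, samples in posterior_samples.items()
}
print(reduction)
```

The experiment whose posterior is more concentrated shows the larger entropy reduction, which is the quantity compared against mutual information in the excerpt. A histogram estimator is only practical for one or two parameters at a time.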

Michał Komorowski Experimental design Mutual Information 06 Feb 2013 7 / 7

Page 29: Optimal design of experiments

Repressilator - selecting best available experiment

Figure 3: Experiment choice for parameter inference in the repressilator model. Top: The mutual information I(θ, X) between the parameters θ and the output of each set of experiments (in dark green), and the entropy difference between the prior distribution and the posterior distribution for data obtained from simulation of the system for each experiment (in light green). The error bars on the mutual information barplots show the variance of the mutual information estimates over 3 independent simulations. Bottom: For each set of experiments we show the histogram of the marginal posterior distribution of every parameter. The red line indicates the true parameter value.

... A. This model contains 4 parameters, k1, P0, �, and h, and 3 species: Hes1 mRNA, m, Hes1 nuclear protein, p1, and Hes1 cytosolic protein, p2. It is possible to measure either the mRNA (using real-time PCR) or the total cellular Hes1 protein concentration p1 + p2 (using Western blots). We investigate whether protein or mRNA measurements provide more information about the model parameters. Thus we estimate the mutual information between mRNA and parameters, and between protein and parameters. Figure 4 B shows that mRNA measurements carry more information about all of the parameters.

This can again be further substantiated by simulations shown in Figure S5. We perform parameter inference based on such simulated data and compute the difference between the entropy of the prior and that of the resulting posterior distribution. The results shown in Figure 4 C are consistent with the predictions based on mutual information: mRNA measurements carry more information for parameter inference. Interestingly, however, although the mutual information computation indicates that the protein measurements should contain more information about parameter k1 than about the other parameters, this is not confirmed by the difference in entropy result for this simulated data set. This divergence is due to the fact that the mutual information measures the amount of information contained on average over all the possible behaviours of the system whereas Figure 4 C represents the ...

Liepe et al., PLOS Comp Biol.

Michał Komorowski Experimental design Mutual Information 06 Feb 2013 7 / 7