52
Graphs and polytopes: learning Bayesian networks with LP relaxations Tommi Jaakkola MIT CSAIL based on joint work with David Sontag, MIT Amir Globerson, HUJI Marina Meila, U Washington Wednesday, March 10, 2010

Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Graphs and polytopes: learning Bayesian networks with LP

relaxations

Tommi JaakkolaMIT CSAIL

based on joint work withDavid Sontag, MIT

Amir Globerson, HUJIMarina Meila, U Washington

Wednesday, March 10, 2010

Page 2: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Bayesian networks are widely used as modeling tools across applied areas, including computational biology

• Learned structures (especially causal) can lead to useful insights about the domain

Learning Bayesian networks

sample in this data set consists of quantita-tive amounts of each of the 11 phosphorylatedmolecules, simultaneously measured from sin-gle cells [data sets are downloadable (8)]. Forpurposes of illustration, examples of actualfluorescence-activated cell sorter (FACS) dataplotted in prospective corelationship form areshown in fig. S1. In most cases, this reflects theactivation state of the kinases monitored, or inthe cases of PIP3 and PIP2, the levels of thesesecondary messenger molecules in primary cells,under the condition measured. Nine stimula-tory or inhibitory interventional conditions wereused (Table 1) (8). The complete data sets wereanalyzed with the Bayesian network structureinference algorithm (6, 9, 17).

A high-accuracy human primary T cellsignaling causality map. The resulting denovo causal network model was inferred(Fig. 3A) with 17 high-confidence causal arcsbetween various components. To evaluate thevalidity of this model, we compared the mod-el arcs (and absent potential arcs) with thosedescribed in the literature. Arcs were catego-rized as the following: (i) expected, for con-nections well-established in the literature thathave been demonstrated under numerous con-ditions in multiple model systems; (ii) reported,for connections that are not well known, butfor which we were able to find at least oneliterature citation; and (iii) missing, which indi-cates an expected connection that our Bayesiannetwork analysis failed to find. Of the 17 arcsin our model, 15 were expected, all 17 wereeither expected or reported, and 3 were missed(Fig. 3A and table S1) (8, 18–22). Table 3enumerates the probable paths of influencecorresponding to model arcs determined bysurveying published reports.

Several of the known connections fromour model are direct enzyme-substrate rela-tionships (Fig. 3B) (PKA to Raf, Raf to Mek,Mek to Erk, and Plc-g to PIP2), and one has arelationship of recruitment leading to phos-phorylation (Plc-g to PIP3). In almost all cases,

the direction of causal influence was correctlyinferred (an exception was Plc-g to PIP3, inwhich case the arc was inferred in the reversedirection). All the influences are containedwithin one global model; thus, the causal di-rection of arcs is often compelled so that theseare consistent with other components in themodel. These global constraints allowed de-tection of certain causal influences from mole-cules that were not perturbed in our assay.For instance, although Raf was not perturbedin any of the measured conditions, the meth-od correctly inferred a directed arc from Rafto Mek, which was expected for the well-characterized Raf-Mek-Erk signal transduc-tion pathway. In some cases, the influence ofone molecule on another was mediated by in-termediate molecules that were not measuredin the data set. In the results, these indirectconnections were detected as well (Fig. 3B,panel b). For example, the influence of PKAand PKC on the MAPKs p38 and Jnk likelyproceeded via their respective (unmeasured)MAPK kinase kinases. Thus, unlike someother approaches used to elucidate signalingnetworks [for example, protein-protein inter-action maps (23, 24)] that provide static bio-chemical association maps with no causallinks, our Bayesian network method can de-tect both direct and indirect causal connectionsand therefore provide a more contextual pic-ture of the signaling network.

Another feature demonstrated in our mod-el is the ability to dismiss connections thatare already explained by other network arcs(Fig. 3B, panel c). This is seen in the Raf-Mek-Erk cascade. Erk, also known as p44/42,is downstream of Raf and therefore dependenton Raf, yet no arc appears from Raf to Erk,because the connection from Raf to Mek andthe connection from Mek to Erk explain thedependence of Erk on Raf. Thus, an indirectarc should appear only when one or moreintermediate molecules is not present in thedata set, otherwise the connection will proceed

via this molecule. The intervening moleculemay also be a shared parent. For example,the phosphorylation statuses of p38 and Jnkare correlated (fig. S2), yet they are not di-rectly connected, because their shared parents(PKC and PKA) mediate the dependence be-tween them. Although we cannot know wheth-er an arc in our model represents a direct orindirect influence, it is unlikely that our modelcontains an indirect arc that is mediated byany molecule observed in our measurements.Correlation exists between most moleculepairs in this data set [per Bonferroni correctedP value (fig. S2)], which can occur with close-ly connected pathways. Therefore, the relativelack of arcs in our model (Fig. 3A) contrib-uted greatly to the accuracy and interpret-ability of the inferred model.

A more complex example is the influenceof PKC on Mek, which is known to be me-diated by Raf (Fig. 3B, panel d). PKC isknown to affect Mek through two paths ofinfluence, each mediated by a different ac-tive phosphorylated form of the protein Raf.Although PKC phosphorylates Raf directlyat S499 and S497, this event is not detectedby our measurements, because we use onlyan antibody specific to Raf phosphorylationat S259 (Table 2) (16). Therefore, our algo-rithm detects an indirect arc from PKC toMek that is mediated by the presumed un-measured intermediate Raf phosphorylatedat S497 and S499 (18). The PKC-to-Raf arcrepresents an indirect influence that proceedsvia an unmeasured molecule, presumed to beRas (19, 20). We discussed above the abilityof our approach to dismiss redundant arcs. Inthis case, there are two paths leading fromPKC to Mek, because each path correspondsto a separate means of influence from PKCto Mek: one via Raf phosphorylated at S259and the other through Raf phosphorylated atS497 and S499. Thus, neither path is redun-dant. This result demonstrates the distinctionthat this analysis is sensitive to specific phos-

Fig. 3. Bayesian network inferenceresults. (A) Network inferred fromflow cytometry data represents ex-pected outcomes. This network rep-resents a model average from 500high-scoring results. High-confidencearcs, appearing in at least 85% ofthe networks, are shown. For clarity,the names of the molecules are usedto represent the measured phospho-rylation sites (Table 2). (B) Inferrednetwork demonstrates several fea-tures of Bayesian networks. (a) Arcsin the network may correspond todirect events or (b) indirect influ-ences. (c) When intermediate mol-ecules are measured in the dataset, indirect influences rarely appear as an additional arc. No additionalarc is added between Raf and Erk because the dependence between Rafand Erk is dismissed by the connection between Raf and Mek, and be-tween Mek and Erk (for instance, see Fig. 1C). (d) Connections in themodel contain phosphorylation site–specificity information. Because Raf

phosphorylation on S497 and S499 was not measured in our data set,the connection between PKC and the measured Raf phosphorylation site(S259) is indirect, likely proceeding via Ras. The connection between PKCand the undetected Raf phosphorylation on S497 and S499 is seen as anarc between PKC and Mek.

R E S E A R C H A R T I C L E S

22 APRIL 2005 VOL 308 SCIENCE www.sciencemag.org526

on

Ap

ril 1

3,

20

09

w

ww

.scie

nce

ma

g.o

rgD

ow

nlo

ad

ed

fro

m

(Sachs et al. 2005)

themselves transcriptionally regulated. Theseregulation mechanisms often involve feed-forward loops and feedback mechanisms(18, 31) that change the mRNA expression

level of regulators when their protein activ-ity changes. As a consequence, we candetect coordinated changes in the expres-sion levels of regulators and their targets.

This hypothesis is supported by an analysisof the discovered regulatory relations againsta database of protein-DNA and protein-protein interactions (26).

Fig. 3. Different regulatory network architectures. (A) An uncon-strained acyclic network where each gene can have a different regu-lator set. This is a fragment of a network learned in the experimentsof Pe’er et al. (24 ). (B) A summary of direct neighbor relations amongthe genes shown in (A) based on bootstrap estimates. Degrees ofconfidence are denoted by edge thickness. We automatically identifya subnetwork of genes, with high-confidence relations among them,that are involved in the yeast-mating pathways. The colors highlightgenes with known function in mating, including signal transduction(yellow), transcription factors (blue), and downstream effectors(green). (C) A fragment of a two-level network described by Pe’er etal. (25). The top level contains a small number of regulators; the

bottom level contains all other genes (targets). Each gene has differ-ent regulators from among the regulator genes. (D) Visualization ofsignificant Gene Ontology (42) annotations of the targets of differentregulators. Each significant annotation for the targets of a regulator(or pairs of regulators) is shown with the hypergeometric p-value. (E)A fragment of the module network described by Segal et al. (26). Eachmodule contains several genes that share the same set of regulators andshare the same conditional regulation program given these regulators. (F)Visualization of the expression levels of the 55 genes in Module 1 (b) andtheir regulators (a). Significant Gene Ontology annotations (c) and cis-regulatory motifs in promoter regions of genes in the module (d) are shown.[See figure 3 of (26); reproduced with permission]

M A T H E M A T I C S I N B I O L O G Y

6 FEBRUARY 2004 VOL 303 SCIENCE www.sciencemag.org804

SPECIALSECTION

on

Fe

bru

ary

27

, 2

00

8

ww

w.s

cie

nce

ma

g.o

rgD

ow

nlo

ad

ed

fro

m

(Friedman 2004)

MATING_TYPE

FUS3

STE50FUS1

TUP1

STE7

MFA1

STE2 AGA2

STE20

GPA1

BAR1FAR1

SAG1STE18

STE6

STE3

SST2

MFA2

AGA1 SIN3STE4

SNF2

MFALPHA1

SWI1TEC1 KSS1MCM1

STE5

MFALPHA2

KAR3

STE12

STE11

MATING_TYPE

FUS3

STE50

TUP1

STE7

MFA1

STE2

AGA2

GPA1

STE12

BAR1

SAG1

STE18

STE6

STE3

SST2

MFA2

AGA1

SIN3

SNF2 MFALPHA1

SWI1

TEC1

FUS1 KSS1 STE20 MCM1

STE5

MFALPHA2

KAR3

STE4

FAR1

STE11

Figure 2. Bayesian network models learned by model averaging over the 500 highest scoringmodels visited during the unconstrained and constrained simulated annealing search runs,respectively. Edges are included in the figure if and only if their posterior probabilityexceeds 0.5. Node and edge color descriptions are included in the text.

(Hartemink et al. 2002)

...

Wednesday, March 10, 2010

Page 3: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning: basics

themselves transcriptionally regulated. Theseregulation mechanisms often involve feed-forward loops and feedback mechanisms(18, 31) that change the mRNA expression

level of regulators when their protein activ-ity changes. As a consequence, we candetect coordinated changes in the expres-sion levels of regulators and their targets.

This hypothesis is supported by an analysisof the discovered regulatory relations againsta database of protein-DNA and protein-protein interactions (26).

Fig. 3. Different regulatory network architectures. (A) An uncon-strained acyclic network where each gene can have a different regu-lator set. This is a fragment of a network learned in the experimentsof Pe’er et al. (24 ). (B) A summary of direct neighbor relations amongthe genes shown in (A) based on bootstrap estimates. Degrees ofconfidence are denoted by edge thickness. We automatically identifya subnetwork of genes, with high-confidence relations among them,that are involved in the yeast-mating pathways. The colors highlightgenes with known function in mating, including signal transduction(yellow), transcription factors (blue), and downstream effectors(green). (C) A fragment of a two-level network described by Pe’er etal. (25). The top level contains a small number of regulators; the

bottom level contains all other genes (targets). Each gene has differ-ent regulators from among the regulator genes. (D) Visualization ofsignificant Gene Ontology (42) annotations of the targets of differentregulators. Each significant annotation for the targets of a regulator(or pairs of regulators) is shown with the hypergeometric p-value. (E)A fragment of the module network described by Segal et al. (26). Eachmodule contains several genes that share the same set of regulators andshare the same conditional regulation program given these regulators. (F)Visualization of the expression levels of the 55 genes in Module 1 (b) andtheir regulators (a). Significant Gene Ontology annotations (c) and cis-regulatory motifs in promoter regions of genes in the module (d) are shown.[See figure 3 of (26); reproduced with permission]

M A T H E M A T I C S I N B I O L O G Y

6 FEBRUARY 2004 VOL 303 SCIENCE www.sciencemag.org804

SPECIALSECTION

on

Fe

bru

ary

27

, 2

00

8

ww

w.s

cie

nce

ma

g.o

rgD

ow

nlo

ad

ed

fro

m

complete data

highest scoringacyclic graph

Wednesday, March 10, 2010

Page 4: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning: basics

themselves transcriptionally regulated. Theseregulation mechanisms often involve feed-forward loops and feedback mechanisms(18, 31) that change the mRNA expression

level of regulators when their protein activ-ity changes. As a consequence, we candetect coordinated changes in the expres-sion levels of regulators and their targets.

This hypothesis is supported by an analysisof the discovered regulatory relations againsta database of protein-DNA and protein-protein interactions (26).

Fig. 3. Different regulatory network architectures. (A) An uncon-strained acyclic network where each gene can have a different regu-lator set. This is a fragment of a network learned in the experimentsof Pe’er et al. (24 ). (B) A summary of direct neighbor relations amongthe genes shown in (A) based on bootstrap estimates. Degrees ofconfidence are denoted by edge thickness. We automatically identifya subnetwork of genes, with high-confidence relations among them,that are involved in the yeast-mating pathways. The colors highlightgenes with known function in mating, including signal transduction(yellow), transcription factors (blue), and downstream effectors(green). (C) A fragment of a two-level network described by Pe’er etal. (25). The top level contains a small number of regulators; the

bottom level contains all other genes (targets). Each gene has differ-ent regulators from among the regulator genes. (D) Visualization ofsignificant Gene Ontology (42) annotations of the targets of differentregulators. Each significant annotation for the targets of a regulator(or pairs of regulators) is shown with the hypergeometric p-value. (E)A fragment of the module network described by Segal et al. (26). Eachmodule contains several genes that share the same set of regulators andshare the same conditional regulation program given these regulators. (F)Visualization of the expression levels of the 55 genes in Module 1 (b) andtheir regulators (a). Significant Gene Ontology annotations (c) and cis-regulatory motifs in promoter regions of genes in the module (d) are shown.[See figure 3 of (26); reproduced with permission]

M A T H E M A T I C S I N B I O L O G Y

6 FEBRUARY 2004 VOL 303 SCIENCE www.sciencemag.org804

SPECIALSECTION

on

Fe

bru

ary

27

, 2

00

8

ww

w.s

cie

nce

ma

g.o

rgD

ow

nlo

ad

ed

fro

m

complete data

decomposable scoring function

highest scoringacyclic graph

Wednesday, March 10, 2010

Page 5: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning as inference

1

2 3

Wednesday, March 10, 2010

Page 6: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning as inference

1

2 3

Wednesday, March 10, 2010

Page 7: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning as inference

1

2 3

Wednesday, March 10, 2010

Page 8: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Structure learning as inference

1

2 3

Wednesday, March 10, 2010

Page 9: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

1

2 3

Structure learning as inference

Wednesday, March 10, 2010

Page 10: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Score based approaches (briefly)• Local search methods

- stochastic search (e.g., Heckerman et al., 1995)- over equivalence classes (e.g., Chickering 2002)- order based search (e.g., Teyssier et al., 2005)

• Exact search methods- dynamic programming (e.g., Koivisto et al., 2004, Singh et al.,

2005, Silander et al., 2006)- partial order covers (Parviainen et al., 2009)- branch and bound (e.g., de Campos et al., 2009 )

Wednesday, March 10, 2010

Page 11: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Dynamic programming methods work well for small structure learning problems ...

20 25 30 35 4010

0

101

102

103

104

number of variables

seconds

Exact methods

2 sec

24 sec

32 min

dynamicprogramming

Wednesday, March 10, 2010

Page 12: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Dynamic programming methods work well for small structure learning problems ...

20 25 30 35 4010

0

101

102

103

104

number of variables

seconds

Exact methods

2 sec

24 sec

32 min

dynamicprogramming

it should be possible tosolve some larger

problems much faster...

Wednesday, March 10, 2010

Page 13: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• We reduce the search over graph structures to a linear program over a polytope representing acyclic graphs- each vertex corresponds to an acyclic graph- interior points correspond to distributions over graphs

• Any solution obtained at a vertex is guaranteed to be optimal (“certificate of optimality”)

score

Overview of our approach

Wednesday, March 10, 2010

Page 14: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and vectors

1

2 3

Wednesday, March 10, 2010

Page 15: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and vectors

100000010100

parent set selectionfor variable 1

parent set selectionfor variable 2

parent set selectionfor variable 3

1

2 3

Wednesday, March 10, 2010

Page 16: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and vectors

100000010100

parent set selectionfor variable 1

parent set selectionfor variable 2

parent set selectionfor variable 3

1

2 3

Wednesday, March 10, 2010

Page 17: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and polytopes1

2 3

100000010100

Wednesday, March 10, 2010

Page 18: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and polytopes1

2 3

1

2 3

100000010100

010000101000

Wednesday, March 10, 2010

Page 19: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and polytopes1

2 3

1

2 3

100000010100

010000101000

Wednesday, March 10, 2010

Page 20: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and polytopes1

2 3

1

2 3

100000010100

010000101000

Wednesday, March 10, 2010

Page 21: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Graphs and polytopes1

2 3

1

2 3

100000010100

010000101000

1

2 3

Wednesday, March 10, 2010

Page 22: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Maximize

subject to

• Integral solution is optimal (“certificate of optimality”)

LP for structure learning

“expected” score

vertices are binaryvectors corresponding

to acyclic graphs

Wednesday, March 10, 2010

Page 23: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Maximize

subject to

• Integral solution is optimal (“certificate of optimality”)

• But the polytope has exponentially many facets...

LP for structure learning

“expected” score

vertices are binaryvectors corresponding

to acyclic graphs

Wednesday, March 10, 2010

Page 24: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

A cutting plane approach• We only need to fully characterize the polytope (linear

constraints) near the actual solution

parent set selectionprobabilities

scoresimple

constraints

Wednesday, March 10, 2010

Page 25: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

A cutting plane approach• We only need to fully characterize the polytope (linear

constraints) near the actual solution- solve first with the current constraints

parent set selectionprobabilities

current solutionscoresimple

constraints

Wednesday, March 10, 2010

Page 26: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

A cutting plane approach• We only need to fully characterize the polytope (linear

constraints) near the actual solution- solve first with the current constraints- find a violated constraint (separation problem)

parent set selectionprobabilities

current solutionscoresimple

constraints

violated simple constraint

Wednesday, March 10, 2010

Page 27: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

A cutting plane approach• We only need to fully characterize the polytope (linear

constraints) near the actual solution- solve first with the current constraints- find a violated constraint (separation problem)- resolve

parent set selectionprobabilities

current solution

new solution

score

violated simple constraint

simpleconstraints

Wednesday, March 10, 2010

Page 28: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

A cutting plane approach• We only need to fully characterize the polytope (linear

constraints) near the actual solution- solve first with the current constraints- find a violated constraint (separation problem)- resolve

parent set selectionprobabilities

current solutionscore

violated simple constraint

simpleconstraints

What are the linear constraints?How to find a violated constraint?How to solve the resulting (simpler) LP?

new solution

Wednesday, March 10, 2010

Page 29: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

The separation problem1

2 3

Wednesday, March 10, 2010

Page 30: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

The separation problem1

2 3

010001000010

parent set selectionfor variable 1

parent set selectionfor variable 2

parent set selectionfor variable 3

100010001000

In an acyclic graph, any subset (cluster) C must have

at least one variable withall its parents outside C

Wednesday, March 10, 2010

Page 31: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

LP relaxation for structure learning• Maximize

subject to parent set selections

cluster constraints

“expected” score

These constraints are facet defining but not sufficient to fully specify the polytope

Wednesday, March 10, 2010

Page 32: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Dual LP for structure learning • Minimize

subject to

local scores adjusted based on clusters

Wednesday, March 10, 2010

Page 33: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Dual LP for structure learning • Minimize

subject to

•Why the dual?- simpler constraints, closed

form coordinate updates

- clusters can be ranked by their effect on value

- any dual feasible point upper bounds the LP value (and the optimal score)

local scores adjusted based on clusters

Wednesday, March 10, 2010

Page 34: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Simple example problem• RNAi gene silencing experiments in C. elegans

• 25 phenotypic indicators/markers that characterize each experimental outcome

• We seek to build a Bayesian network model over the phenotypic markers in order to capture their coordinate variation across experiments

• Technical constraint: each variable can have at most four parents

Wednesday, March 10, 2010

Page 35: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• The LP is tight if we can find a structure (integral solution) that attains the dual value

0 50 100 150 200−6300

−6200

−6100

−6000

−5900

−5800

−5700

−5600

−5500

Dual solution

dual parameter updates

score

cluster constraintsadded

Wednesday, March 10, 2010

Page 36: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• We can use adjusted local scores from the dual solution to find good candidate structures (decoding)

Minimum regret decoding

local scores adjusted based on clusters

find an ordering of variables

select parents based on the ordering

candidate structure

Wednesday, March 10, 2010

Page 37: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!6300

!6200

!6100

!6000

!5900

!5800

!5700

!5600

!5500

seconds

score

Dual solution with decoding

• A structure is proved optimal if it attains the dual value

• The simple LP relaxation is (almost) tight for this problem

best integral reconstruction

dual value

Wednesday, March 10, 2010

Page 38: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• We can further tighten the approximation by iteratively partitioning the space of possible solutions and using the LP relaxation separately for each partition

• The dual LP (upper bound) is particularly effective for determining how the tree of partitions should be expanded

Branch and Bound

Wednesday, March 10, 2010

Page 39: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Partitions

parent set selectionprobabilities

current solution

score

cluster constraints

• We can tighten the approximation further by partitioning the polytope into segments and solving each segment separately

Wednesday, March 10, 2010

Page 40: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• We can tighten the approximation further by partitioning the polytope into segments and solving each segment separately

Partitions

parent set selectionprobabilities

current solution

score

(1)

(2)cluster constraints

Wednesday, March 10, 2010

Page 41: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• We can tighten the approximation further by partitioning the polytope into segments and solving each segment separately

parent set selectionprobabilities

Partitions

score

(1)

(2)

new solution

cluster constraints

Wednesday, March 10, 2010

Page 42: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

parent set selectionprobabilities

Partitions

score

(1)

(2)

new solution

cluster constraints

• We can tighten the approximation further by partitioning the polytope into segments and solving each segment separately

Wednesday, March 10, 2010

Page 43: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Solution...• The highest scoring structure is found after a small

number of branch and bound partitions

0 50 100 150 200 250!6095

!6090

!6085

!6080

!6075

!6070

!6065

!6060

branch and bound expansions

score

LP value

best integral reconstruction

8 seconds

certificate ofoptimality

Wednesday, March 10, 2010

Page 44: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

• Dynamic programming methods work well for small structure learning problems ...

20 25 30 35 4010

0

101

102

103

104

number of variables

seconds

Recall: exact methods

2 sec

24 sec

32 min

dynamicprogramming

Wednesday, March 10, 2010

Page 45: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

LP relaxation with BB• LP relaxation combined with branch and bound

performs similarly to dynamic programming methods

20 25 30 35 4010

0

101

102

103

104

number of variables

seconds

2 sec

24 sec

32 min

8 sec

21 mindynamic

programminglinear

programming withbranch and bound

Wednesday, March 10, 2010

Page 46: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

The alarm network• 37 variables, 1000 data points

case # x1 x2 x3 x37

1

2

3

4

10,000

3

2

1

3

2

3

2

3

2

2

2

2

3

3

2

4

3

3

1

3

!

!

" #

17

25

6 5 4

19

27

20

10 21

37

31

11 32

33

22

15

14

23

13

16

29

8 9

2812

34 35 36

24

30

72618

321 (a)

(b)

17

25 18 26

3

6 5 4

19

27

20

10 21

35 3736

31

11 32 34

1224

33

22

15

14

23

13

16

29

30

7 8 9

28

21

(c)

17

25

6 5 4

19

27

20

10 21

37

31

11 32

33

22

15

14

23

13

16

29

8 9

2812

34 35 36

24

30

72618

321 (d)

deleted

(Heckerman et al. 1995)

Wednesday, March 10, 2010

Page 47: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

The alarm network

20 25 30 35 4010

0

101

102

103

104

number of variables

seconds

2 sec

24 sec

32 min

8 sec

21 min

40 hours

dynamicprogramming

linearprogramming withbranch and bound

• Dynamic programming methods are no longer practical without additional constraints ...

Wednesday, March 10, 2010

Page 48: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

22 24 26 28 30 32 34 36 3810

0

101

102

103

104

number of variables

seconds

The alarm network

2 sec

24 sec

32 min

8 sec

21 min

40 hours

7 min

dynamicprogramming

linearprogramming withbranch and bound

• Dynamic programming methods are no longer practical without additional constraints ...

Wednesday, March 10, 2010

Page 49: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

22 24 26 28 30 32 34 36 3810

0

101

102

103

104

number of variables

seconds

The alarm network

2 sec

24 sec

32 min

8 sec

21 min

40 hours

7 min

dynamicprogramming

linearprogramming withbranch and bound

300x

• Dynamic programming methods are no longer practical without additional constraints ...

Wednesday, March 10, 2010

Page 50: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

0 500 1000 1500 2000 2500 3000 3500 4000 4500!10000

!9950

!9900

!9850

!9800

!9750

branch and bound expansions

score

Anytime solution

LP value

best integral reconstruction

Wednesday, March 10, 2010

Page 51: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

0 500 1000 1500 2000 2500 3000 3500 4000 4500!10000

!9950

!9900

!9850

!9800

!9750

branch and bound expansions

score

Anytime solution

LP value

best integral reconstruction

optimal solutionfound

certificate ofoptimality

2 min 7 min

Wednesday, March 10, 2010

Page 52: Graphs and polytopes: learning Bayesian networks with LP ...people.csail.mit.edu/tommi/papers/BNstructure_slides.pdf“Fundaci´on Rafael del Pino” Fellow. References [1] Adam Arkin,

Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out

Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./

0#,12(#'

3-142(#

5&1-6#

7&8.6,

OR3 OR1OR2cI cro

cI RNAp2

Arkin et al. 1998

• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.

50

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

3

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(a) OR3

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

2

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(b) OR2

10!2

100

102

0

0.2

0.4

0.6

0.8

1

Binding in OR

1

frepressor

/fRNA

Bin

din

g F

req

ue

ncy (

tim

e!

ave

rag

e)

RepressorRNA!polymerase

(c) OR1

Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.

became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.

7 Discussion

We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.

The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.

Acknowledgments

This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.

References

[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.

[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.

[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.

[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.

[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.

10

• Predictions are again qualitatively correct

52

Summary• Finding the highest scoring Bayesian network structure

from data is a hard combinatorial problem... but the “hard instances” may not be typical

• Our “anytime” approach to structure learning is based on linear programming relaxations that are iteratively refined in a cutting plane fashion

• The approach relies fundamentally on understanding the facets of the polytope corresponding to acyclic graphs

Wednesday, March 10, 2010