50
1 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005 Using BioCyc to reconstruct constraint-based models of metabolism : the Acinetobacter ADP1 case study François Le Fèvre & Vincent Schachter Genoscope, Evry, France

Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

1 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Using BioCyc to reconstruct constraint-based models ofmetabolism : the Acinetobacter ADP1 case study

François Le Fèvre & Vincent SchachterGenoscope, Evry, France

Page 2: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

2 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Genoscope’s scientific focus

Genoscope :– French national genome center– Research institute

Themes :• Genome annotation, genome evolution

– Procaryote genome annotation– Paramecium annotation and genome evolution : reconstructing ancestral

genomes• Procaryote metabolism

– Wet : efficient mutagenesis and phenotyping platform, biochemistry, adaptiveevolution experiments

– Dry : modeling and analysis of regulatory and metabolic networksSome ongoing projects :

« Metabolic Thesaurus » : elucidating the metabolism of a model bacterium,Acinetobacter ADP1

Adaptive evolution experiments on bacteria « Project Biodiversity - Cloaca maxima » : analysis of a large bacterial

metagenome (samples from water retreatment facility) -> bacterial diversity andbiodegradation

Page 3: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

3 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Outline

• Context : the Metabolic Thesaurus Project

• Reconstructing a global model of Acinetobacter ADP1 metabolism– Ab-initio reconstruction of a global metabolic model– Initial model refinement using phenotype experiments and predictions– Use of BioCyc as a “middleware”

• Ongoing development of OOcyc– an API (+ RDB + O/R mapping) aimed at querying/ manipulating

BioCyc information in an object oriented manner

• Conclusions & future work

Page 4: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

4 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Outline

• Context : the Metabolic Thesaurus Project

• Reconstructing a global model of Acinetobacter ADP1 metabolism– Ab-initio reconstruction of a global metabolic model– Initial model refinement using phenotype experiments and predictions– Use of BioCyc as a “middleware”

• Ongoing development of OOcyc– an API (+ RDB + O/R mapping) aimed at querying/ manipulating

BioCyc information in an object oriented manner

• Conclusions & future work

Page 5: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

5 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

• Bacterial metabolism is very diverse and not well known outside of E.coli– What can we learn of the metabolism of a new model bacterium ?– Can metabolic modeling help ?

• Experimental context :– Reliable genome annotation– Medium/high throughput mutant construction capability– Medium/high throughput phenotyping capability– Additional experiments on an ad-hoc basis :

• Enzyme kinetics• Gene expression data• Metabolic concentrations

• Modeling goals :– Reconstruction of global metabolic network model

• Initial model reconstruction• Correct/complete model iteratively using phenotype experimental results• Reach best fit between phenotype experiments and phenotype predictions

– Improve modeling framework to account for regulatory effects– Assess information content and structure of large-scale phenotype data– Help with experiment design

The Metabolic Thesaurus Project

WORK IN PROGRESS

Page 6: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

6 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Choosing a model organism : Acinetobacter sp. ADP1

• Acinetobacter sp. ADP1– γ-proteobacteria, Pseudomonales group– Gram negative– Soil bacterium– Non flagellated

• A good model for the study of procaryote metabolism– Nutritionally versatile– Strictly aerobic– Fast growing– Highly competent (natural transformation)– Non-pathogenic– Fairly close to E.coli, closer to Pseudomonas putida– Recent evidence of xenobiotic degradation capabilities

Page 7: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

7 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Known genes : 1150 genesPutative genes : 907 genesConserved Hypothetical protein : 686 genes (21%)Hypothetical proteins: 458 genes (14%)

36% of CDS haveunknown functions

Expert annotation of Acinetobacter sp. ADP1 : 3201 CDS Barbe et al., 2004

Genome Annotation

Page 8: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

8 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Collection of gene replacement mutants of Acinetobacter sp. ADP1:

Systematic gene knockout project using a gene replacement procedure consisting of an excision of the targetgene via homologous recombination and insertion of a kanamycin resistance marker (Metzgar et al., 2004)

Known genes : 684 KO (59% of all known genes)Putative genes : 736 KO (81% of all putative genes)Conserved hypothetical protein : 482 KO (70% of all CHPs)Hypothetical proteins: 247 KO (54% of all HPs)

~300 KO could not be obtained on succinate.Almost all are known to be essential in E. coli

Systematic phenotyping of the collection on carbon sources

Currently :~25000 phenotype data points 18 carbon sources + 2 nitrogen source on 1350 KO

2150 KO

Under development :• high-throughput phenotyping on the collection for ~50 C/N/S sources• 1200 KO/plate

Status of the collection:

Page 9: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

9 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

• Variables of interest : metabolite fluxes– State of the system : distribution of fluxes through all reactions

• Steady-state assumption

• Key ideas :– Reason on sets of metabolic states rather than on unique « solution » states

• Allows for incompleteness or inaccuracies in both the reaction network structure andthe experimental constants

• The feasible solution space can be progressively reduced using constraints(thermodynamic, input/output, regulatory…)

– Allows prediction of optimal (or good) flux distributions given• an optimization principle• a defined metabolic objective (or several objectives)

– Characterize structural features of flux distributions resulting from constraints• Evolutionary studies• Correlation with regulatory effects

The constraint-based modeling framework(aka global stoichiometric models)

Page 10: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

10 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Acinetobacter’s genome annotation

Select a set ofreactions

Design a fluxmodel

Check modelconsistency

AcinetoCyc’s network representation

AcinetoCBM model v.n+1 beta

Compare model predictions

with experimental results

AcinetoCBM model v.n

AcinetoCBM model v.n+1

Adapt the set of reactions to the constraints of theflux models: mass balance, reactions’ reversibility.Build genes-reactions correspondence (detectcomplexes, isozymes).Define fluxes modeling the production of biomass.

Detect missing reactions.Improve gene-reaction correspondence.Test network against metabolic objectives to assessits relevance.

Iterativeprocess of

networkimprovement

Expert

Curation

Iterative building of Acinetobacter’s flux model

Page 11: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

11 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Reconstruction infrastructure

PGDB

JavaCyc

OoCyc

Pathway Tools

Flux Models

FluxAnalyzer +

ExtensionsFluxDB

LIMSPhenotypes,biochemistry

Caphe

Genome

MaGe PkGDB

Matlab Cytoscape R

API

Page 12: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

12 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Pathways added in the curation process– Entner-Doudoroff,– Entner-Doudoroff Neoglucognesis Pentose Phosphate Superpathway,– Degradation of glucarate and galactarate,– Ferulate catabolism,– Caffeate catabolism,– Superpathway of hydroxycinnamates catabolism,– …, with corresponding new compounds and reactions.

3428Genes

3377Polypeptides

3Transport Reactions

1198Enzymatic Reactions

234Pathways

76tRNAs2274Transcription Units864Compounds3 !Transporters825Enzymes8+276Protein Complexes

Acinetocyc curation

Page 13: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

13 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Reconstruction of Acinetobacter’s metabolic network

Centralmetabolism

Amino acidsbiosynthesis

Nucleotidesbiosynthesis

Transportreactions

Lipidsmetabolism

Alternate carbonsources

Cofactorsbiosynthesis

64

10059 819352

86

535 reactions using 498 metabolites linked with 480 genesdispatched in 7 functional sub-networks.

297 enzymatic complexes have been inferred by naïve comparison with E. coli. (Draftversion: expert curation to be performed)

Page 14: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

14 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Reconstruction of Acinetobacter’s metabolic network

– Simulate growth by a reaction using the precursor metabolites ofbiomass

– Optimize the flux distribution to maximize this growth flux

Amino acids

PhospholipidsLipopolysaccharidesPeptidoglycan, other glycans

Nucleotides

ATP production

CoA , THFTHF, NAD ...

Proteins

Membrane components

DNA, RNA

Energy

Cofactors

Biomass

Global growth reactionPrecursors Biomass componentsPre-assembly reactions

Page 15: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

15 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Constraint-based models are not deduced trivially from PGDBs(& vice-versa !)…

Well definedreactions

AcinetoCyc

• Transport reactions• Biomass reaction(s)• Precursor assembly reactions• Polymerization reactions

Flux modelWell definedreactions

AcinetoCyc

• Reactions with no corresponding gene• Unbalanced reactions• Non-metabolic reactions• Reactions with undefined compound(s)• Reactions with generic compound• Reactions with undefined reversibility

Flux model

Page 16: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

16 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Prediction of mutant phenotypes

• Principle: for each mutant phenotype “experiment”,– determine which reaction to block (simulate the knock-out),– determine exchange fluxes to open (simulate the growth medium),– optimize biomass flux.

Objective flux1 I + 1 J

Objective flux1 I + 1 J

• Growth on B alone is possible. • Growth on B alone is impossible

• Growth on A and B is possible.

Deleted reaction

Wild type network Mutant network

Page 17: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

17 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Prediction of mutant phenotypes

• Discretization of predicted biomass flux values :» no growth (-),» slow growth relative to the wild type (+),» normal growth (++).

• Current dataset :• 480 in silico KO (metabolic genes)• 1300 experimental KO + 195 genes suspected to be essential overlap of 186 genes.

Page 18: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

18 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Analysis of local inconsistencies

34 inconsistencies with first set of experiments x predictions

1/01/01/01/01/01/0P -P antetheineadenylyltrans ferase2893

1/11/11/11/01/11/1G3P dehydrogenase0546

1/11/01/01/01/01/0O xoglutaratedehydrogenase2876

1/11/01/01/01/01/0O xoglutaratedehydrogenase2875

1/01/01/01/01/01/0Glutamate--tRNA ligase3371

1/01/01/11/01/11/1Glutamate--tRNA ligase0272

0/11/11/11/10/10/1Succ inate dehydrogenase2879

0/10/10/10/10/10/1A sparaginase0476

Succ.Pyru.Lact.Gluc.Acet.akG

lInitial enzymatic activityGene

Missing isozyme, later confirmed

Hypothesis: pair of isozymes, constitutive versusregulated expression

Evidence for the existence of a complex ratherthan isozymes

CoA had to be added to biomass production,forcing completion of co-factor synthesispathways

Possible explanations

Detection of experimental error: gene was notdeleted

Detection of experimental error: gene in Entner-Doudoroff operon

IS: in silico / IV: in vitro.

0 : 0 no growth 1 : WT value normal growth

0/1

1/0 1/1

False negative

False positive True positive

Page 19: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

19 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Predictive quantitative growth : preliminary results

Experiment :– aerobic growth of wt Acinetobacter on glutamate + nh4 + minimal media– carbon source consumption and biomass production

• Preliminary but encouraging results• More experimental data coming…• Possible improvements :

– Refine energy consumption-related pathways– Refine biomass composition

In silico vs in vivo growth rates

0

5

10

15

20

25

30

35

40

45

50

0 1 2 3 4 5 6 7 8 9 10

Time (h)

Glu

tam

ate

up

tak

e r

ate

(m

mo

l/h

/gD

W)

0

0,5

1

1,5

2

2,5

Gro

wth

ra

te (

1/h

)

Exp. Glutamate Up. Rate

Exp. Growth Rate

Sim. Growth Rate

Page 20: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

20 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Visualizing the progress of growth phenotype assays

Page 21: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

21 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Ongoing related work

• Development of methods to learn metabolic models from experimental data– Use phenotype profiles systematically to refine models– Extend modeling framework to exploit information on metabolite concentration

• Metabolism & regulation ?• Assessment of flux coupling graph variability with environmental change

– Most coupling relationships are either very specific or very generic– Some coupling relationships exhibit more suprising patterns of occurrence

• Identify new enzymatic functions from a wastewater retreatmentmetagenome– species inventory , enzymatic activities inventory, gene inventory

• Estimate : ~1000 species, mostly uncultured– assessment (some) metabolic processes in wastewater retreatment– those enzymatic functions that can’t be found by straigthforward homology…

Page 22: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

22 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Outline

• Context : the Metabolic Thesaurus Project

• Reconstructing a global model of Acinetobacter ADP1 metabolism Ab-initio reconstruction of a global metabolic model Initial model refinement using phenotype experiments and predictions Use of BioCyc as a “middleware”

• Ongoing development of OOcyc an API (+ RDB + O/R mapping) aimed at querying/ manipulating

BioCyc information in an object oriented manner

• Conclusions & future work

Page 23: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

23 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Existing means to query/manipulate BioCyc information(as far as we know !)

• Graphical User Interface– Web-based– PathwayTools Interface

• Through the Pathway tools server– Lisp API– PerlCyc– JavaCyc

• Exports– text files, attribute-value or tab-delimited– SBML format– BioPAX format

• Biowarehouse– SQL queries

Page 24: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

24 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Accessing BioCyc information : the Lisp API

• Need to know Lisp• Low-level mySQL database schema SQL queries are…tedious-

• Lisp is the native language of Pathway-tools,• Multi-user editing (RDBMS versions)• Access to a transaction log of all PGDB edits

+

Ocelot

Oracle

MySQL

Lisp

Pathwaytools

server Oct 2005

Page 25: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

25 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Accessing BioCyc information : the JavaCyc and PerlCyc APIs

Ocelot

Oracle

MySQL

Lisp

Pathwaytools

server

JavaCyc

PerlCyc

• No representation of GFP entities in Perl or Java• « Read-only » (JavaCyc)• Need to run a special socket server• Only one such connection can be opened• OS dependant : Unix only

-

+ • Solutions for Lisp function access through java or perl.

Page 26: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

26 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Accessing BioCyc information : the Biowarehouse API

Ocelot

Oracle

MySQL

BioCyc

loader

Biowarehouse

MySQL

• « read-only »• loss of information wrt BioCyc model (transcription unit, citation, compound-

pathway classification, functional category …)• querying is a bit tedious

• relational schema closer to biological concepts• a solution to query BioCyc info integrated with external data• simple installation• fast data upload

-

+

SQL

Page 27: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

27 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Motivations for OOcyc development

Within the context of Acinetocyc curation and Acinetobacter’s CBMreconstruction :

• Overcome some limitations of BioCyc + Pathway tools– OS-dependency (JavaCyc : Unix)– no concurrent access (until recent MySQL version)– memory management (load entire PGDB in memory)

• Manipulate objects close to natural biological representations via (non-LISP) queries

• Avoid loss of information with respect to Biocyc’s model

• Add import / export features

• Benefit from existing Java tools / packages

before release 9.5…

Page 28: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

28 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

What is OoCyc ?

• OoCyc is an Object Oriented API aimed at accessing andmanipulating BioCyc information

• OoCyc stores information in a relational database in (your favorite)RDBMS

• OoCyc maps :– the FRS model onto an OO representation (& vice-versa)– frame instances onto objects (& vice-versa)

• OoCyc imports / exports Biocyc information (frame instances)from/to XML files (serialized objects)

Page 29: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

29 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

From Frames … … to Java classes

OoCycBioCycFrame

SlotValue

Annotation1

Annotation1.Value

RXN-1

Left

Glucose

Compartment

Cytoplasm

SlotValue

Parents

GeneralizedReactions

Left

Value=GlucoseCompartment=Cytoplasm…

GeneralizedReactions

Id=RXN-1…

Chemicals

Id=glucose…

Slot …

Top level parent ClassInheritance path association classesSlot Association classSlot Value Attribute valueAnnotation Attribute nameAnnotation.value Attribute value

Mapping rules

Page 30: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

30 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

BioCyc RDBMS

Schema Generator

Mappingrules

JaxB

XmlTools

XMLSchema

Hibernate

O/Rmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

OoCyc’s pipeline - initial step: code and database generation

Page 31: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

31 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

BioCyc RDBMSJAVAObject

Hibernate

O/Rmapper

OO / FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

OoCyc’s pipeline - daily step: data synchronization

Page 32: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

32 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

OoCyc’s software architecture

RDBMS

Hibernate

XML

Jaxb

GenoscopeSoftware, toolsData, Files

MySQL /DB2 / PostgreSQLOracle / Sybase / HSQLMicrosoft SQL server …

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xml mapping

SchemaGenerator

OO / FRSTranslator

ObjectModel

XMLTools

OoCyc

Page 33: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

33 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Beneficits of using OoCyc :Java API matching biological concept

In your favorite IDE, one click access to your OOCyc class-specific functions…

Page 34: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

34 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Beneficits of using OoCycQuery in OQL (Object Query Language)

1 (defun atp-inhibits () ;; We check every instance of the class2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) ;; Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP) ;; Whenever the test is positive, we collect the value of the

slot ENZYME The collected values are returned as a list, oncethe loop terminates.

4 collect (get-slot-value x ’ENZYME))5 ) ;; invoking the query:6 (select-organism :org-id ’ECOLI)7 (atp-inhibits)

Lisp : Find all enzymes for which ATP is an inhibitor

Page 35: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

35 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

1 use perlcyc;2 my $cyc = perlcyc -> new("ECOLI");3 my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-

Reactions|"); ## We check every instance of the class4 foreach my $er (@enzrxns) { ## We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -> member_slot_value_p($er,"Inhibitors-

All", "Atp");6 if ($bool) { ## Whenever the test is ≥0, we collect the value of the slot ENZYME. The results are printed in theterminal7 my $enz = $cyc -> get_slot_value($er, "Enzyme");8 print STDOUT "$enz\n";9 }10 }

Beneficits of using OoCycQuery in OQL (Object Query Language)

PerlCyc : Find all enzymes for which ATP is an inhibitor

Page 36: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

36 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Beneficits of using OoCycQuery in OQL (Object Query Language)

1 import java.util.*;2 public class JavacycSample {3 public static void main(String[] args) {4 Javacyc cyc = new Javacyc("ECOLI");5 ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-

Reactions|");6 for (int i = 0; i < enzrxns.size(); i++) {7 String er = (String)enzrxns.get(i);8 boolean bool = cyc.memberSlotValueP(er, "Inhibitors-

All","Atp");9 if (bool) {10 String enz = cyc.getSlotValue(er, "Enzyme");9 System.out.println(enz);11 }12 }13 }14 }

JavaCyc : Find all enzymes for which ATP is an inhibitor

Page 37: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

37 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Beneficits of using OoCycQuery in OQL (Object Query Language)

1 SELECT DISTINCT DBID.xid2 FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,

EnzReactionInhibitorActivator3 WHERE DataSet.name=EcoCyc4 AND DataSet.wid=EnzymaticReaction.datasetwid5 AND EnzymaticReaction.proteinwid = Protein.wid6 AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid7 AND EnzReactionInhibitorActivator.compoundwid=Chemical.wid8 AND EnzReactionInhibitorActivator.inhibitoractivate=I9 AND Chemical.name=ATP10 AND DBID.otherwid = Protein.wid

Biowarehouse : Find all enzymes for which ATP is an inhibitor

Page 38: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

38 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Beneficits of using OoCycQuery in OQL (Object Query Language)

• OQL is a powerful and easy-to-use SQL-like query language– object-oriented, understanding notions like inheritance, polymorphism and

association.

1 SELECT reaction FROM ENZYMATICREACTIONS reaction2 LEFT JOIN reaction.INHIBITORSALLInternal inhibitors3 WHERE reaction.ORGANISM.VALUE LIKE :organism

AND4 "ATP" = inhibitors.VALUE

OoCyc : Find all enzymes for which ATP is an inhibitor

Page 39: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

39 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Summary of representation & query solutions

YesYesNoIndependent fromPathway tools

Yes / YesYes / No-Import / Exportto BioCyc

Full(but not perfect…)

Partial-BioCycCompatibilty

Java-Lisp, Perl, JavaAPI for :

OQL SQLJava

SQLSQLLispQuery

mySQL , OracleSybase,PostgreSQL …

mySQLmySQLoracle

OcelotStorage

Object modelRelational modelFRSRepresentationSystem

OoCycBiowarehouseBioCyc

Page 40: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

40 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

What did we use OoCyc for ?

• Extract data from BioCyc– to integrate them with other data types

• clusters of transcription units, pathways, complexes, Regulon DB..– to automate more steps of constraint-based building :

• group together Pathways for map design in Flux Analyzer,• assemble elementary bricks of the metabolic network, based on reactions / genes /

compounds found in BioCyc,– to manipulate and visualize metabolic graphs (Cytoscape)– also : export gene-reaction correspondances

• Link Biocyc to other DBs, develop BioCyc-related algorithms– Genoscope’s LIMS use BioCyc compounds repository as a language of media

representation,– Protein complexes comparative inference

• Modify or implement new housekeeping functions for a PGDB– add / update data from a given annotation database (e.g. multifun, cellular

localization of genes products, pubmed links...)– delete / annotate pathways as part of curation process

• keep that information even after annotation or BioCyc updates

Page 41: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

41 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Ongoing & future work

• Improve translation scheme between FRS and Java classes– Map complete inheritance hierarchy– Generate associations directly instead of using Ids– cardinality constraints

• Software platform dedicated to multi-CBM management– Basic constraint-based model management– Representation of sets of environments– Representation of perturbation sets, generation of corresponding

models– Pluggable analysis modules (topological analyses, phenotype

predictions…)

Page 42: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

42 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Open questions about OOCyc

• Would OOCyc be useful outside of Genoscope ?

• If so, what would be the primary uses ?– Java access to Biocyc entities– store PGDBs in relational databases– manipulate classes representing biological concept– …

• Packaging– jar files ?– eclipse project ?– eclipse plugin ?– Sourceforge project ?

Page 43: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

43 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Acknowledgements

Metabolic Thesaurus

Alain PerretValérie BarbeGorges CohenAnnett KreimeyerNuria FonknechtenVeronique De BernardinisMarcel SalanoubatJean Weissenbach

Computational Biology Group

Pierre-Yves BourguignonCyril CombeMaxime DurotFrancois Le FevreMatthias HeinigBenjamin PinaudSerge Smidtas

AGCDavid VallenetClaudine Mé[email protected]

SRIPeter Karp& group

IT group

Ali BoudaniLaurent Sainte-MartheClaude Scarpelli

Page 44: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

44 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

OoCyc:the development of OoCyc, an additional API

to query, manipulate BioCyc information in anObject Oriented manner

Part two

Page 45: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

45 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

References• Flux balance analysis

•Metabolic modelling of microbial strains in silico ; Covert et al ; Trends in Biochemical Sciences 2001,26(3):179-186.•Genome-Scale Metabolic Model of Helicobacter pylori 26695 ; Schilling et al ; Journal of Bacteriology 2002,184(16):4582-459•Metabolic pathways in the post-genome era ; Papin et al ; Trends in Biochemical Sciences 2003, 28:250-258.•Theory for the Systemic Definition of Metabolic Pathways and their use in Interpreting Metabolic Function froma Pathway-Oriented Perspective ; Schilling et al ; J. theor. Biol. 2000, 203:229-248.•Assessment of Metabolic capabilities of Haemophilus influenzae Rd through a genome-scale pathway analysis ;Schilling et al ; J. theor. Biol. 2000, 203:249-283.•Metabolic network structure determines key aspects of functionality and regulation ; Stelling et al ; Nature 2002,420:190-193.•Characterizing the Metabolic Phenotype: A Phenotype Phase Plane Analysis ; Edwards et al ; Biotechnology andBioengineering 2002, 77:27-36.•In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data ; Edwards etal ; Nature 2001, 19:125-130.

• Biocyc /MetaCyc /EcoCyc•Karp P.D., Arnaud M., Collado-Vides J., Ingraham J., Paulsen I.T., Saier M.H. Jr. (2004). "The E. coli EcoCycDatabase: No Longer Just a Metabolic Pathway Database." ASM News 70(1): 25-30.•Cynthia J. Krieger, and Peter D. Karp, MetaCyc: a multiorganism database of metabolic pathways and enzymes,

Nucleic Acids Research, 32(1):D438-42 2004.• P. Karp, S. Paley, and P. Romero, « The Pathway Tools Software », , Bioinformatics 18:S225-32 2002.• S. Paley and P. Karp, "Evaluation of computational metabolic-pathway predictions for H. pylori,"Bioinformatics 18(5):705-14 2002.• Krummenacker M, Paley S, Mueller L, Yan T, Karp PD. , Abstract Querying and computing with BioCycdatabases. Bioinformatics. 15;21(16):3454-5. 2005.

Page 46: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

46 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Our general strategyAcinetoCyc

Model

Thesaurus

V.n

V.n+1

Building Acineto’metabolic network

BuildingAcineto’s

metabolic model

FBA anaysis

Annotationcomplements

Hypothesis formulationExperiment design

Metabolic NetworkRefinement

Static metabolicnetwork representation

Experimental results

Annotation database

Metabolic Flux Models

PkGDB

Model perturbationgene deletion, fluxcontraints, regulationModel topology analysis

Page 47: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

47 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

BioCyc and OoCyc’s architecture

OoCyc

MySQL

Flux Analyser

JungCytoscape Modulo

MetabolicModel

+

Hibernate

Xml

Jaxb

+MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysisMade in Genoscope

Software, tools

Data, Files

Ocelot Oracle MySQL

Pathway-toolsGeneric Frame Protocole

Xserver

Webserver

PerlCyc

JavaCyc

Lisp

DB2 / PostgreSQLOracle / Sybase / HSQLMicrosoft SQL server …

HyperJaxb

supplementaryslides

Page 48: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

48 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

OoCyc’s architecture

GenoscopeSoftware, toolsData, Files

Schema Generator

BioCyc(Un)marshaller

ObjectModel

XmlTools

OoCyc

Flux Analyser

JungCytoscape Modulo

MetabolicModel

…+ MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysis

Page 49: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

49 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

True query in OQLpublic List getAllErInhibitedBy(String org, String inhibitor) {

try {Query hql = HibernateUtil.getSession(keySession).createQuery("select

distinct er fromfr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "+

" join er.INHIBITORSALLInternal inhibitors" +" where " +“ er.ORGANISM.VALUE like :organism and " +“ :inhibitor = inhibitors.VALUE" +“ ");

hql.setString("organism", org);hql.setString("inhibitor", inhibitor);return hql.list();

} catch (HibernateException e) {e.printStackTrace();

}return null;

}

Page 50: Using BioCyc to reconstruct constraint-based models of ...sergi5.com/cyclone/genova2005.pdf · –Ab-initio reconstruction of a global metabolic model –Initial model refinement

50 OOcyc - F.Lefèvre & V.Schachter, Genoscope, - Pathway Tools User’s Group meeting, SIB, 12/2005

Java APIpackage fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

public class Acineto {public static void main(String[] args) {

ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");List r = erdao.getAllErInhibitedBy("ecoli","ATP");ENZYMATICREACTIONS er;for(int i=0; i < r.size(); i++){

er = (ENZYMATICREACTIONS)r.get(i);System.out.println(

er.getFRAME()+"\t"+((ENZYME)er.getENZYME().get(0)).getVALUE());}

}