Representing Biological Processes: The Reactome Database Gopal Gopinathrao 1 & Peter D’Eustachio 1,2 1 Cold Spring Harbor Laboratory 2 NYU School of Medicine [email protected] [email protected]

Page 1: Representing Biological Processes: The Reactome Database

Representing Biological Processes: The Reactome Database

Gopal Gopinathrao (1) & Peter D’Eustachio (1,2)

(1) Cold Spring Harbor Laboratory
(2) NYU School of Medicine

[email protected]@med.nyu.edu

Page 2: Representing Biological Processes: The Reactome Database

Reactome is
- reductionist. All of biology can be represented as events that convert input physical entities into output physical entities.
- a generic parts list. Tissue and state specificity of events are not captured.
- qualitative. Kinetic parameters and data are not captured.
- human-centric. Experiments can use reagents from diverse sources, but most biological processes take place in a single species, and our focus is on human biological processes.
- manually curated. Events are annotated by expert curators, and linked to published data.
- open source. All data and software are freely downloadable and reusable.

Page 3: Representing Biological Processes: The Reactome Database

Data model in a nutshell

[Diagram: a Pathway groups sub-Pathways and Reactions. A Reaction has inputs (Input 1, Input 2) and outputs (Output 1, Output 2), and can carry a CatalystActivity and a Regulation.]
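One way to read the diagram labels above is as a small set of record types. The sketch below is illustrative only: the class and field names (`PhysicalEntity`, `Reaction`, `Pathway`, `catalyst_activity`, `regulation`, `compartment`) are assumptions made for this example and are not the actual Reactome schema. The hexokinase reaction is used simply as a familiar biochemical example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class PhysicalEntity:
    """A molecule or complex; the compartment field is an illustrative assumption."""
    name: str
    compartment: str = "cytosol"


@dataclass
class Reaction:
    """An event that converts input physical entities into output physical entities."""
    name: str
    inputs: List[PhysicalEntity]
    outputs: List[PhysicalEntity]
    catalyst_activity: Optional[str] = None  # hypothetical free-text label
    regulation: Optional[str] = None         # hypothetical free-text label


@dataclass
class Pathway:
    """A named grouping of reactions and, recursively, of sub-pathways."""
    name: str
    events: List[Union["Pathway", Reaction]] = field(default_factory=list)

    def reactions(self) -> List[Reaction]:
        """Flatten the hierarchy into a list of reactions, depth first."""
        found: List[Reaction] = []
        for event in self.events:
            if isinstance(event, Reaction):
                found.append(event)
            else:
                found.extend(event.reactions())
        return found


glucose = PhysicalEntity("glucose")
atp = PhysicalEntity("ATP")
g6p = PhysicalEntity("glucose 6-phosphate")
adp = PhysicalEntity("ADP")

phosphorylation = Reaction(
    name="glucose + ATP => glucose 6-phosphate + ADP",
    inputs=[glucose, atp],
    outputs=[g6p, adp],
    catalyst_activity="hexokinase activity",
)
glycolysis = Pathway("Glycolysis", [phosphorylation])
carbohydrate = Pathway("Carbohydrate metabolism", [glycolysis])

for reaction in carbohydrate.reactions():
    print(reaction.name)
```

Nesting `Pathway` inside `Pathway` mirrors the way the diagram shows a pathway containing both reactions and another pathway.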

Page 4: Representing Biological Processes: The Reactome Database

Annotating more details
- post-translational modifications of proteins
- exact locations of entities and events

Annotating more ambiguities
- sets of entities: defined, open, and candidate
- incompletely specified entities
- “black box” reactions

Page 5: Representing Biological Processes: The Reactome Database

A geometrical compartment set for locating molecules in human cells

Page 6: Representing Biological Processes: The Reactome Database

The starry sky view of all of Reactome

[Figure labels:]
- Hemostasis
- Apoptosis
- Insulin signaling
- Notch signaling
- Glucagon signaling
- Cell cycle & DNA replication
- DNA repair
- Transcription
- Translation
- Post-translational modifications
- TCA cycle
- Lipid metabolism
- Amino acid metabolism
- Nucleotide metabolism
- Xenobiotic metabolism
- Carbohydrate metabolism
- Sterol metabolism
- HIV & Influenza life cycles

Page 7: Representing Biological Processes: The Reactome Database

Reactome Home Page

http://brie8.cshl.edu/cgi-bin/frontpage?DB=gk_central

Page 8: Representing Biological Processes: The Reactome Database

Reactome Event Page

http://brie8.cshl.edu/cgi-bin/eventbrowser?DB=gk_central&ID=163767&

Page 9: Representing Biological Processes: The Reactome Database

Export Formats

<owl:Ontology rdf:about="">
  <owl:imports rdf:resource="http://www.biopax.org/release/biopax-level2.owl" />
  <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BioPAX pathway converted from "DNA Replication" in the Reactome database.</rdfs:comment>
</owl:Ontology>
<bp:pathway rdf:ID="DNA_Replication">
  <bp:PATHWAY-COMPONENTS rdf:resource="#Regulation_of_DNA_replicationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_strand_elongationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_replication_initiationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#Switching_of_origins_to_a_post_replicative_stateStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_Replication_Pre_InitiationStep" />
  <bp:ORGANISM rdf:resource="#Homo_sapiens" />
  <bp:NAME rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DNA Replication</bp:NAME>
  <bp:SHORT-NAME rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DNA Replication</bp:SHORT-NAME>
  <bp:XREF rdf:resource="#Reactome69306" />
  <bp:XREF rdf:resource="#REACT_383.2" />
  <bp:COMMENT rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Studies in the past decade have suggested that the basic mechanism of DNA replication initiation is conserved in all kingdoms of life. Initiation in unicellular eukaryotes, in particular Saccharomyces cerevisiae (budding yeast), is well
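As a rough illustration of how an export like the one above might be consumed, the sketch below lists the components of each pathway using only Python's standard library. The excerpt on the slide is truncated and shows no namespace declarations, so the example parses an abbreviated, well-formed stand-in (`BIOPAX_SNIPPET`) and assumes the usual RDF and BioPAX Level 2 namespace URIs. The function name `pathway_components` is invented for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide excerpt does not show the declarations.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BP_NS = "http://www.biopax.org/release/biopax-level2.owl#"

# Abbreviated, well-formed stand-in for the truncated export on the slide.
BIOPAX_SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:bp="{BP_NS}">
  <bp:pathway rdf:ID="DNA_Replication">
    <bp:PATHWAY-COMPONENTS rdf:resource="#Regulation_of_DNA_replicationStep" />
    <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_strand_elongationStep" />
    <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_replication_initiationStep" />
    <bp:ORGANISM rdf:resource="#Homo_sapiens" />
    <bp:NAME>DNA Replication</bp:NAME>
  </bp:pathway>
</rdf:RDF>"""


def pathway_components(xml_text):
    """Map each pathway's rdf:ID to the rdf:resource targets of its components."""
    root = ET.fromstring(xml_text)
    result = {}
    for pathway in root.iter(f"{{{BP_NS}}}pathway"):
        pathway_id = pathway.get(f"{{{RDF_NS}}}ID")
        result[pathway_id] = [
            # Drop the leading "#" of the local RDF reference.
            comp.get(f"{{{RDF_NS}}}resource").lstrip("#")
            for comp in pathway.iter(f"{{{BP_NS}}}PATHWAY-COMPONENTS")
        ]
    return result


for pathway_id, steps in pathway_components(BIOPAX_SNIPPET).items():
    print(pathway_id, steps)
```

A dedicated RDF or BioPAX library would be the more robust choice for real exports; plain XML parsing is used here only to keep the example self-contained.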

Page 10: Representing Biological Processes: The Reactome Database

Bioinformatics Access

• BioMart API
• MySQL/Perl API
• MySQL/Java API
• SOAP/WSDL Interface (multiple languages)
• Flat files
• Database dumps
• Local site install (instructions going into CPBI)

Page 11: Representing Biological Processes: The Reactome Database

Inference Statistics

direct curation underway

Page 12: Representing Biological Processes: The Reactome Database

Validation of inference

• Comparison of manually curated yeast reactions from YBP with inferred reactions from human Reactome

• Sensitivity: 72%
• Specificity: 78%
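The slide does not say how the two figures were computed. Assuming the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), a minimal sketch looks like this. The counts are hypothetical, chosen only so that the results match the reported percentages; they are not the actual numbers from the yeast comparison.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of curated (true) reactions that the inference recovers."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of true negatives that the inference correctly leaves out."""
    return true_neg / (true_neg + false_pos)


# Hypothetical confusion-matrix counts, for illustration only.
counts = {"true_pos": 72, "false_neg": 28, "true_neg": 78, "false_pos": 22}

sens = sensitivity(counts["true_pos"], counts["false_neg"])
spec = specificity(counts["true_neg"], counts["false_pos"])
print(f"Sensitivity: {sens:.0%}")
print(f"Specificity: {spec:.0%}")
```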

Page 13: Representing Biological Processes: The Reactome Database

Inferring chicken reactions from curated human ones

Page 14: Representing Biological Processes: The Reactome Database
Page 15: Representing Biological Processes: The Reactome Database

Gaps in Reactome

Gopal Gopinathrao, PhD
Reactome, CSHL

Page 16: Representing Biological Processes: The Reactome Database

1) Gaps in Reactome annotation

2) Gaps in annotate-able information

3) What can a network / pathway ontology do to fill this gap?

Page 17: Representing Biological Processes: The Reactome Database

Information

Cellular

Pathogens

Page 18: Representing Biological Processes: The Reactome Database

Pathogens

Information

Metabolism

Page 19: Representing Biological Processes: The Reactome Database

Signaling Signaling

Information

Page 20: Representing Biological Processes: The Reactome Database

Domains of Biology waiting to be Reactomized

Protozoan/Host interactions

Developmental pathways
Transcriptional regulation
Feedback loops

Neuroscience topics
Degenerative diseases
Synaptic processes

Cancer processes

OMIM-functional (biochemical)

Complex diseases

Cellular differentiation, Regulation

Page 21: Representing Biological Processes: The Reactome Database

Unique human proteins used in pathways (in March 2008): 2500

Swiss-Prot section of UniProt: ~16,000

[Chart: protein counts by domain. Domains: Metabolism; Cellular housekeeping; Information; Signaling; Pathogens/Host interactions. Values: 476, 755, 376, 414, 600.]

Page 22: Representing Biological Processes: The Reactome Database

Gaps in Reactome annotation

[Plot: total proteins (y-axis, 0 to 6000) by release (x-axis, 10 to 40). Series: unique proteins; unique + isoforms.]

Page 23: Representing Biological Processes: The Reactome Database
Page 24: Representing Biological Processes: The Reactome Database

Some pathway/interaction DBs are more equal?

Page 25: Representing Biological Processes: The Reactome Database

Are all Swiss-Prot proteins annotatable for pathways/interactions?

Can all interactions be placed in a biologically relevant ‘pathway’, or even in sub-graphs of a network?

If yes, who is going to validate the biological ‘truth’ of any sub-graphs derived from a network, and how?

[Terms of biological truth - tissue, regulation, developmental stage, expression …]

Mind what gets filled in…

Page 26: Representing Biological Processes: The Reactome Database

Watching the gap…

Adding in pathway data decomposed to interactions …

Data Source    Protein (SwissProt)   Coverage (SwissProt)   Interaction   Citation
Reactome       1229 (1194)           5% (8%)                21394         Vastrik et al., 2007
Panther        2997 (1670)           12% (12%)              75694         Mi et al., 2007
CellMap        567 (567)             2% (4%)                1195          cancer.cellmap.org
INOH           719 (711)             3% (5%)                11759         Kushida et al., 2006
NCI-Nature     593 (592)             2% (4%)                2900          pid.nci.nih.gov
NCI-BioCarta   936 (936)             4% (6%)                4752          pid.nci.nih.gov
KEGG           2033 (1947)           8% (13%)               11144         Kanehisa et al., 2004
Total          5283 (3847)           21% (27%)              118867

Adding PPI data to the above …

Source Type    Protein (SwissProt)   Coverage (SwissProt)   Interaction
Pathways       5283 (3847)           21% (27%)              118867
PPIs           10674 (6298)          42% (44%)              43797
Total          13318 (7590)          53% (53%)              162664
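The Pathways, PPIs and Total rows above allow a quick estimate of how much the two source types overlap. The sketch below assumes that each Total figure counts distinct items (the union of the two sources), which the slide does not state explicitly. Under that assumption, inclusion-exclusion gives the number of shared items.

```python
def overlap(count_a: int, count_b: int, count_union: int) -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return count_a + count_b - count_union


# Figures copied from the Pathways, PPIs and Total rows of the table.
shared_proteins = overlap(5283, 10674, 13318)
shared_swissprot = overlap(3847, 6298, 7590)
shared_interactions = overlap(118867, 43797, 162664)

print("Proteins in both pathway and PPI sources:", shared_proteins)
print("Swiss-Prot proteins in both sources:", shared_swissprot)
print("Interactions in both sources:", shared_interactions)
```

On these figures the interaction totals add up exactly (118867 + 43797 = 162664), so no interaction is counted in both source types, while 2639 proteins (2555 of them in Swiss-Prot) appear in both.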

Page 27: Representing Biological Processes: The Reactome Database

NBC Predictions in Reactome

Page 28: Representing Biological Processes: The Reactome Database

How a network / pathway ontology may help to fill the gap in pathway annotations…

Page 29: Representing Biological Processes: The Reactome Database

A<----->B

C<----->B

A<----->D

C<----->D

C<----->A

A<-----| B

Known

Novel

New regulatory event

A+B+C+D

Feedback loop

ABCD complex

Page 30: Representing Biological Processes: The Reactome Database

ABCD complex

1. A+B+C+D

2. Interaction of C and D may regulate ABCD complex formation

Updated model for curation would be:

C<----->D

A<----->B

Novel

Feedback loop

3. Post-translational inhibition of B by A may result in downregulation of A, thereby affecting the stability of complex ABCD

Evidence from a network ontology

A<-----| B

New regulatory event

Evidence from a network ontology in a model organism
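The comparison sketched on these two slides, known interactions versus those suggested by a network, can be expressed as a set difference. In the sketch below, the choice of which pairs count as already curated is an assumption made for illustration, since the slide layout does not make the known/novel split unambiguous. The helper names `undirected` and `novel_interactions` are likewise invented for this example.

```python
from typing import FrozenSet, Set

Interaction = FrozenSet[str]


def undirected(a: str, b: str) -> Interaction:
    """Represent A<----->B so that (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


def novel_interactions(network: Set[Interaction],
                       curated: Set[Interaction]) -> Set[Interaction]:
    """Interactions present in the network but absent from curated pathways."""
    return network - curated


# Pairwise interactions listed on the slide.
network = {
    undirected(a, b)
    for a, b in [("A", "B"), ("C", "B"), ("A", "D"), ("C", "D"), ("C", "A")]
}

# Assumed for illustration: the pairs treated as already curated ("Known").
curated = {undirected(a, b) for a, b in [("C", "B"), ("A", "D"), ("C", "A")]}

novel = novel_interactions(network, curated)
print(sorted("-".join(sorted(pair)) for pair in novel))
```

Candidates flagged this way would still need review by a curator before being added to the model of ABCD complex formation.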

Page 31: Representing Biological Processes: The Reactome Database

The Team

• CSHL
  – Lincoln Stein (PI)
  – Gopal Gopinathrao (managing editor)
  – Marc Gillespie, Lisa Matthews, Bruce May, Mike Caudy (curators)
  – Guanming Wu, Alex Kanapin (developers)

• EBI
  – Ewan Birney (co-PI)
  – Esther Schmidt, Imre Vastrik, David Croft (developers)
  – Bernard de Bono, Bijay Jassal, Phani Garapati (curators)

• NYU
  – Peter D’Eustachio (co-PI; editor-in-chief)
  – Shahana Mahajan (curator)

P41 HG003751