Goal and Status of the OBO Foundry Barry Smith


How to create broad-coverage semantic annotation systems for biomedicine?

Semantic Web, Moby, wikis, crowd sourcing, NLP, etc.

let a million flowers (and weeds) bloom

to create integration, rely on (automatically generated?) post hoc mappings

The result is noisy

Perhaps even deadly


for science

develop high quality annotation resources in a collaborative, community effort

creating an evolutionary path towards improvement of terminologies of the sort we find elsewhere in science

Foundry alternative: prospective standardization

What makes GO so wildly successful?

The methodology of annotations

science basis of the GO: trained experts curating peer-reviewed literature

different model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a coordinated way

A set of standardized textual descriptions of

• cellular locations
• molecular functions
• biological processes

used to annotate the entities represented in the major biochemical databases

thereby creating integration across these databases and making them available to semantic search
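A minimal sketch of why annotating with shared terms integrates separate databases: once records carry the same term, one lookup reaches all of them. The database names, gene product identifiers and term assignments are illustrative placeholders, not curated GO annotations.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, gene product, term label).
# All values are placeholders for illustration, not curated data.
ANNOTATIONS = [
    ("db_a", "gene_product_1", "nucleus"),
    ("db_a", "gene_product_2", "kinase activity"),
    ("db_b", "gene_product_3", "nucleus"),
    ("db_b", "gene_product_4", "cell division"),
]


def build_index(annotations):
    """Group (database, gene product) pairs by the term they share."""
    index = defaultdict(set)
    for database, gene_product, term in annotations:
        index[term].add((database, gene_product))
    return index


def search(index, term):
    """Return every (database, gene product) pair annotated with the term."""
    return sorted(index.get(term, set()))


index = build_index(ANNOTATIONS)
hits = search(index, "nucleus")
```

Here `hits` spans both hypothetical databases without any mapping between their native vocabularies, because both were annotated with the same term.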


and also

need to extend the GO by engaging ever broader community support for the addition of new terms and for the correction of errors

need to extend the methodology to other domains, including clinical domains

this requires that we establish common rules governing best practices for creating ontologies and for using these in annotations

apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies


shared portal + low regimentation

http://obo.sourceforge.net NCBO BioPortal

2003

The OBO Foundry http://obofoundry.org/

2006

A prospective standard designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)

established March 2006

12 initial candidate OBO ontologies – focused primarily on basic science domains

several being constructed ab initio

by influential consortia who have the authority to impose their use on large parts of the relevant communities.

Building out from the original GO

Rows: GRANULARITY. Columns: RELATION TO TIME (CONTINUANT, divided into INDEPENDENT and DEPENDENT; OCCURRENT).

ORGAN AND ORGANISM
  INDEPENDENT CONTINUANT: Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO)
  DEPENDENT CONTINUANT: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  OCCURRENT: Biological Process (GO)

CELL AND CELLULAR COMPONENT
  INDEPENDENT CONTINUANT: Cell (CL); Cellular Component (FMA, GO)
  DEPENDENT CONTINUANT: Cellular Function (GO)

MOLECULE
  INDEPENDENT CONTINUANT: Molecule (ChEBI, SO, RnaO, PrO)
  DEPENDENT CONTINUANT: Molecular Function (GO)
  OCCURRENT: Molecular Process (GO)


OBO Foundry = a subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development designed to ensure

tight connection to the biomedical basic sciences

compatibility

interoperability, common relations

formal robustness

support for logic-based reasoning


CRITERIA

The ontology is OPEN and available to be used by all.

The ontology is in, or can be instantiated in, a COMMON FORMAL LANGUAGE.

The developers of the ontology agree in advance to COLLABORATE with developers of other OBO Foundry ontologies where domains overlap.


UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.

ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.


for science

if we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts

AND WITHOUT THE NEED FOR MAPPINGS

orthogonality of ontologies implies additivity of annotations
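The additivity claim can be illustrated with a small sketch, assuming annotations are stored as sets of term IDs per record and that distinct identifier prefixes serve as a crude stand-in for orthogonality. The ontology prefixes (`ONT_A`, `ONT_B`), record names and the helper `add_annotations` are hypothetical.

```python
def prefix(term_id):
    """Return the identifier space of a term ID such as 'ONT_A:0000001'."""
    return term_id.split(":", 1)[0]


def add_annotations(existing, new):
    """Merge two annotation sets record by record using plain set union.

    Raises ValueError if both sets draw on the same identifier space, which
    this sketch treats as a sign that the ontologies are not orthogonal.
    """
    spaces_existing = {prefix(t) for terms in existing.values() for t in terms}
    spaces_new = {prefix(t) for terms in new.values() for t in terms}
    overlap = spaces_existing & spaces_new
    if overlap:
        raise ValueError(f"identifier spaces overlap: {sorted(overlap)}")
    merged = {record: set(terms) for record, terms in existing.items()}
    for record, terms in new.items():
        merged.setdefault(record, set()).update(terms)
    return merged


# Hypothetical records annotated with two hypothetical ontologies.
first = {"record_1": {"ONT_A:0000001"}, "record_2": {"ONT_A:0000002"}}
second = {"record_1": {"ONT_B:0000007"}}
combined = add_annotations(first, second)
```

No mapping table appears anywhere in the merge: the second set of annotations is simply added to the first.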


CRITERIA

IDENTIFIERS: The ontology possesses a unique identifier space within OBO.

VERSIONING: The ontology provider has procedures for identifying distinct successive versions to ensure BACKWARDS COMPATIBILITY with annotation resources already in common use.

The ontology includes TEXTUAL DEFINITIONS and where possible equivalent formal definitions of its terms.
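Two of these criteria lend themselves to mechanical checks. The sketch below assumes a simple in-memory registry; the ontology names, prefixes, term records and both helper functions are hypothetical and do not reflect any actual OBO Foundry tooling.

```python
from collections import Counter

# Hypothetical registry: ontology name -> identifier prefix it claims.
REGISTRY = {"ontology_a": "ONTA", "ontology_b": "ONTB", "ontology_c": "ONTA"}

# Hypothetical term records keyed by term ID.
TERMS = {
    "ONTB:0000001": {"label": "example term", "definition": "A placeholder."},
    "ONTB:0000002": {"label": "undefined term", "definition": ""},
    "ONTB:0000003": {"label": "another undefined term"},
}


def duplicated_id_spaces(registry):
    """Return prefixes claimed by more than one ontology (IDENTIFIERS)."""
    counts = Counter(registry.values())
    return sorted(p for p, n in counts.items() if n > 1)


def terms_missing_definitions(terms):
    """Return IDs of terms with no non-empty textual definition."""
    return sorted(
        term_id
        for term_id, record in terms.items()
        if not record.get("definition", "").strip()
    )


clashes = duplicated_id_spaces(REGISTRY)
undefined = terms_missing_definitions(TERMS)
```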


CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.

DOCUMENTATION: The ontology is well-documented.

USERS: The ontology has a plurality of independent users.


COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*

* Smith et al., Genome Biology 2005, 6:R46
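One practical payoff of unambiguously defined relations is that software can reason over them. The sketch below treats `part_of` as transitive, as the Relation Ontology does, and walks the edges to collect everything a term is part of. The three-term graph and the function name are illustrative assumptions, not content taken from any released ontology file.

```python
def ancestors(term, edges, relation):
    """Collect all terms reachable from `term` by following `relation`
    edges transitively (iterative depth-first traversal)."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for rel, parent in edges.get(current, []):
            if rel == relation and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Illustrative edges: child -> list of (relation, parent).
EDGES = {
    "nucleolus": [("part_of", "nucleus")],
    "nucleus": [("part_of", "cell")],
}

wholes = ancestors("nucleolus", EDGES, "part_of")
```

Because every ontology in the suite defines `part_of` the same way, the same traversal can be reused across ontologies without per-ontology special cases.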


Extension Strategy – Downward Population

top level:
Basic Formal Ontology (BFO)

mid-level:
Information Artifact Ontology (IAO)
Ontology for Biomedical Investigations (OBI)
Spatial Ontology (BSPO)

domain level:
Anatomy Ontology (FMA*, CARO)
Environment Ontology (EnvO)
Disease, Disorder and Treatment (OGMS)
Biological Process Ontology (GO*)
Cell Ontology (CL)
Cellular Component Ontology (FMA*, GO*)
Phenotypic Quality Ontology (PaTO)
CHEBI
Sequence Ontology (SO*)
Molecular Function (GO*)
Protein Ontology (PRO*)

Downward Population + Hub-Spokes Strategy

OGMS

Cardiovascular Disease Ontology
Genetic Disease Ontology
Cancer Disease Ontology
Immune Disease Ontology
Environmental Disease Ontology
Oral Disease Ontology
Infectious Disease Ontology
…

BFO, OGMS, and IDO

BFO:
• Material Entity
• Disposition
• Process

OGMS:
• Disorder
• Disease
• Disease Course

IDO:
• Infection
• Infectious Disease
• Infectious Disease Course
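The three groups of terms above can be read as successive specializations, which is the idea behind downward population. The sketch below assumes that each term specializes the term in the same position one level up (IDO under OGMS under BFO); the actual axioms in OGMS and IDO may be more nuanced, and the `BROADER` table and `lineage` helper are hypothetical.

```python
# Assumed reading of the slide: term -> the broader term it specializes.
# This mirrors the slide layout only and is not copied from the ontologies.
BROADER = {
    "Disorder": "Material Entity",
    "Disease": "Disposition",
    "Disease Course": "Process",
    "Infection": "Disorder",
    "Infectious Disease": "Disease",
    "Infectious Disease Course": "Disease Course",
}


def lineage(term):
    """Return the chain from a term up to its top-level category."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


infectious_disease_lineage = lineage("Infectious Disease")
```

Under this reading, a new disease ontology is added by attaching its terms beneath existing OGMS terms rather than by inventing its own upper level.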

OGMS

Infectious Disease Ontology (IDO)

IDO Staph Aureus
IDO MRSA
IDO Australian MRSA
IDO Australian Hospital MRSA
…

How IDO evolves

CORE and SPOKES: Domain ontologies

SEMI-LATTICE: By subject matter experts in different communities of interest.

IDOCore
IDOSa
IDOHumanSa
IDORatSa
IDOStrep
IDORatStrep
IDOHumanStrep
IDOMRSA
IDOHumanBacterial
IDOAntibioticResistant
IDOMAL
IDOHIV
IDOFLU

Status

Successes

New ontologies being added to the OBO library

Advance in cross-product methodology


Population-level ontologies

Rows: GRANULARITY. Columns: RELATION TO TIME (CONTINUANT, divided into INDEPENDENT and DEPENDENT; OCCURRENT).

COMPLEX OF ORGANISMS
  INDEPENDENT CONTINUANT: Family, Community, Deme, Population
  DEPENDENT CONTINUANT: Population Phenotype
  OCCURRENT: Population Process

ORGAN AND ORGANISM
  INDEPENDENT CONTINUANT: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  DEPENDENT CONTINUANT: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  OCCURRENT: Biological Process (GO)

CELL AND CELLULAR COMPONENT
  INDEPENDENT CONTINUANT: Cell (CL); Cellular Component (FMA, GO)
  DEPENDENT CONTINUANT: Cellular Function (GO)

MOLECULE
  INDEPENDENT CONTINUANT: Molecule (CHEBI, SO, RNAO, PRO)
  DEPENDENT CONTINUANT: Molecular Function (GO)
  OCCURRENT: Molecular Process (GO)

Environment (EnvO, EO)

Successes

The OBO Foundry strategy for ontology collaboration and reuse is being replicated in major grant-funded projects

OBO Foundry approach extended into other domains


FUNDED:

NIF Standard: Neuroscience Information Framework
ISF Ontologies: Integrated Semantic Framework for Clinical and Translational Science
ImmPort: Immunology Database and Analysis Portal
OGMS and Extensions: Ontology for General Medical Science
IDO Consortium: Infectious Disease Ontology
cROP: Common Reference Ontologies for Plants

Successes

Huge and continuing expansion in the awareness of the need for re-using ontologies

Huge and continuing expansion in ontology software created to support Foundry efforts (Ontobee, MIREOT, …)

Immunology Database and Analysis Portal (ImmPort)

Current status

Coordinating editors:
Michael Ashburner
Chris Mungall
Suzanna Lewis
Alan Ruttenberg
Richard Scheuermann
Barry Smith

New operations committee
• https://code.google.com/p/obo-foundry-operations-committee/wiki/OutreachWG

Mathias Brochhausen
Melanie Courtot
Melissa Haendel
Janna Hastings
Chris Mungall
Alan Ruttenberg
Ramona Walls

Ontologies admitted to full membership after first phase of reviews

• CHEBI: Chemical Entities of Biological Interest
• GO: Gene Ontology
• PATO: Phenotypic Quality Ontology
• PRO: Protein Ontology
• XAO: Xenopus Anatomy Ontology
• ZFA: Zebrafish Anatomy Ontology

Current status

Next round of candidates for review

OGMS: Ontology for General Medical Science
OBI: Ontology for Biomedical Investigations
CL: Cell Ontology
IDO: Infectious Disease Ontology

Ontology for General Medical Science

Jobst Landgrebe (former Co-Chair of the HL7 Vocabulary Group, now Head of Datamining at Allianz Healthcare):

• “the best ontology effort in the whole biomedical domain by far”
