Linking Multiple Ontologies: The OBO Foundry Approach Chris Mungall NIAID Cell Ontology Workshop May 2008


Page 1

Linking Multiple Ontologies: The OBO Foundry Approach

Chris Mungall

NIAID Cell Ontology Workshop, May 2008

Page 2

Outline

• Introduction to ontologies
  – The OBO perspective
  – Case study in the Gene Ontology
• The OBO Foundry: goals and principles
• The OBO relation ontology
• Organization of ontologies in OBO
• Modularity
  – An example from CL
• Linking CL to the OBO Foundry

Page 3

What is an ontology?

• A computable representation of some domain
  – What kinds of things exist?
  – What are the relations that hold between them?

[Figure: example graph with nodes Mitral valve, Aortic valve, Heart, Cavitated organ and Cardiovascular System, joined by part_of and is_a edges]

Page 4

Aspects of an ontology

• Identifiers
  – Uniquely identify a class / term
    • E.g. CL:0000037 is the ID for the term “hematopoietic stem cell”
  – Identifier metadata
• Terminological aspects
  – Names and synonyms/alternate labels
    • CL:0000037 has “hemopoietic progenitor cell” as a related synonym and “hemopoietic stem cell” as an exact synonym
• Logical aspects
  – Relations
  – Definitions
  – Provenance

Page 5

Some ontologies and their uses

• The Gene Ontology
  – Annotation of gene products
  – Analyzing high-throughput datasets
• Anatomical ontologies (including CL)
  – Experimental metadata
  – Image annotation
  – Indicating location of gene expression
  – Creating phenotypic descriptions
• Others
  – NLP
  – Annotating information models
  – Database integration

Page 6

Origins of OBO: The Gene Ontology (GO)

• 3 ontologies for annotating genes and gene products
• These ontologies are organised as a collection of related terms, constituting nodes in a graph
  – Gradually incorporating other logical axioms

Ontology              # terms   # links
Molecular function       7889      9225
Biological process      13978     25065
Cellular component       2034      3894

Page 7

Annotation and GO

• GO Annotations:
  – Associations between genes and GO terms, with evidence
  – Met17 : “methionine metabolism” GO:0006555
• 222,000 genes and gene products have high quality annotations to GO terms
  – 3.4m including automated predictions
  – 66,000 publications curated
• Variety of analysis tools
  – http://www.geneontology.org/GO.tools.shtml#micro

Page 8

GO and high-throughput biology: Over-representation of GO terms for gene sets

[Figure: GO::TermFinder, Sherlock et al]

Page 9

GO and the need for OBO

• GO terms implicitly reference kinds of entities outwith the scope of GO
  – Methionine biosynthesis
  – Neural crest cell migration
  – Cardiac muscle morphogenesis
  – Regulation of vascular permeability
• OBO was born from the need to create source ontologies for GO term ‘cross-products’
  – Define composite classes in terms of simpler ones

[Figure labels: chemical, cell, anatomy, quality]

Page 10

The Open Biomedical Ontologies (OBO) Foundry

• A collection of orthogonal reference ontologies in the biological/biomedical domain

• The OBO Foundry: Each is committed to an agreed upon set of principles governing best practices in ontology development

Page 11

Some OBO ontologies

• Gene Ontology
• ChEBI - chemical entities
• OBI - investigations
• PATO, MP - phenotypes
• CL - cells
• ENVO - environment and habitat
• DO - Human diseases
• CARO - common anatomy
• FMA - human anatomy
• SO - sequence features
• Model organism anatomy
  – ZFA
  – Fly_anat
  – Dicty_anat
  – Mouse_anat
  – …
• OBO Relation Ontology

Page 12

OBO Foundry: criteria, v1

• Open
• Well-defined exchange format
  – E.g. OBO or OWL
• Uses identifiers according to OBO ID policy
• Ontology life-cycle / versioning
• Has clearly specified and delineated content
• Has unambiguous definitions
• Uses or extends relations in the OBO Relation Ontology
• Well documented
• Has a plurality of users (and a mail list & issue tracker)
• Developed collaboratively
• Orthogonal, modular

http://obofoundry.org/

Page 13

OBO Relation Ontology

• Edges can link nodes…
  – Within ontologies
  – Across ontologies
• The precise meaning of the relation is important
  – Relations have formal definitions
  – Rules for composing relations together
  – http://obofoundry.org/ro/

Page 14

Is_a

• X is_a Y
  – If something is an instance of X (at time t), then it is also an instance of Y (at t)
• Transitive
  – B1 B cell is_a B cell
  – B cell is_a lymphocyte
  – Therefore B1 B cell is_a lymphocyte
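The transitivity of is_a can be illustrated with a small sketch. The dictionary layout and the function name `ancestors` are assumptions made for this example and do not come from any OBO tool; the three terms are the ones used on the slide.

```python
# Asserted is_a links from the slide: child -> set of direct parents.
IS_A = {
    "B1 B cell": {"B cell"},
    "B cell": {"lymphocyte"},
}

def ancestors(term, is_a=IS_A):
    """Return every class reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen

print(sorted(ancestors("B1 B cell")))  # ['B cell', 'lymphocyte']
```

Only two links are asserted, yet "lymphocyte" is returned for "B1 B cell": the third link is implied by transitivity rather than stored.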

Page 15

Part_of

• Instance level part_of relation is primitive
• Between classes:
  – X part_of Y :
    • Every instance of X is part_of some instance of Y
    • Paneth cell part_of intestine : YES
    • Nucleus part_of Cell : YES
    • Neuron part_of brain : NO
      – (there are some neurons that are part of other parts of the nervous system)
• Transitive
  – X part_of Y, Y part_of Z
    • Therefore, X part_of Z

Page 16

Has_part

• Instance level inverse of part_of
• X has_part Y
  – Every X has some Y as part
  – Cell has_part nucleus : NO
  – Nucleate erythrocyte has_part nucleus : YES
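The "every X ... some Y" reading explains why part_of and has_part are inverses between instances but not between classes. The sketch below checks both readings against a tiny, invented set of instances; the instance names, the data layout and both function names are assumptions for illustration. A finite sample like this can refute a class-level claim but cannot prove it in general.

```python
# Hypothetical instance data: instance -> class it instantiates.
INSTANCE_CLASS = {
    "nucleus1": "nucleus",
    "cell1": "cell",
    "cell2": "cell",  # a cell without a nucleus
}
# Instance-level part_of pairs: (part, whole).
PART_OF = {("nucleus1", "cell1")}

def instances_of(cls):
    return [i for i, c in INSTANCE_CLASS.items() if c == cls]

def class_part_of(x, y):
    """X part_of Y: every instance of X is part_of some instance of Y."""
    return all(
        any((xi, yi) in PART_OF for yi in instances_of(y))
        for xi in instances_of(x)
    )

def class_has_part(x, y):
    """X has_part Y: every instance of X has some instance of Y as part."""
    return all(
        any((yi, xi) in PART_OF for yi in instances_of(y))
        for xi in instances_of(x)
    )

print(class_part_of("nucleus", "cell"))   # True
print(class_has_part("cell", "nucleus"))  # False
```

The single anucleate cell is enough to make "Cell has_part nucleus" fail while "Nucleus part_of Cell" still holds.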

Page 17

Develops_from

• X develops_from Y
  – Every instance of X was once a Y, or inherited a significant portion of its matter from a Y
    • Example: erythrocyte develops_from reticulocyte
• Transitive
  – erythrocyte develops_from reticulocyte
  – reticulocyte develops_from orthochromatic erythroblast
    • => erythrocyte develops_from orthochromatic erythroblast

Page 18

Transformation and derivation

• Develops_from relation can be refined into two cases:
  – Transformation_of
    • X transformation_of Y :
      – Any instance of X was previously an instance of Y
      – Example: erythrocyte transformation_of reticulocyte
  – Derives_from
    • X derives_from Y :
      – Holds between distinct instances where X inherits matter from Y
• Most OBO ontologies just use the develops_from relation

Page 19

Other relations

• Inherence
  – Between a quality and an object
  – E.g. between a specific shape and a cell
• Participation
  – Between a process and an object
  – E.g. between a B cell and an immune process

Page 20

Definitions state necessary and sufficient conditions

• Links in the ontology graph state necessary conditions for a class
  – E.g. erythroid progenitor cell develops_from megakaryocyte erythroid progenitor
  – These characteristics may not be unique
• A definition should state necessary and sufficient conditions for a class
  – The characteristics must be unique to the defined class
    • E.g. “progenitor cell that is committed to the erythroid lineage”
• Definition should be precise and (as far as possible) translated / translatable to logical computable form

Page 21

Genus differentia definitions

• Of the form
  – An X is a G that D
  – G should be in the same ontology
  – D is the set of discriminating characteristics that differentiate (in the classification sense) Xs from other Gs
    • Relations to terms in an ontology (the same ontology or a different one)
• Example:
  – A B cell is a lymphocyte that expresses an immunoglobulin complex
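One way to hold a genus-differentia definition in computable form is as a small record with a genus and a (relation, filler) pair. The class name `GenusDifferentia` and its field names are assumptions for this sketch, not an OBO data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GenusDifferentia:
    term: str      # X, the class being defined
    genus: str     # G, the more general class
    relation: str  # relation used in the differentia D
    filler: str    # class the relation points to (may be in another ontology)

    def text(self):
        """Render the definition in the 'X = G that D' pattern."""
        return f"{self.term} = {self.genus} that {self.relation} {self.filler}"

b_cell = GenusDifferentia("B cell", "lymphocyte", "expresses", "immunoglobulin complex")
print(b_cell.text())  # B cell = lymphocyte that expresses immunoglobulin complex
```

Keeping the genus and the differentia in separate fields is what later allows a program to compare definitions instead of parsing free text.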

Page 22

Orthogonality of ontologies

• No two ontologies should represent the same kind of entity
  – E.g. “B-cell” should only be represented in one ontology
  – Related entities should be coordinated across ontologies
    • GO: “B-cell differentiation”
• Exceptions:
  – The term “cell” connects GO Cellular Component (cell parts) and CL (cells)
• Advantages:
  – Reduces redundancy and work
  – Easier to make the union consistent

Page 23

Some OBO terms..

[Figure: unsorted cloud of terms: oenocyte, hepatocyte, liver, fat body, glycogen, glucose, hepatic artery, bile, insulin, obesity, carbohydrate metabolism, liver development, increased circulating glucose level, oenocyte differentiation, hepatoma]

Page 24

[Figure: the term cloud from Page 23 (oenocyte, hepatocyte, liver, fat body, glycogen, glucose, hepatic artery, bile, insulin, obesity, carbohydrate metabolism, liver development, increased circulating glucose level, oenocyte differentiation, hepatoma), labelled with source ontologies: CHEBI, FBbt (fly), CL, PRO, MA (mouse), FMA (adult human), MP (mammal phenotype), GO (biological process), DO]

Page 25

[Figure: build step repeating the labelled term cloud from Page 24]

Page 26

[Figure: the labelled term cloud from Page 24]

How should we organize this?

Page 27

Top-level organisation (BFO: Basic Formal Ontology)

• General categories
  – 3D things (continuants)
    • Independent
      – Cells, organs, molecules
    • Dependent
      – Shapes, sizes, concentrations, …
  – 4D things (processes)
    • Processes
• Useful organisational principle for OBO
• is_a and part_of should not cross top level categories
• Levels of granularity (scale)
  – Population
  – Organism
  – Organ
  – Cell
  – Molecule
• part_of relations can cross levels
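The rule that is_a and part_of should not cross top-level categories is easy to check mechanically once each term carries a category. The sketch below assumes a hand-made category table and an invented list of links, one of which is deliberately wrong; the function name `crossing_links` is also an assumption.

```python
# Hypothetical assignment of terms to top-level categories.
CATEGORY = {
    "hepatocyte": "independent continuant",
    "cell": "independent continuant",
    "liver": "independent continuant",
    "liver development": "process",
}

def crossing_links(links, category=CATEGORY):
    """Return is_a / part_of links whose two ends lie in different categories."""
    return [
        (child, rel, parent)
        for child, rel, parent in links
        if rel in {"is_a", "part_of"} and category[child] != category[parent]
    ]

links = [
    ("hepatocyte", "is_a", "cell"),
    ("hepatocyte", "part_of", "liver"),
    ("liver", "part_of", "liver development"),  # deliberately wrong link
]
print(crossing_links(links))  # [('liver', 'part_of', 'liver development')]
```

The hepatocyte links connect a cell to an organ, so they cross a level of granularity, which the slide allows, but they stay within one top-level category.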

Page 28

[Figure: the labelled term cloud from Page 24, grouped under three headings: Objects, Qualities etc, Processes]

Page 29

Table: relation to time (CONTINUANT, split into INDEPENDENT and DEPENDENT, versus OCCURRENT) by GRANULARITY

• ORGAN AND ORGANISM
  – Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  – Dependent continuant: Organ Function (FMP, CPRO)
  – Occurrent: Organism-Level Process (GO)
• CELL AND CELLULAR COMPONENT
  – Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  – Dependent continuant: Cellular Function (GO)
  – Occurrent: Cellular Process (GO)
• MOLECULE
  – Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  – Dependent continuant: Molecular Function (GO)
  – Occurrent: Molecular Process (GO)
• Phenotypic Quality (PaTO): dependent continuant

Page 30

The OBO Foundry can help with modular ontology design

• Biology is complex
  – So our ontologies will be complex
  – Multiple purposes
  – Multiple means of classifying
• Separate out different aspects
  – Modular approach
  – Avoid multiple inheritance (>1 is_a parent)
    • Don’t over-use is_a
    • Don’t cross aspects with is_a
• Make complex descriptions from simpler parts
  – Polyhierarchies arise from composition

Page 31

Tangled polyhierarchy

[Figure: Cysteine biosynthesis (trimmed), GO]

Page 32

Process axis

[Figure: Cysteine biosynthesis (trimmed)]

Page 33

Chemical structure axis

[Figure: Cysteine biosynthesis (trimmed)]

Page 34

[Figures: Cysteine biosynthesis (trimmed); ChEBI (trimmed)]

Page 35

[Figures: Cysteine biosynthesis (trimmed); ChEBI (trimmed)]

Page 36

[Figures: Cysteine biosynthesis (trimmed); ChEBI (trimmed)]

Page 37

[Figures: Cysteine biosynthesis (trimmed); ChEBI (trimmed)]

We can do more than simply link terms:

Cross-products (aka logical definitions, computable genus-differentia definitions)

Page 38

[Figures: Cysteine biosynthesis (trimmed); ChEBI (trimmed)]

Cysteine biosynthesis GO:0019344
=
a biosynthetic process GO:0009058 (genus)
that results_in_creation_of cysteine CHEBI:13536 (differentia)
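A cross-product like the one above could be stored as a term stanza and read back into its genus and differentia. The sketch assumes the definition is written with OBO-format `intersection_of` tags; the stanza text and the function name `parse_cross_product` are written for this example, with the identifiers copied from the slide.

```python
# Illustrative stanza; IDs are the ones shown on the slide.
STANZA = """\
[Term]
id: GO:0019344
name: cysteine biosynthesis
intersection_of: GO:0009058 ! biosynthetic process
intersection_of: results_in_creation_of CHEBI:13536 ! cysteine
"""

def parse_cross_product(stanza):
    """Extract (id, genus, [(relation, filler), ...]) from one [Term] stanza."""
    term_id, genus, differentiae = None, None, []
    for line in stanza.splitlines():
        line = line.split("!")[0].strip()  # drop the trailing "! label" comment
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            parts = line[len("intersection_of:"):].split()
            if len(parts) == 1:        # bare ID: the genus
                genus = parts[0]
            elif len(parts) == 2:      # relation + ID: a differentia
                differentiae.append((parts[0], parts[1]))
    return term_id, genus, differentiae

print(parse_cross_product(STANZA))
```

The genus points into GO and the differentia points into ChEBI, so a single definition ties the two ontologies together.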

Page 39

Page 40

Cysteine biosynthetic process = biosynthetic process that results_in_change_to cysteine

[Figure: graph with a results_in_change_to edge]

Page 41

Let the computer do the work..

Given cross-products, a reasoner can add all links

Underlying representation is normalized
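The idea that a reasoner can add links from cross-products can be sketched in a few lines. This is a deliberately naive subsumption check, not how a real reasoner works: it assumes every definition has exactly one genus and one differentia, and the second definition and the "cysteine is_a amino acid" link are illustrative assumptions.

```python
# Asserted is_a links in the source ontologies (child -> parents).
IS_A = {"cysteine": {"amino acid"}}

# Cross-product definitions: term -> (genus, relation, filler).
DEFS = {
    "cysteine biosynthesis": ("biosynthetic process", "results_in_creation_of", "cysteine"),
    "amino acid biosynthesis": ("biosynthetic process", "results_in_creation_of", "amino acid"),
}

def is_subclass(a, b):
    """True if a equals b or reaches b through asserted is_a links."""
    stack, seen = [a], set()
    while stack:
        cur = stack.pop()
        if cur == b:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(IS_A.get(cur, ()))
    return False

def inferred_is_a(defs):
    """Infer X is_a Y when X's genus and filler are subsumed by Y's."""
    links = []
    for x, (gx, rx, fx) in defs.items():
        for y, (gy, ry, fy) in defs.items():
            if x != y and rx == ry and is_subclass(gx, gy) and is_subclass(fx, fy):
                links.append((x, y))
    return links

print(inferred_is_a(DEFS))  # [('cysteine biosynthesis', 'amino acid biosynthesis')]
```

The process hierarchy link is never asserted by hand; it follows from the chemical hierarchy, which is the sense in which the underlying representation stays normalized.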

Page 42

Example of is_a-overloading: OBO Cell Ontology (current)

[Figure: CL]

Page 43

• Try not to assert too many is_a parents

[Figure: CL, marked with an X]

Page 44

• Reuse existing ontologies
• Non-is_a relation

[Figure: CL and GO, marked with an X and a “?”, with a “Has function” label]

Page 45

How CL can use other OBO ontologies

• GO Cellular component
  – Mononuclear phagocyte
  – B cell (expresses immunoglobulin complex)
• GO Biological process
  – Photosynthetic cell
• PATO Qualities
  – Spiny neuron
• CHEBI Chemical entities
  – X secreting cell
• Anatomy Ontologies
  – CNS neuron
• Molecular function, PRO
  – CD4 positive cell

Page 46

How CL is used by other ontologies

Ontology, Example, Genus, Differentia:

• GO-BP: T cell differentiation
  – Genus: Cell differentiation
  – Differentia: Results_in_acquisition_of_features_of T cell
• GO-CC: Germ cell nucleus
  – Genus: Nucleus
  – Differentia: Part_of germ cell
• MP: Abnormal macrophage morphology
  – Genus: Abnormal morphology
  – Differentia: Inheres_in macrophage
• ZFA (zebrafish): erythrocyte
  – Genus: erythrocyte
  – Differentia: In_organism Danio; Has_part nucleus
• OBI
• DO (disease)

Ontology, Example, Relationship:

• Fly anatomy: R8 photoreceptor cell
  – Relationship: Part_of ommatidium

Page 47

Results

• Biological process x CL
  – http://wiki.geneontology.org/index.php?XP:biological_process_xp_cell
  – Uncovered inconsistencies between GO and CL
  – Oenocyte differentiation is_a columnar/cuboidal epithelial cell differentiation
• MP x CL
  – http://wiki.geneontology.org/index.php/XP:mammalian_phenotype_xp
  – Resulted in various fixes to MP

Page 48

OBD: Ontology Annotation Database

Page 49

Summary

• The cell ontology is a representation of the types of cell that exist
• The OBO Foundry provides
  – Principles
  – A framework for connecting ontologies
• There are many points of coordination between CL and other OBO ontologies
• CL could benefit from the gradual introduction of a modular approach

Page 50
Page 51

The Gene Ontology; and beyond

• Curation of genes and gene products
  – Molecular function
  – Biological process
  – Cellular component

[Figure: GO. Multiple databases using the same ontology]

Page 52

The Gene Ontology; and beyond

• Curation of genes and gene products
  – Molecular function
  – Biological process
  – Cellular component
• What about curation of other data types?
  – Expression, transcriptomics
  – Genetics, phenotypes and disease
  – Many others..
• OBO
  – Open Bio-Ontologies
  – Arose partly in response to requirements outside scope of GO

[Figure: GO]

Page 53

Islands of biological data

[Figure: three islands labelled GO, Anatomy ontologies, Phenotype ontologies]

Page 54

Connecting the islands

Page 55

Connecting the islands

Page 56

Bada et al : GO to ChEBI

http://www.berkeleybop.org/obol

Amino acid cross-products in GO:

Page 57

http://www.berkeleybop.org/obol

Page 58

• GO approach is retrospective
  – Text based approaches to ‘decompose’ terms
    • Obol
    • Bada/Hunter
  – Born of necessity
    • OBO did not exist when GO started
  – Hard work
• New ontologies should take the prospective approach
  – Separate out aspects from the outset
  – No heuristic parsing necessary

Page 59

Prospective approach: Sequence Ontology

Separate hierarchies created from the outset - cross-products made from the beginning

Page 60

Page 61

OBI: Ontology for Biomedical Investigations

• Successor to MGED/FuGO
• Represents the realm of investigations
  – Biomaterials
  – Equipment
  – Protocols
  – Data transformations
• Makes maximal use of OBO
  – PATO:
  – ChEBI:
• Primary representation language is OWL
  – Uses OWL translations at http://purl.org/obo/

Page 62

Social Insect Behavior Ontology

• 4 distinct hierarchies
  – Anatomical entity
  – Behavior
  – Chemical entity
  – Species
• Links
  – derives_from, between chemical and anatomical entity
• Future plans
  – Submit chemical terms to ChEBI
  – Upper level behavior ontology?

Page 63

Anatomy

• GO is relevant for all kingdoms of life
• Development of anatomical ontologies has been less coordinated
  – Cell & subcellular: one ontology applicable to all
  – Gross Anatomy: multiple ontologies
• Vertebrate:
  – MA + EMAP: Mouse
  – FMA: Human (adult)
  – EHDA: Human
  – ZFA: Zebrafish
  – TAO: teleost anatomy
  – XAO: Xenopus
• Invertebrate:
  – FBbt: Drosophila anatomy
  – Tick anatomy
  – Mosquito anatomy

Page 64

Anatomy: Ongoing work

• CARO
  – Upper level shared anatomical ontology
  – Very general terms
• Teleost anatomy ontology
  – Broader than zebrafish anatomy ontology
  – Will include homology links
• Linking cells to gross anatomical entity
  – Purkinje cell part_of cerebellum
  – Spans ontologies (CL + ssAO)
• BIRNLex
• Stages and development

[Slide callouts: poster, poster, poster, talk]

Page 65

Using multiple ontologies: Pre vs post composition

• Complex descriptions (aka cross-products) can be composed from 2 or more terms
  – By ontology editors (pre)
  – By curators (post)
• Example:
  – Liver hyperplasia
• Precomposed phenotype ontology
  – MP:0005141 “liver hyperplasia” increased size of liver due to increased hepatocyte cell number
• Post-composition at time of genotype curation
  – PATO:0000644 “hyperplastic”
  – MA:0000358 “liver”
• Which strategy to choose?
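The two strategies can be reconciled when the pre-composed term carries a cross-product definition. The sketch below treats a definition as a simple (quality, entity) pair, which is a simplification, and uses the three identifiers from the slide; the table name `XP_DEFS` and the function name are assumptions.

```python
# Cross-product definition for the pre-composed term, IDs from the slide.
XP_DEFS = {
    "MP:0005141": ("PATO:0000644", "MA:0000358"),  # liver hyperplasia
}

def matching_precomposed(quality, entity, xp_defs=XP_DEFS):
    """Return pre-composed term IDs whose definition equals the post-composed pair."""
    return [term for term, pair in xp_defs.items() if pair == (quality, entity)]

# A curator post-composes "hyperplastic" + "liver" at annotation time.
print(matching_precomposed("PATO:0000644", "MA:0000358"))  # ['MP:0005141']
```

With such a lookup, annotations made with the pre-composed MP term and annotations made with the PATO and MA pair can be retrieved by the same query.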

Page 66

• Either strategy can be used
• Or mixed and matched
  – Caveat:
    • Pre-composed terms must have computable definitions (cross-products)
    • Currently created retrospectively
• Current progress :
  – MP (Mammalian Phenotype):
    • 4136/5760 xp defs, partially vetted
    • Caveat: species-specificity
  – WormPhenotype:
    • 350/1569 xp defs
  – PlantTrait:
    • 340/765 xp defs, partially vetted

Page 67

Other ontologies

• Envo + GAZ
  – Environmental ontology and gazetteer
  – Habitats:
    • Host (anatomy)
    • Geographical features (eg hydrothermal vents)
  – Qualities, chemical entities
• BIRNLex
• Protein Ontology
  – Links to/from GO
    • Complexes
    • Functions of ancestral proteins

Page 68

Envo-based annotation in Phenote

Page 69

Technical consequences of modular approach

• Dependencies
  – Technical issues
    • Dependence on network?
    • Formats - converters
  – Social & management issues
  – Change and versioning
    • http://www.bioontologies.org/
• Managing dependencies
  – http://obofoundry.org/wiki/index.php/Mappings
  – Stable URLs for downloading ontologies in obo or owl: http://purl.org/obo/
  – OBO Identifier policy
    • http://obofoundry.org/wiki/index.php/Identifiers

Page 70

Conclusions

• Be modular
  – Distinct hierarchies
  – Avoid is_a overloading
  – Link to existing ontologies
• Rewards
  – Standards
  – Increases value of curated data
  – Reduces duplication of effort and maximises curation effort
  – Ontologies are long term infrastructure
    • It’s worth getting them right

Page 71

Learning more

• http://www.bioontology.org
  – National Center for Biomedical Ontology
  – Browse and search OBO
  – Coming soon: inter-ontology links
• http://obofoundry.org
  – Principles and recommendations
  – Participation
    • Mailing lists
    • Trackers

Page 72

Restructuring Cell.obo

Page 73

OBO Cell Ontology

• Current version
  – Overloading of is_a hierarchy
  – Difficult to maintain
  – Leads to “true path” violations
• Refactoring
  – Replace is_a links with has_function
  – Keep main axis structure-based (but not religiously so)

Page 74

• For every term immediately under cell-by-function, we made a new function term
  – propagation of genome
  – to circulate
  – to secrete
  – to metabolise
  – to contract
  – Electrical absorption
  – Barrier
  – Motility
  – Structural
  – to accumulate stuff
  – signaling (mitogenic)
  – to die
  – Defense
  – Transport
  – to photosynthesize
  – to support
  – Valve
  – to fix nitrogen
• Also create grouping terms

Page 75
Page 76
Page 77

• Replaced is_a links to cell-by-function terms with has_function links to corresponding function terms

Page 78
Page 79

• What do we do about the old cell-by-function terms?
• We can eliminate them..
• OR we can support them, but infer the ‘tangled DAG’
• Requires xp defs:
  – Nitrogen fixing cell = cell THAT has_function nitrogen-fixing
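Inferring the ‘tangled DAG’ from has_function links could look like the sketch below. The cell names are placeholders, the two function terms are taken from the list of new function terms in this talk, and the layout of `HAS_FUNCTION` and `XP_DEFS` is an assumption made for illustration.

```python
# Asserted has_function links; the cell names are invented placeholders.
HAS_FUNCTION = {
    "cell A": {"to fix nitrogen"},
    "cell B": {"to photosynthesize"},
    "cell C": {"to fix nitrogen", "to photosynthesize"},
}

# xp defs: grouping class = cell THAT has_function <function term>.
XP_DEFS = {
    "nitrogen fixing cell": "to fix nitrogen",
    "photosynthetic cell": "to photosynthesize",
}

def inferred_parents(cell):
    """Return the cell-by-function classes a cell falls under, by definition."""
    functions = HAS_FUNCTION.get(cell, set())
    return sorted(cls for cls, fn in XP_DEFS.items() if fn in functions)

print(inferred_parents("cell C"))  # ['nitrogen fixing cell', 'photosynthetic cell']
```

"cell C" ends up with two parents without either is_a link being asserted by an editor, so the polyhierarchy is computed rather than maintained by hand.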

Page 80
Page 81

• Future work / ongoing issues:
• Redundancy between cell functions & GO biological process?
• Cell-by-lineage

Page 82

Synchronizing ssAOs and CL

• Fly_anat, zfa, plant_anat all represent cell types
  – Part_of links from cells to gross anatomy
    • E.g. purkinje_cell part_of cerebellum
• Methodology
  – Xrefs from ssAOs to CL IDs
  – Treat as ss subtypes
  – Use reasoner to stay in sync
  – http://www.bioontology.org/wiki/index.php/CL:Aligning_species-specific_anatomy_ontologies_with_CL
  – Examples:
    • http://www.berkeleybop.org/obol/#fly_anatomy_xp_cell-obol

Page 83

Transformation_of

• Class-level relation between continuant types
• Transitive
• Relation between two classes, in which instances retain their identity yet change their classification by virtue of some kind of transformation. Formally: C transformation_of C' if and only if given any c and any t, if c instantiates C at time t, then for some t', c instantiates C' at t' and t' is earlier than t, and there is no t2 such that c instantiates C at t2 and c instantiates C' at t2
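The formal definition above can be written compactly in first-order notation. This is one reading of the sentence, with the "no t2" clause taken as part of the consequent; the predicate name inst(c, C, t), standing for "c instantiates C at time t", and the use of < for "earlier than" are notational assumptions.

```latex
C \ \mathrm{transformation\_of}\ C' \iff
\forall c\, \forall t\, \Big[ \mathrm{inst}(c, C, t) \rightarrow
\Big( \exists t'\, \big( \mathrm{inst}(c, C', t') \wedge t' < t \big)
\ \wedge\ \neg \exists t_2\, \big( \mathrm{inst}(c, C, t_2) \wedge \mathrm{inst}(c, C', t_2) \big) \Big) \Big]
```

The second conjunct says that no instance is ever a C and a C' at the same time, which is what separates transformation from plain subclassing.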

Page 84
Page 85

Derives_from

• Holds between continuants
• Transitive
• Derivation on the instance level (*derives_from*) holds between distinct material continuants when one succeeds the other across a temporal divide in such a way that at least a biologically significant portion of the matter of the earlier continuant is inherited by the later
• We say that one class C derives_from class C' if instances of C are connected to instances of C' via some chain of instance-level derivation relations.
• Examples:
  – osteocyte derives_from osteoblast

Page 86
Page 87

[Table repeated from Page 29]