Ontology (Science)

Barry Smith

University at Buffalo

http://ontology.buffalo.edu/smith

Buffalo, NY

Tutorials and Classes: July 20-23, 2009

Conference: July 24-26, 2009

http://icbo.buffalo.edu

International Conference on Biomedical Ontology

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV

How to do biology across the genome?

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE

To successfully navigate through such data, biomedicine needs help from ontologies

See Smith et al., “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration”, Nature Biotechnology, 25 (11), November 2007. http://www.nature.com/nbt/journal/v25/n11/pdf/nbt1346.pdf

Uses of ‘ontology’ in PubMed abstracts

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE

what cellular component?

what molecular function?

what biological process?

Gene Ontology

what cellular component?

what molecular function?

what biological process?

GO aids information retrieval via curation of data and literature

GO as Common Controlled Vocabulary

MouseEcotope, GlyProt, DiabetInGene, GluChem

Holliday junction helicase complex

GO promotes integration of data

MouseEcotope, GlyProt, DiabetInGene, GluChem

sphingolipid transporter activity
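The integration idea on this slide can be sketched in a few lines, assuming each database annotates its records with labels drawn from one shared vocabulary. Only the database names and the two GO term labels come from the slides; the record identifiers and the assignment of terms to records are hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, record id, GO term label).
# Database names and term labels appear on the slides; the record ids and
# the pairing of records with terms are invented for this sketch.
ANNOTATIONS = [
    ("MouseEcotope", "ME:0001", "sphingolipid transporter activity"),
    ("GlyProt", "GP:0042", "sphingolipid transporter activity"),
    ("DiabetInGene", "DG:0007", "sphingolipid transporter activity"),
    ("GluChem", "GC:0310", "Holliday junction helicase complex"),
]


def index_by_term(annotations):
    """Group records from all databases under the shared term they carry."""
    index = defaultdict(list)
    for database, record_id, term in annotations:
        index[term].append((database, record_id))
    return dict(index)


def records_for(term, annotations):
    """Return every record, across databases, annotated with the given term."""
    return index_by_term(annotations).get(term, [])


if __name__ == "__main__":
    for database, record_id in records_for(
        "sphingolipid transporter activity", ANNOTATIONS
    ):
        print(database, record_id)
```

Because all four sources use the same term label, a single lookup spans them without any per-database mapping step. If each source used its own private vocabulary, the lookup would need one translation table per pair of sources.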

Ontology engineers:

“LET’S GENERALIZE THESE BENEFITS BY BUILDING MANY MANY TINY ONTOLOGIES IN OTHER AREAS”

The standard engineering methodology

• Pragmatics (‘usefulness’) is everything

• Usefulness = we get to write software which runs on our machines

• It’s easier to write useful software if we work with a simplified model

• (“…we can’t know what reality is like in any case; we only have our ‘concepts’…”)

• Engineer A: This looks like a useful model to me

• (One week later:) Engineer B: This other thing looks like a useful model to me

The standard engineering methodology

Result:

Data in Pittsburgh does not interoperate with data in Vancouver

Science is siloed

Scientific theories must be common resources

1. they cannot be bought or sold

2. they must use open publishing venues

3. they must constantly evolve to reflect results of scientific experiments (“evidence-based”)

4. they must be synchronized

– use a common system of units

– use common terminologies

Why build scientific ontologies

Multiple ontologies only make our data silo problems worse

Just as bad scientific theories must die, so also bad ontologies must die

Ontologies should be relatively independent of tools, implementations and applications*

*Need to clearly separate the Science Domain Knowledge from the Software Programming Knowledge

Scientific ontologies must be constrained so that they converge

Q: What is to serve as constraint in order to avoid silo creation?

A: Reality, as revealed, incrementally, by experimentally-based science

Ontological realism

• Find out what the world is like (= by doing science)

• Build representations adequate to this world, not to some simplified model in your laptop

• … this strategy is being realized by the Gene Ontology and an expanding community of biomedical scientists

The Open Biomedical Ontologies (OBO) Foundry

• Goal: to provide a suite of controlled structured vocabularies for the calibrated annotation of data to support integration and reasoning across the entire domain of biomedicine

• as biomedical science advances, these ontologies must be evolved in tandem

| Ontology | Scope | URL | Custodians |
|---|---|---|---|
| Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman |
| Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara |
| Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland |
| Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse |
| Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group |
| Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium |
| Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos |
| Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium |
| Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall |
| RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium |
| Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck |

Orthogonality

• one ontology for each domain

• no need for ‘mappings’ (too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change)

http://obofoundry.org

Orthogonality

• is our best (perhaps our only) hope of solving the data silo problem

• ontologists need to be trained to seek orthogonality

• to seek reuse with a vengeance

Ontologies like the GO are part of science

True, they must be associated with computer implementations (with engineering artifacts)

But the ontologies are not themselves engineering artifacts

The same ontology can be associated with multiple engineering artifacts

Benefits of orthogonality

• ensures that those new to ontology can find the common, tested resources they need

• and can find exemplars of good practice

• ensures mutual consistency of ontologies (trivially)

• thereby ensures additivity of annotations
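One way to picture "additivity of annotations" is the sketch below. It assumes orthogonality can be approximated as "no term is claimed by more than one ontology", which is a simplification of the slides' "one ontology for each domain". The ontology contents, the gene names, and the `LocalCellList` silo are all hypothetical.

```python
# Illustrative term sets only; real ontologies are far larger, and these
# particular entries are assumptions made for the sketch.
ONTOLOGIES = {
    "GO": {"sphingolipid transporter activity"},
    "CL": {"neuron"},
}


def overlapping_terms(ontologies):
    """Return {term: set of ontologies claiming it}; empty when orthogonal."""
    owners = {}
    clashes = {}
    for name, terms in ontologies.items():
        for term in terms:
            if term in owners:
                clashes.setdefault(term, {owners[term]}).add(name)
            else:
                owners[term] = name
    return clashes


def merge_annotations(*annotation_sets):
    """Combine annotation sets from several curators by plain set union."""
    merged = {}
    for annotations in annotation_sets:
        for entity, terms in annotations.items():
            merged.setdefault(entity, set()).update(terms)
    return merged


# Two hypothetical curators annotate the same (invented) gene independently.
CURATOR_A = {"gene_x": {"sphingolipid transporter activity"}}
CURATOR_B = {"gene_x": {"neuron"}, "gene_y": {"neuron"}}
```

When `overlapping_terms` reports nothing, each term has exactly one owner, so the union in `merge_annotations` cannot introduce two competing versions of the same term. Adding a second ontology that also claims "neuron" makes the check report a clash.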

More benefits of orthogonality

• it rules out simplification and partiality

• brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness

More benefits of orthogonality

• helps to eliminate redundancy

• serves the division of ontological labor: allows experts to focus on their own domains of expertise

• makes possible the establishment of clear lines of authority

The goal of orthogonality is a basic goal of science

It is a pillar of the scientific method that scientists should always strive to resolve conflicts between competing theories

Is there a problem with orthogonality?

• what if I need my own ontology of cellular membranes to meet my own special purposes?

• strategy: application ontologies should be developed from the start using terms whose definitions employ the resources of orthogonal ontologies like those within the Foundry

• any other approach creates silos
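The application-ontology strategy on this slide might look like the following sketch, in which a special-purpose term is defined only by pointing at terms from reference ontologies. The classes `Ref` and `AppTerm`, the genus/relation/filler layout, and the tiny term lists are assumptions made for illustration, not the Foundry's actual data model.

```python
from dataclasses import dataclass

# Illustrative reference vocabularies; the entries are assumed for this sketch.
REFERENCE_TERMS = {
    "GO": {"membrane"},
    "CL": {"neuron"},
    "RO": {"part_of"},
}


@dataclass(frozen=True)
class Ref:
    """A pointer to a term in a named reference ontology."""
    ontology: str
    term: str


@dataclass(frozen=True)
class AppTerm:
    """An application term defined as: a <genus> that <relation> a <filler>."""
    label: str
    genus: Ref
    relation: Ref
    filler: Ref


def unresolved_refs(term, reference_terms=REFERENCE_TERMS):
    """List the references in a definition that no reference ontology supplies."""
    return [
        ref
        for ref in (term.genus, term.relation, term.filler)
        if ref.term not in reference_terms.get(ref.ontology, set())
    ]


# A hypothetical special-purpose term built entirely from shared resources.
NEURON_MEMBRANE = AppTerm(
    label="neuron membrane",
    genus=Ref("GO", "membrane"),
    relation=Ref("RO", "part_of"),
    filler=Ref("CL", "neuron"),
)
```

A term whose definition resolves completely, as `NEURON_MEMBRANE` does here, stays connected to data annotated with the reference ontologies. A term that points at a private vocabulary shows up in `unresolved_refs`, which is the silo case the slide warns against.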

Better to have one consensus ontology serving multiple purposes imperfectly

because multiple ontologies addressing the same domain, whether they are good ones or bad ones, create silos

For engineers, ontologies

1. can be bought and sold

2. need have no well-demarcated scientific domains

3. need not be subject to further maintenance

4. can be stand-alone products

5. are typically tied to one specific implementation

Ontology (engineering) thereby makes the silo problem worse

Ontologies created to serve scientific purposes

1. are developed to be common resources (thus they cannot be bought or sold)

2. represent well-demarcated scientific domains

3. are subject to constant maintenance by domain experts

4. are designed to be used in tandem with other, complementary ontologies

5. are maximally independent of format and implementation

Some obvious truths

• Scientific hypotheses should be formulated by scientists

• Scientific experiments should be carried out by scientists

• Scientific databases should be developed and maintained by scientists

• Scientific textbooks and journal articles should be written by scientists

An obvious conclusion:

• Scientific ontologies should be built by scientists

Problems to be addressed

• How should ontologist-scientists be trained?

• How do we create a career path for scientific ontologists?

• How do we assign credit to those who contribute to ontology creation and maintenance?

Ontologies like the GO are comparable to

– scientific theories

– scientific databases

– scientific journal publications

Ontologies like the GO are being used experimentally by scientific journal publishers

– to provide more useful access to data and other sorts of content via controlled structured keyword lists

– to provide a basis for creating formally structured versions of journal articles

The OBO Foundry is working with journal publishers to create a methodology for expert peer review of ontologies

as articles are peer reviewed

so keyword lists are peer reviewed

so an author’s use of keyword lists is peer reviewed

Benefits of peer review

1. provides a gigantic impetus to the improvement of scientific knowledge over time

2. brings benefits to readers, since they need only absorb and collate vetted results

(contrast what happens where vetting is not allowed e.g. on the Semantic Web)

Scientific ontology analogous to open source software

S. Weber, The Success of Open Source, Cambridge, MA: Harvard University Press, 2004.

Ontologies should be more like Linux and less like the Semantic Web

Weber’s six criteria for success

1. Disaggregated contributions can be derived from knowledge that is not proprietary.

2. The product is perceived as valuable to a critical mass of users.

3. The product benefits from widespread peer attention and review, and can improve through error correction.

4. There are strong positive network effects.

5. An individual or a small group can take the lead and generate a substantive core that promises to evolve into something truly useful.

6. A voluntary community of iterated interaction can develop around the process of building the product.

OBO Foundry peer review creates incentives for investment of effort in ontology work

• It gives career-related credit to both authors and reviewers (university promotions and funding are based on peer review credit)

• Supports creation of a professional career path for ontologists

• It gives credit to scientific experts for investment of scientific expertise in ontology development

• It allows measurement of citations of ontologies

• It magnifies the motivating potential of the factor of influence: scientists help to determine what ontology resources exist in their discipline

THE END
