Collaborative ontology development by scientists
Melissa Haendel
Slide 2
Setting the stage
1. Who we are and what we need
2. What are our bottlenecks:
   Getting info from the domain experts
   Ontology tools
   Synchronizing ontologies
3. Modularizing anatomy ontologies
4. Ideas for collaborative ontology editing
Slide 3
Who are we?
Domain Experts: Anatomists, comparative morphologists, developmental biologists, immunologists, neuroscientists, etc.
Ontologists: Biologists-gone-informatics, computer scientists and logicians
Engineers: Our tool builders
What do we want? Ontologies and tools to develop them
Domain experts: want to query for gene expression and phenotypes across species
Ontologists: have to be able to interpret and represent domain knowledge computationally
Engineers: have to build tools that can consume ontologies and give the Domain Experts the right results
Slide 4
Anatomy and phenotype ontologies have to work hard for us
Ontologies must be intelligible to: humans and machines
Enable comparison of structures across different organisms
Standardization of vocabulary among communities
Integration across databases
Query across large amounts of data
Automatic reasoning to infer related classes
Error checking
Annotation consistency
Slide 5
Ontology development workflow and bottlenecks
Term needed for annotation
reconcile
Slide 6
Ontology development workflow and bottlenecks
Term requested
reconcile
Slide 7
Ontology development workflow and bottlenecks
Term discussed by community
reconcile
Slide 8
Ontology development workflow and bottlenecks reconcile
Slide 9
Ontology development workflow and bottlenecks
GO, CL, CARO, TAO, AAO, XAO, ZFA, MA, MP, UBERON
reconcile
Synchronize?
Slide 10
Three bottlenecks
1) Extracting domain knowledge into an ontology efficiently
2) Multiple ontology editing tools, each with pros and cons, none easily used by domain experts
3) Synchronization across interoperable ontologies
Slide 11
How can we increase the efficiency of extracting knowledge from domain experts?
An example of what has worked well so far: 1862 Christian Schussele
Familiar tooling: Google Docs, Phenote, Excel
Visualization: Cmap, VUE, GraphViz
Need to merge different sources of information
Need a way to get this information into a computable form
Slide 12
Two ontology editors (and viewers) commonly used by the biomedical community
OBO-Edit - OBO ontology editor and viewer http://oboedit.org/
Protégé - OWL ontology editor and viewer http://protege.stanford.edu/
Both tools are non-trivial to learn to use
Neither has many bulk operations, imports/exports different formats easily, or deals with synchronization readily
There is a barrier for domain experts to contribute knowledge, and a bottleneck for editors to get this knowledge into ontologies efficiently
More biologist-friendly (thank you John!)
Tool used by broader community
Slide 13
How to synchronize ontologies
Three approaches:
Mapping (BioPortal set,..)
Direct reconciliation (TAO and ZFA)
Synchronization using imports
Slide 14
Ontology mappings are often not useful
FMA (human) tibia <-> FBbt (fruitfly) tibia
FMA extensor retinaculum of wrist <-> MA retina
GAZ (geography) Colon <-> FMA (human) Colon
ZFA (zebrafish) aortic arch <-> MA (mouse) arch of aorta
GAZ (geography) Serpentine <-> CHEBI (chemistry) serpentine
Dictyostelium giant cell <-> FMA giant cell
ZFA (zebrafish) blastoderm <-> FBbt blastoderm stage
PATO (quality) male <-> CHEBI (chemical) maleate(2-)
(For anatomy, you may want to remove the mappings that NCBO BioPortal creates for your ontology and/or ask not to allow mapping)
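A minimal sketch of why purely lexical mapping produces pairs like the ones listed on this slide. The matcher below is a hypothetical stand-in (it is not how BioPortal computes its mappings); it simply compares normalized labels and knows nothing about species or domain.

```python
import re


def normalize(label: str) -> str:
    """Lowercase the label and drop everything except letters, digits and spaces."""
    return re.sub(r"[^a-z0-9 ]", "", label.lower()).strip()


def naive_lexical_match(label_a: str, label_b: str) -> bool:
    """Hypothetical matcher: two terms 'map' if their normalized labels are identical."""
    return normalize(label_a) == normalize(label_b)


# (ontology, label) pairs taken from the slide.
pairs = [
    (("FMA", "tibia"), ("FBbt", "tibia")),               # human bone vs. fly leg segment
    (("GAZ", "Colon"), ("FMA", "Colon")),                # place name vs. organ
    (("GAZ", "Serpentine"), ("CHEBI", "serpentine")),    # place name vs. chemical
    (("ZFA", "aortic arch"), ("MA", "arch of aorta")),   # plausible match, different wording
]

matches = [(a, b) for a, b in pairs if naive_lexical_match(a[1], b[1])]

for (onto_a, label_a), (onto_b, label_b) in matches:
    print(f"{onto_a}:{label_a} <-> {onto_b}:{label_b}")
```

The three unrelated pairs are reported as matches, while the one pair that may genuinely correspond is missed because its wording differs.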
Slide 15
Reconciliation and linking between TAO and ZFA
Zebrafish terms are is_a subtypes of teleost terms
Zebrafish Anatomy is_a Teleost Anatomy Ontology
Logic implemented via xrefs - difficult to keep synchronized
Xref logic can be less clear and more difficult to use
Slide 16
Synchronization by import across ontologies
One can import a whole ontology or just portions of another ontology
MIREOT: Minimum information to reference an external ontology term
This strategy requires better facilities while editing
CARO VAO Present TAO Modularized ontology
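A rough sketch of what a MIREOT-style reference carries: the source ontology, the source term, and the superclass under which the term is placed in the importing ontology. The class name, the example.org URIs, and the use of rdfs:isDefinedBy to record the source are illustrative assumptions, not a prescribed serialization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MireotReference:
    """Minimum information kept for one externally referenced term (hypothetical type)."""
    source_ontology: str    # URI of the ontology the term comes from
    source_term: str        # URI of the term itself
    target_superclass: str  # URI of the parent class in the importing ontology
    label: str = ""         # optional human-readable label


def to_turtle(ref: MireotReference) -> str:
    """Render the reference as a Turtle fragment (owl:/rdfs: prefixes assumed declared)."""
    lines = [
        f"<{ref.source_term}> a owl:Class ;",
        f"    rdfs:subClassOf <{ref.target_superclass}> ;",
    ]
    if ref.label:
        lines.append(f'    rdfs:label "{ref.label}" ;')
    # Recording provenance with rdfs:isDefinedBy is an assumption made for this sketch.
    lines.append(f"    rdfs:isDefinedBy <{ref.source_ontology}> .")
    return "\n".join(lines)


# Placeholder URIs: none of these identify real ontology terms.
ref = MireotReference(
    source_ontology="http://example.org/source-anatomy.owl",
    source_term="http://example.org/source-anatomy#bone",
    target_superclass="http://example.org/my-ontology#skeletal_element",
    label="bone",
)
print(to_turtle(ref))
```

Only the referenced term is brought in, not the whole source ontology, which is the difference between this strategy and a full import.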
Slide 17
OntoFox: a Web Server for MIREOTing
Good things:
Based on MIREOT principle
Web-based data input and output
Output OWL file can be directly imported in your ontology
No programming needed
Programmatically accessible
Improvements:
Integration into ontology editing tools
More customizable
http://ontofox.hegroup.org
Slide 18
We need synchronization solutions that are integrated within
ontology editing tools
Slide 19
What IS the anatomy ontology landscape?
How can we efficiently build our anatomy ontologies to be most interoperable?
We could have built:
A single ontology for ontology editors and consumers
Different editors have editing rights to different ontology partitions
- by taxon
- by domain (e.g. neuroscience, skeletal anatomy)
No taxon-specific subtypes - use structure, function etc. as differentia
Dynamic views according to user needs
Slide 20
Ontology landscape model view
Terms shown: cell, tissue, muscle tissue, mesonephros, limb, antenna, weberian ossicle, mammary gland, nervous system, mollusc foot, tentacle, mantle, pupal DN3 period neuron, mushroom body, brachial lobe, pons, vertebra, vertebral column, circulatory system, appendage, mesoderm, gut, tibia, gland, bone, skeletal tissue, parietal bone, fin, gonad, trachea, respiratory airway, tibiafibula, larva, metencephalon, ventral nerve cord
Views shown: user/editor view, neuro view, skeletal view, mammalian view, mollusc view
link (small sample)
Slide 21
Proposed model moving forward
Maintain series of ontologies at different taxonomic levels
- euk, plant, metazoan, vertebrate, mollusc, arthropod, insect, mammal, human, drosophila
Each ontology imports/MIREOTs relevant subset of ontology above it
- this is recursive
Subtypes are only introduced as needed
Work together on commonalities at appropriate level above your ontology
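The recursive import idea can be sketched as follows. The taxonomic chain, the term sets, and the function name are made-up sample data for illustration; a real setup would work on OWL imports rather than Python sets.

```python
# Hypothetical chain of ontologies, each pointing at the ontology directly above it.
PARENT = {
    "zebrafish": "teleost",
    "teleost": "vertebrate",
    "vertebrate": "metazoan",
    "metazoan": None,
}

# Terms each ontology defines itself (illustrative sample only).
OWN_TERMS = {
    "metazoan": {"cell", "tissue", "muscle tissue"},
    "vertebrate": {"vertebra", "bone"},
    "teleost": {"weberian ossicle", "fin"},
    "zebrafish": set(),
}

# The subset each ontology asks to import from the ontology directly above it.
IMPORTS = {
    "vertebrate": {"cell", "tissue"},
    "teleost": {"cell", "bone", "vertebra"},
    "zebrafish": {"cell", "bone", "weberian ossicle"},
}


def visible_terms(ontology: str) -> set:
    """Return the terms usable in `ontology`: its own terms plus its imported subset.

    The imported subset is checked against what the parent exposes, which in turn
    depends on the parent's own imports, so the check recurses up the chain.
    """
    own = set(OWN_TERMS.get(ontology, set()))
    parent = PARENT.get(ontology)
    if parent is None:
        return own
    available = visible_terms(parent)
    wanted = IMPORTS.get(ontology, set())
    missing = wanted - available
    if missing:
        # The term has to be requested from (or imported by) the ontology above first.
        raise ValueError(f"{ontology} cannot import {sorted(missing)} from {parent}")
    return own | wanted


print(sorted(visible_terms("zebrafish")))
```

In this sketch "cell" reaches the zebrafish level only because every ontology in between imports it; a term that an intermediate ontology does not carry has to be requested there first, which matches the "work together at the appropriate level above your ontology" point.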
Slide 22
Model view
zebrafish caro / uberon/all cell tissue metazoa muscle tissue
vertebrata mesonephros limb arthropoda antenna teleost weberian
ossicle mammalia mammary gland nervous system mollusca foot
cephalopod tentacle mantle drosophila neuron types XYZ mushroom
body brachial lobe NO pons vertebra vertebral column circulatory
system appendage mesoderm gut tibia gland bone skeletal tissue
parietal bone fin gonad trachea respiratory airway cross-ontology
link (sample) amphibia tibiafibula larva shell cuticle skeleton
import mouse human
Slide 23
Idealized protocol for new AOs
1. Collect draft list of terms
2. Subdivide roughly into applicability at taxonomic levels
3. Request new terms from existing AOs above you
4. Is a new mid-level AO required? - yes: collaborate and create, go to 1.
5. Import pre-reasoned subset from next AO above
6. Build your ontology
(David will take it from here in his talk later today)
Slide 24
Modularizing ontologies - positive reinforcement
Identify key points of integration between ontologies
Modularize based on domain or taxon
Import and reuse rather than cross-referencing or aligning
Let the reasoner help do the work
Work together to distribute work
Slide 25
Modularizing ontologies
We need:
To get the imports working well
To have distributed social responsibility assigned
Design patterns to ensure we are all doing the same thing
To check for consistency and errors across multiple ontologies using reasoners to get correct results for all users
- These ontologies are supposed to be orthogonal but aren't always
Visualization tools that can aid non-ontology experts in identifying errors across multiple ontologies
Slide 26
Returning to the bottlenecks in our process: looking for solutions
Need easy-to-use tools for information capture
Ideally based on existing familiar tools
Auto-populated from/to ontologies
Social management - who is responsible for what
Need better import/export functionality:
- into/out of ontology editors from simple collection tools
- from a myriad of ontology sources
Need better interoperability between editors/formats
Need enhanced bulk operations
Need to know specific requirements for building tools and user feedback
Need money and opportunities to interact (like this one!)
Slide 27
Existing tools for collaborative ontology editing don't quite get us there
Google Refine has nice features for manipulating data, including RDF exports, but isn't collaborative
Mapping Master for Protégé enables generation of OWL from spreadsheets, but is not collaborative and requires ontology knowledge
Web Protégé isn't fully-fledged and is not useful for non-technical contribution
Slide 28
Ideas for collaborative ontology editing
Example: File extracted from ontology for this meeting:
Extracted from ontology with Perl script
Needs to be edited by domain experts, and then converted back into OWL
Needs to be merged with existing OWL file
There is a better way...
Slide 29
Ideas for using Google Docs
Enable creation of Google spreadsheets that curators and domain experts can edit with the following features:
Tell Google spreadsheet which columns are which from ontology input file: labels, parents, URIs, xref, class, etc.
Live-updated with latest external ontology versions using SPARQL
Export OBO/RDF/OWL serialization
Enable search on external ontologies via autocomplete
Track changes
This will solve some of the sync problems because the queries are executed whenever the doc is open or updated
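As a rough illustration of the spreadsheet-to-ontology export idea, the sketch below turns rows with declared columns (id, label, parent, xref) into OBO-style term stanzas. The column names, the EX: identifiers, and the function name are assumptions made for the example; a real export would also need a header, namespaces, and validation.

```python
import csv
import io

# Hypothetical spreadsheet content exported as CSV; the IDs are placeholders.
SHEET = """id,label,parent,xref
EX:0000001,bone,,
EX:0000002,parietal bone,EX:0000001,EX2:0000123
"""


def rows_to_obo(csv_text: str) -> str:
    """Convert spreadsheet rows into OBO-style [Term] stanzas.

    Each row needs an id and a label; parent and xref are optional and are
    written as is_a and xref tags when present.
    """
    stanzas = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = ["[Term]", f"id: {row['id']}", f"name: {row['label']}"]
        if row["parent"]:
            lines.append(f"is_a: {row['parent']}")
        if row["xref"]:
            lines.append(f"xref: {row['xref']}")
        stanzas.append("\n".join(lines))
    # Stanzas are separated by a blank line.
    return "\n\n".join(stanzas) + "\n"


print(rows_to_obo(SHEET))
```

Telling the tool which column plays which role is the only ontology-specific step the domain expert would see; everything else stays an ordinary spreadsheet.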
Slide 30
Ideas for using Google Docs
Enable creation of Google Drawings that curators and domain experts can edit with the following features:
Import of external ontologies
Have relations and classes exported out from Google Drawing
Export OBO/RDF/OWL serialization
Linked to Google Spreadsheet
Track changes
Slide 31
Ontology editor dreams
A truly collaborative web-based editing platform (a la Web Protégé) compatible with OWL and OBO
Supporting:
Import and export of customizable spreadsheets from Google Docs
Creation of live templates (spreadsheet in sync with SPARQL endpoints)
MIREOT import
User roles and permissions
Web-based versioning