www.sofg.org
The SOFG Anatomy Entry List - SAEL
Helen Parkinson, EBI
On behalf of the Standards and Ontologies for Functional Genomics SAEL Working Group
Some History
• SOFG 1, Hinxton 2002: anatomists held a breakout group to discuss integration of anatomy ontologies.
• Outcome: website set up listing known anatomy resources and ‘view’, and ‘intent to integrate’ expressed
Gratuitous Advertising – SOFG2
SOFG/SAEL Workshop
• In April 2004 an international workshop was held in Edinburgh to consider issues raised by the SOFG discussion
• Participants representing users of Anatomy Ontologies
– ArrayExpress, RAD, GXD, EMAGE
• Participants representing Anatomy Ontologies:
– Foundational Model of Anatomy (FMA)
– GALEN, Mouse Adult Anatomy
– Mouse Developmental Anatomy (EMAP)
– Edinburgh Human Developmental Anatomy
– CBIL Controlled Vocabulary for Anatomy
• David Shotton – “what if the ontologies are not orthogonal?”
Integration issues
• Tissues
• Things made of the tissues
• Cells making up the tissues (scale)
• Correspondences, homologies/orthologies – mouse tail/C. elegans tail
• Stages
• Developmental derivation
• Relationship types (part-of, is-a, etc.) and how these are used differently
• Considered a specific use case:
– The use of anatomy ontologies in the functional genomics domain
Slide: Alan Rector, Jeremy Rogers
Functional Genomics Experiments
• Sample-based functional genomics experiments are usually limited to what is obtainable by conventional dissection
• High level terms are useful
• Detailed ontologies are also useful where experimenters need these, for example when laser capturing samples
• There are excellent resources already
• Truth: biologists are resistant to data sharing, annotation, standardisation, …
Standard terms are needed for querying
MGED Ontology
• Supports MIAME, provides terms for annotation of experiments (where they do not exist externally)
• Creates a framework to reference external ontologies – therefore we need external resources
• Requires that external terms be identifiable
• Is implemented in data capture applications
• An anatomy list for this domain needs to be simple and flexible
• Annotation needs are diverse
• Multiple resources can be confusing
A multiplicity of resources
• Many: resources, formats, philosophies, purposes, variable content
Ontology                             Accessibility                    Content
CBIL Controlled Vocab. for Anatomy   CBIL browser                     ~600
Adult Mouse Anatomy                  DAG-Edit/Jax Viewer              ~2400
EMAP – Mouse development             DAG-Edit, Anatomy Browser        ~8000
GALEN (Anatomy Only)                 Protégé, Demo-GCE or OpenKnoME   ~10,000
FMA                                  FM explorer                      70,000 concepts, >110,000 terms
• Compare the FMA vs. the adult mouse
Anatomy Terminologies and Ontologies
Slide: Cornelius Rosse
[Figure: is-a and part-of classification compared across NCI, MeSH, GALEN, SNOMED, Jax and FMA. Class labels recoverable from the diagram: Hollow viscus (body structure); Heart structure (body structure); Entire thoracic viscus (body structure); TubularCardiovascularComponent; IntrathoracicCardiovascularComponent; IntraMediastinalStructure; Thoracic cavity structure; Organ with cavitated organ parts; Body Organ; Cardiac Structure; Cardiopulmonary System; Cardiovascular System; CardiovascularSystem.]
Foundational Model of Anatomy (FMA)
• FMA uses a frame-based formalism
• Concerned with the representation of concepts and relationships in a form that is understandable to humans and machine readable
• Human/vertebrate
• Definition: structural attributes
• Content: organism to biological macromolecule
• Serves as a reference ontology
Slide: Cornelius Rosse
FM Explorer
Adult Mouse @ Jax Vocab browser
• Anatomical structures are organized spatially and functionally, using 'is a' and 'part of' relationships
• For TS28 (Theiler stage 28)
• Purpose: encoding and integration of mouse gene expression data
SAEL ..
• is a simple list of ~120 terms
• is for low-resolution descriptions of sample origin
• terms have ids: SAEL:1
• contains vertebrate terms at present
• is NOT a new anatomy ontology
• does NOT have defined relationship types
• terms do not have definitions
• is a first step to considering the relationships between the SAEL source ontologies
• is NOT intended to replace deeper integration efforts
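The properties listed for SAEL (a flat list, ids of the form SAEL:1, no relationship types, no definitions) could be modelled roughly as follows. This is a minimal sketch only: the class name `SaelEntry`, the helper `parse_sael_id`, the assumed id pattern and the example term names are illustrative and are not part of the SAEL release.

```python
import re
from dataclasses import dataclass

# Assumed id pattern: the slide only shows "SAEL:1" as an example.
_SAEL_ID = re.compile(r"^SAEL:(\d+)$")


def parse_sael_id(text: str) -> int:
    """Return the numeric part of an id such as 'SAEL:1', or raise ValueError."""
    match = _SAEL_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a SAEL id: {text!r}")
    return int(match.group(1))


@dataclass(frozen=True)
class SaelEntry:
    """One entry of the flat list: an id and a name, nothing else.

    There are deliberately no definition or relationship fields,
    since SAEL has neither.
    """
    sael_id: str
    name: str


# Hypothetical entries for illustration; real names come from the SAEL file.
entries = [
    SaelEntry("SAEL:1", "example term A"),
    SaelEntry("SAEL:2", "example term B"),
]
by_id = {entry.sael_id: entry for entry in entries}
```

Keeping the record this small reflects the stated intent: a lookup list for low-resolution sample annotation rather than a new ontology.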
The SAEL current version 1.0
• Download from www.sofg.org/sael/index.html
• In plain text/OBO format
• Will be maintained by MGED Ontology working group + Terry Hayamizu (Jax)
• Suggestions through MGED Ontology SourceForge tracker
• Report on the workshop is available
• Review publication from this workshop in CFG
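Since the list is distributed in OBO format, reading it could look something like the sketch below. The sample text is a hypothetical stand-in with placeholder term names, not the actual SAEL file, and `parse_obo_terms` is an illustrative helper that handles only simple `tag: value` lines inside `[Term]` stanzas.

```python
from typing import Dict, List, Optional

# Hypothetical sample in OBO style; the term names are placeholders,
# not taken from the real SAEL file.
SAMPLE_OBO = """\
format-version: 1.0

[Term]
id: SAEL:1
name: example term A

[Term]
id: SAEL:2
name: example term B
"""


def parse_obo_terms(text: str) -> List[Dict[str, str]]:
    """Collect the tag/value pairs of every [Term] stanza.

    Simplification: each tag is assumed to occur once per stanza,
    which fits a list that carries only ids and names.
    """
    terms: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None  # None while outside a [Term] stanza
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current[tag] = value
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
```

A full OBO reader would also need to handle comments, repeated tags and other stanza types; those are left out here on purpose.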
Testing the content
• SAEL maps to ~80% of current terms tested
– So far tested vs. current annotation in:
    • ArrayExpress/MIAMExpress – OrganismPart – 82 terms
    • HGMP – microarray, mouse and human only – 97 terms
    • SMD – free text microarray sample annotations – 22 terms
    • GXD – * 80% for blot and cDNA data only
    • RAD – microarray, uses CBIL
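A coverage figure of this kind can be computed along the lines sketched here. The slides do not say how the ~80% was counted, so this assumes coverage over distinct annotation terms; the function name `mapping_coverage` and all data are invented for illustration and are not the workshop's test sets.

```python
from typing import Dict, Iterable, Optional


def mapping_coverage(terms: Iterable[str], mapping: Dict[str, Optional[str]]) -> float:
    """Fraction of distinct annotation terms that map to some SAEL id."""
    distinct = set(terms)
    if not distinct:
        return 0.0
    # A term counts as mapped only if it has a non-null SAEL id.
    mapped = sum(1 for term in distinct if mapping.get(term) is not None)
    return mapped / len(distinct)


# Toy data, invented for illustration only.
toy_mapping = {"term a": "SAEL:1", "term b": "SAEL:2", "term c": None}
toy_annotations = ["term a", "term b", "term c", "term d", "term a"]
coverage = mapping_coverage(toy_annotations, toy_mapping)
```

Counting per annotation instance instead of per distinct term would give a different number, so the choice should be stated alongside any reported percentage.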
Implementation
• MIAMExpress – ArrayExpress data capture tool uses SAEL
• Data in ArrayExpress will be mapped to SAEL
• Future submissions will use SAEL
• We will encourage users of the MGED ontology to use SAEL where appropriate
COBrA
XSPAN demo takes place on Wednesday, August 4th at 11.30am in room Alsh 2 – Albert Burger
Mapping source ontologies to SAEL
• Adult mouse anatomy, FMA mapped to date
• Mouse Developmental, GALEN, CBIL + others to do
• Using COBrA from the XSPAN project - www.xspan.org
– Allows manual mapping between ontologies
– Creates an OWL format mapping file
– Reads: DAGEdit flat file format, GO XML/RDF, GO RDFS and OWL
– Mappings available from www.sofg.org
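Once mappings exist, they can be held in a simple index keyed by SAEL id and source ontology. The sketch below is an assumption about one possible in-memory layout; it does not reproduce COBrA's OWL mapping file, and every identifier in it is an invented placeholder. An entry with no counterpart in a source ontology is recorded as `None`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, str, Optional[str]]

# (SAEL id, source ontology, source term id or None when no counterpart exists).
# All identifiers below are invented placeholders.
ROWS: List[Row] = [
    ("SAEL:1", "FMA", "fma-term-1"),
    ("SAEL:1", "AdultMouse", "ma-term-1"),
    ("SAEL:2", "FMA", "fma-term-2"),
    ("SAEL:2", "AdultMouse", None),
]


def build_index(rows: Iterable[Row]) -> Dict[str, Dict[str, Optional[str]]]:
    """Index mappings as {sael_id: {ontology: source_id_or_None}}."""
    index: Dict[str, Dict[str, Optional[str]]] = defaultdict(dict)
    for sael_id, ontology, source_id in rows:
        index[sael_id][ontology] = source_id
    return index


def null_mappings(index: Dict[str, Dict[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """List (sael_id, ontology) pairs recorded as having no counterpart."""
    return sorted(
        (sael_id, ontology)
        for sael_id, targets in index.items()
        for ontology, source_id in targets.items()
        if source_id is None
    )


index = build_index(ROWS)
```

Recording missing counterparts explicitly, rather than omitting the row, keeps "not yet mapped" distinguishable from "mapped and found absent".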
Web services
• A WSDL has been defined for SAEL:
– dev_stage, is_tissue, is_cell_type, is_organ, is_system, superclass, subclass, part, part_of, uri, definition, authority, history, name, synonym
• Source ontologies will provide a web service supporting queries and returning the attribute list
• WSDL has been tested vs. adult mouse anatomy, FMA and developmental mouse anatomy; it will be tested further
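The attribute list returned by a source ontology's web service could be mirrored on the client side by a record like the one below. Only the field names come from the slide; the types, defaults, the class name `AnatomyTermRecord` and the example values are assumptions, and the WSDL itself is not reproduced here.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class AnatomyTermRecord:
    """Attribute list from the slide; every type below is an assumption."""
    name: str
    uri: str
    authority: Optional[str] = None
    definition: Optional[str] = None
    history: Optional[str] = None
    dev_stage: Optional[str] = None
    is_tissue: bool = False
    is_cell_type: bool = False
    is_organ: bool = False
    is_system: bool = False
    synonym: List[str] = field(default_factory=list)
    superclass: List[str] = field(default_factory=list)
    subclass: List[str] = field(default_factory=list)
    part: List[str] = field(default_factory=list)
    part_of: List[str] = field(default_factory=list)


# Names of all attributes, in declaration order.
ATTRIBUTE_NAMES = [f.name for f in fields(AnatomyTermRecord)]

# Placeholder values; the URI is not a real endpoint.
record = AnatomyTermRecord(
    name="example organ",
    uri="http://example.org/term/1",
    is_organ=True,
)
```

Optional fields default to empty so that an ontology without, say, definitions or developmental stages can still return a valid record.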
Proposed web services architecture
• SAEL will be made available via a user interface and programmatically
• Querying multiple ontologies will be supported by the central SAEL web service
• The SAEL Portal provides a graphical user interface for researchers to look up the mappings between the SAEL list of anatomical entities and the target ontologies
• Will be implemented in 2 phases: SAEL portal first, then local web services
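The idea of a central service that queries several source ontologies can be sketched as a simple fan-out. This is not the proposed implementation: the remote web service calls are replaced by local stand-in functions, and the ontology names, term ids and the helpers `make_stub` and `query_all` are invented for illustration.

```python
from typing import Callable, Dict, List

# Each source ontology is represented by a lookup function that takes a
# SAEL id and returns matching term identifiers. In the proposed
# architecture these would be remote calls; here they are local stand-ins.
Lookup = Callable[[str], List[str]]


def make_stub(table: Dict[str, List[str]]) -> Lookup:
    """Build a stand-in service backed by an in-memory table."""
    return lambda sael_id: list(table.get(sael_id, []))


def query_all(sael_id: str, services: Dict[str, Lookup]) -> Dict[str, List[str]]:
    """Ask every registered source ontology and collect answers by name."""
    return {name: lookup(sael_id) for name, lookup in services.items()}


# Invented ontology names and term ids, for illustration only.
services = {
    "ontology_a": make_stub({"SAEL:1": ["a-term-1"]}),
    "ontology_b": make_stub({"SAEL:1": ["b-term-7", "b-term-9"]}),
}
result = query_all("SAEL:1", services)
```

A real central service would also need timeouts and error handling per source, since one unavailable ontology should not block the others.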
Future
• We welcome collaboration; mapping of other relevant ontologies proceeds, e.g. EVOC
• Building web services architecture, XSPAN
• Refining SAEL, completing mapping
• Inclusion in MGED Extended ontology (v1.2)
• Deeper integration of anatomy ontologies
• Decisions on handling of null mappings
• Modification of relationship types in current version
• Protégé version from Alan Rector
Acknowledgements
• SAEL workshop participants: Stuart Aitken, Albert Burger, Richard Baldock, Jonathan Bard, Duncan Davidson, Terry Hayamizu, Helen Parkinson, Alan Rector, Jeremy Rogers, Martin Ringwald, Cornelius Rosse, Chris Stoeckert
• John Gennari
• Niran Abeygunawardena, EBI Website, MIAMExpress implementation
• Funders: MRC-HGU, MGED, EU – TEMBLOR
• Jeremy Gollub – SMD, Naran Hirani/Tom Freeman – HGMP
Gratuitous Advertising – SOFG2
Bio-Ontologies Panel Discussion
• Michael Ashburner, Dept of Genetics, University of Cambridge
• Crispin Miller, Bioinformatics and Onco-informatics Group
• Jeremy Rogers, Medical Informatics Group, University of Manchester
• Barry Smith, Institute for Formal Ontology and Medical Information Science, University of Leipzig and Buffalo, State University of NY
SWOT analysis
• 131,000 hits on Google
[Charts: Google hit counts for (1) ontology, bioinformatics, swot, geek; (2) michael, crispin, barry, jeremy; (3) britney, kylie]
SWOT 2
• Strengths
• Weaknesses
– Orientated towards the internal aspects of bio-ontologies or an individual ontology
• Opportunities
• Threats
– Those factors external to bio-ontologies or an individual ontology
Panel Perspective
• Michael Ashburner
– “Pragmatic ontologies for real utility in biology”
• Jeremy Rogers
– “Pragmatic economic or user-led development risks the lowest common denominator or the mediocre: the Trabant or the Ford Mondeo. Theory-led development may ignore practicality and expense: the Formula One race car. How can such extremes be avoided in ontology engineering?”
• Crispin Miller
– “There is generally a difference between what we would like to say, and what our computers are capable of interpreting. How do we build ontology-based systems that can successfully resolve the tensions arising from this?”
• Barry Smith
– “Physics has pure mathematics as its formal backbone. What is the counterpart of pure mathematics in biology? Answer: formal ontology.”
Criteria for future chairs
• “combination of School Mistress sternness, eloquence, and opinionation” Phil Lord
• “ability to pick people out of the audience” Robert Stevens
Michael
• GO
– Strengths: wide uptake, community project, designed for a single problem – gene product attributes, development of ontology pragmatic (weakness?), open, to sw
– Weaknesses: pragmatic design and build, QC mechanism, and implementation issues in DB, lack of formalism, no idea it would be universal
– Opportunities: world domination, achieving greater integration across species
– Threats: long-term stability – academia, funding models, diversion into philoso
Crispin
• S – structure info
• W –
• O – abstraction level problematic
• T – knowing where to stop, ontology is linked to tools, cultural issues
Jeremy
• Strengths: community of eager users – scope, where to start
• Open source
• T – semantic web hype, succession of curators
Helen
• S – collaboration
• W – legacy data management, costs, maintenance, user uptake
• O – improved data retrieval, query, formalisation
• T – reinvention of the wheel
Barry Smith
• MA – “completion/failure”
• S – exists
• O – data, make it interoperable
• W – pragmatic decisions become entrenched
• T – fool's paradise – OWL is not expressive enough
• Expressive power – ‘cheating’ is-a overloading, need a top level of ontology – philosophical question, the way that instance is used
• Ontology – reality – knowledge – describing knowledge – assay ontology? Ontology of scientific experiments
• Perfection is the enemy of the good!
• Solution: ‘more precision’