Page 1: NIF Vocabulary Server

NIF Vocabulary Server

Maryann Martone, Ph. D.

Page 2: NIF Vocabulary Server

NIF Technical Team

•Perry Miller, Yale
•Luis Marenco, Yale
•Yuli Li, Yale
•Arun Rangarajun, Cal Tech
•Hans-Michael Muller, Cal Tech
•Sredevi Polavarum, George Mason
•Jeff Grethe, UCSD
•Brian Sanders, UCSD
•Vadim Astakhov, UCSD
•Amarnath Gupta, UCSD
•Xufei Qian, UCSD
•Bill Bug, UCSD
•Maryann Martone, UCSD

Page 3: NIF Vocabulary Server

Basic Architecture

•The same architecture and workflow applies to the registration process

Page 4: NIF Vocabulary Server

Role of NIF Terminologies

NIF terminologies provide a shared vocabulary for annotation of neuroscience data

NIF terminologies provide the shared semantics for accessing resources and data through the NIF interface

•Semantic enrichment of terms to enable more targeted and meaningful queries

Ultimately, NIF terminologies are critical for data and database interoperability

Page 5: NIF Vocabulary Server

Building the NIF Terminologies

NIF Basic

•Daniel Gardner held a series of workshops with neuroscientists to obtain sets of terms that are useful for neuroscientists

NIFSTD (NIF Standardized)

•Bill Bug built a set of expanded vocabularies using the structure of the BIRNLex and the import of existing terminological resources
•Provides enhanced coverage of domains in NIF Basic
•Provides coverage of domains not included in NIF but covered by existing resources, e.g., molecules
•Encoded in OWL/RDF
•Provides mapping to source terminologies, including NIF Basic
•Provides synonyms, lexical variants, abbreviations

Page 6: NIF Vocabulary Server

Registering a Resource to NIF

Level 1

•NIF Registry: high level descriptions from NIF vocabularies supplied by human curators

Level 2***

•Discovery mechanism for hidden content (Disco or SiteMaps.org)

Level 3

•Direct query of web accessible database
•Automated registration
•Mapping of database content to NIF vocabulary by human

***Not yet implemented

Page 7: NIF Vocabulary Server

Level 1 Registration

•Sites are entered by curators

•Annotation with NIF basic vocabulary + free text

•May be searched with NIFSTD terms

Page 8: NIF Vocabulary Server

Level 2 Registration

•Automated or semi-automated discovery and indexing of web sites
•Index of web sites registered to NIF registry
•Web content is indexed against the NIFSTD vocabularies
•Discovery mechanism planned (Luis)
•XML will utilize NIFSTD
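Indexing web content against a vocabulary can be sketched in a few lines. This is only an illustration under stated assumptions: the term identifiers, labels, synonyms and page URLs below are invented placeholders, and the planned NIF discovery mechanism may work quite differently.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary: term identifier -> preferred label plus synonyms.
VOCABULARY = {
    "term:0001": ["hippocampus", "hippocampal formation"],
    "term:0002": ["Purkinje cell", "Purkinje neuron"],
}

# Hypothetical page texts keyed by URL.
PAGES = {
    "http://example.org/a": "Patch-clamp recording from a Purkinje neuron.",
    "http://example.org/b": "Atlas of the hippocampus and the Purkinje cell layer.",
}

def build_index(pages, vocabulary):
    """Map each vocabulary term ID to the set of page URLs that mention it."""
    index = defaultdict(set)
    for url, text in pages.items():
        lowered = text.lower()
        for term_id, labels in vocabulary.items():
            for label in labels:
                # Whole-word, case-insensitive match on a label or synonym.
                pattern = r"\b" + re.escape(label.lower()) + r"\b"
                if re.search(pattern, lowered):
                    index[term_id].add(url)
                    break  # one matching label is enough for this page
    return dict(index)

index = build_index(PAGES, VOCABULARY)
```

Because the index is keyed by term identifier rather than by string, a search for any synonym of a term can return every page that mentions any of its labels.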

Page 9: NIF Vocabulary Server

Level 3: NIF Data Federation

•Allows deep query of database content through a single interface

•Limited number of resources registered for Phase 2: proof of concept and demonstration of deep search via database mediation

•Registration process:

•Create wrapper to allow remote NIF mediator query

•Map content to NIFSTD

•Semi-automatic process based on high-level mapping of fields and data values:

•e.g., SumsDB geography maps to NIFSTD regional part of brain
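A minimal sketch of the two-step mapping described above, first by field and then by data value. The dictionaries, the `map_record` helper and the sample values are assumptions made for illustration; they are not the Concept Mapping Tool's actual data structures.

```python
# Hypothetical field-level mapping: (source, field) -> NIFSTD category.
FIELD_MAP = {("SumsDB", "geography"): "regional part of brain"}

# Hypothetical value-level mapping inside a category: source value -> NIFSTD label.
VALUE_MAP = {
    "regional part of brain": {
        "hippocampus": "Hippocampus",
        "cerebellum": "Cerebellum",
    }
}

def map_record(source, record):
    """Return (mapped, unmapped) for one database record.

    mapped:   field -> (category, NIFSTD label)
    unmapped: field -> raw value left for a human curator
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        category = FIELD_MAP.get((source, field))
        if category is None:
            continue  # field was never registered for mapping
        label = VALUE_MAP.get(category, {}).get(value.strip().lower())
        if label is None:
            unmapped[field] = value  # needs human review
        else:
            mapped[field] = (category, label)
    return mapped, unmapped

mapped, unmapped = map_record("SumsDB", {"geography": "Hippocampus ", "species": "macaque"})
```

Values that fail the automatic lookup are kept rather than dropped, which is what makes the process semi-automatic: the curator only sees the residue.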

Page 10: NIF Vocabulary Server

Mapping to Level 3: Concept Mapping Tool

•Java webstart application

•Retrieves database schema + data from mediator registry

•Maps data to NIFSTD values

•Provides term mapping to mediator Term Index Source (TIS)

Page 11: NIF Vocabulary Server

Why is this done by a human at the moment?

•Abbreviations, ambiguous terms, non-standard names, e.g.,

•LPF: (**if this is mapped as an abbreviation to NIFSTD, then it wouldn’t be a problem)

•Anterior cingulate: Gyrus? Sulcus?

•Frontal subgyral = frontal subgyral white matter?

Page 12: NIF Vocabulary Server

Your definition-My definition?

Hippocampus (SUMS)= hippocampus (NIFSTD)?

•can’t tell just by the string; must look at the definition
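One way to keep a human in the loop is to let string matching only propose candidates and never accept them on its own. The sketch below assumes a tiny invented term list (identifiers, definitions and synonyms are placeholders) and a hypothetical `triage` helper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    term_id: str
    label: str
    definition: str
    synonyms: tuple = ()

# Hypothetical entries; identifiers, definitions and synonyms are placeholders.
NIFSTD_SAMPLE = [
    Term("ex:1", "Anterior cingulate gyrus", "placeholder definition A", ("anterior cingulate",)),
    Term("ex:2", "Anterior cingulate sulcus", "placeholder definition B", ("anterior cingulate",)),
    Term("ex:3", "Hippocampus", "placeholder definition C"),
]

def candidates(source_label, terms):
    """Terms whose label or synonym equals the source string, ignoring case."""
    key = source_label.strip().lower()
    return [
        t for t in terms
        if key == t.label.lower() or key in (s.lower() for s in t.synonyms)
    ]

def triage(source_label, terms):
    """Classify a source term; even a unique hit still goes to definition review."""
    found = candidates(source_label, terms)
    if not found:
        return "unmapped", found
    if len(found) > 1:
        return "ambiguous", found
    return "review-definition", found

status, hits = triage("Anterior cingulate", NIFSTD_SAMPLE)
```

A unique string match is reported as `review-definition` rather than as a confirmed mapping, so the curator compares the two definitions before the mapping is stored.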

Page 13: NIF Vocabulary Server

BIRNLex Components

[Diagram: BIRNLex and its component ontologies and domains]

•Common Anatomy Reference Ontology (CARO)
•Phenotypic Qualities (PATO)
•Subcellular Anatomy Ontology
•OBO Cell Type
•NIF Nerve Cell
•OBI
•NIF Molecule
•OBO Sequence
•Organism Taxonomy
•Sensory, Behavior, Cognition
•Disease
•Investigation
•Anatomy

Page 14: NIF Vocabulary Server

Building NIFSTD

•OBO Foundry principles and best practices
•NIFSTD is built from a set of modular ontologies
  •Anatomy: Neuronames (via BIRNLex)
  •Taxonomy: NCBI taxonomy (via BIRNLex)
  •Molecule: IUPHAR + PDSP Ki + SwissProt (neuro)
  •Cell: NIF (Senselab, Neuromorpho, CCDB)
  •Subcellular anatomy: GO + SAO
  •Disease: MeSH/UMLS + NINDS + OMIM (neuro)
  •Resource descriptors: NIF, NITRC, NCBC, OBI
  •Technique: NIF + Ontology for Biomedical Investigation (OBI)
  •Behavior: NIF, BIRN, BrainMap
  •Attributes: PATO
•Each is mapped to a unique identifier
•Single inheritance with minimal assignment of properties
•Each file is imported separately, but integrated through the Basic Formal Ontology into a single vocabulary
•Imported using manual, semi-automated and automated means
  •Degree of intervention depends on the vocabulary
  •At this point, a large degree of manual intervention is often necessary
  •Link back to source ID is maintained
•Encoded in OWL/RDF

Page 15: NIF Vocabulary Server

Adding to and amending the BIRN lexically-enhanced ontology

Page 16: NIF Vocabulary Server

Batch modifications (alpha)

[Screenshot: yes / no options]

Properties listed on the slide:

•prefLabel
•synonym
•abbrev
•acronym
•tax scientific name
•tax common name
•GENBANK common name
•NCBI BLAST name
•antiquated label
•misspelling
•IMSR standard name

Page 17: NIF Vocabulary Server

Batch modification example

IUPHAR V-gated Ion Channels (NIF)

Row = class

Col = related property (annotations & objects)

Parent prop: required to place in BIRNLex hierarchy
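The spreadsheet layout (one row per class, one column per property, with a mandatory parent) could be read along the following lines. The column names, the sample rows and the `read_batch` helper are assumptions for illustration, not the actual batch tool.

```python
import csv
import io

# Hypothetical sheet: the column names and rows are invented for this sketch.
SHEET = """label,parent,synonym,abbrev
Example channel A,Voltage-gated ion channel,Channel A,ChA
Example channel B,,Channel B,ChB
"""

def read_batch(text):
    """Split rows into accepted class records and labels rejected for lacking a parent."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["parent"].strip():
            # Without a parent the class cannot be placed in the hierarchy.
            rejected.append(row["label"])
            continue
        properties = {
            name: value
            for name, value in row.items()
            if name not in ("label", "parent") and value
        }
        accepted.append(
            {"label": row["label"], "parent": row["parent"], "properties": properties}
        )
    return accepted, rejected

accepted, rejected = read_batch(SHEET)
```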

Page 18: NIF Vocabulary Server

Batch modification example

IUPHAR V-gated Ion Channels (NIF)

Page 19: NIF Vocabulary Server

Citations & Mappings

Maintain link back to external knowledge source

For terms/concepts and for definitions

Mappings provide parsable representation of cross terminology synonymies

Page 20: NIF Vocabulary Server

Citations & Mappings: External IDs

•Generic: externalSourceId
•Specific (for common sources)
  •Neuroanatomy: neuronamesID/bamsID
  •Organism taxonomy: ncbiTaxID/itisID/gbifID/jaxMiceID/tacMiceID
  •Cells/Tissue: atccID
  •Disease: UmlsCui/MeSH
•URL templates
  •Use IDs to link to external source
  •URL references (when available) automatically add ref links into tools using BIRNLex - TIS, BONFIRE, etc.

Definition citations as well including URIs & publication references
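A URL template turns a stored external ID into a link back to the source. In this sketch the property names come from the slide, but the template URLs are placeholders on example.org and the `external_links` helper is hypothetical.

```python
# Placeholder templates keyed by external-ID property name (URLs are not real endpoints).
URL_TEMPLATES = {
    "ncbiTaxID": "https://example.org/taxonomy/{id}",
    "neuronamesID": "https://example.org/neuronames/{id}",
}

def external_links(annotations):
    """Build a link for every annotation whose property has a URL template."""
    links = {}
    for prop, value in annotations.items():
        template = URL_TEMPLATES.get(prop)
        if template is not None:
            links[prop] = template.format(id=value)
    return links

links = external_links({"ncbiTaxID": "10090", "atccID": "X-1"})
```

Properties without a template, such as `atccID` here, are simply skipped, which matches the "when available" qualification above.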

Page 21: NIF Vocabulary Server

Use Case: Cell Types

A cell type ontology already exists, but it has poor coverage of neuronal cells and is generally agreed by the community to be “problematic”

Senselab, CCDB, NeuroMorpho.org, NIF collated cell type terminologies

Produced master list on Excel spreadsheet with defined properties

•Neurotransmitter, anatomical location, morphology, molecular constituent, circuit type

Using Jena code written by BB, imported contents directly into Protégé OWL, matching strings against existing content, e.g., anatomy, molecules

Page 22: NIF Vocabulary Server

[Diagram: flat hierarchy in which each of the following cell types is-a Neuron]

•Cerebellar Granule Cell
•Purkinje Cell
•Photoreceptor Cell
•Chandelier Cell
•Cerebellar Basket Cell
•Double Bouquet Cell
•Globular Bushy Cell
•Medium Spiny Cell
•Pyramidal Cell
•Dentate Gyrus Granule Cell
•Olfactory Granule Cell
•Cortical Spiny Stellate Cell

[Diagram: the same cell types with intermediate classes between them and Neuron; the individual is-a arrows are not recoverable from the extracted text]

•GABAergic Neuron
•Spiny Cell
•Granule Cell
•Glutamatergic Neuron
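Once intermediate classes such as Granule Cell or GABAergic Neuron sit between a cell type and Neuron, a query can ask for every cell of a given kind. The sketch below uses a handful of is-a edges that are assumed for illustration only; they are not read off the slide's arrows.

```python
# Assumed is-a edges for illustration: child -> list of parents.
IS_A = {
    "Cerebellar Granule Cell": ["Granule Cell", "Glutamatergic Neuron"],
    "Purkinje Cell": ["GABAergic Neuron"],
    "Granule Cell": ["Neuron"],
    "Glutamatergic Neuron": ["Neuron"],
    "GABAergic Neuron": ["Neuron"],
}

def ancestors(term, is_a):
    """All classes reachable from term by following is-a edges upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def descendants(term, is_a):
    """All classes that have term among their ancestors."""
    return {child for child in is_a if term in ancestors(child, is_a)}

granule_ancestors = ancestors("Cerebellar Granule Cell", IS_A)
gabaergic_cells = descendants("GABAergic Neuron", IS_A)
```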

Page 23: NIF Vocabulary Server

Maintaining NIFSTD

Maintenance of NIFSTD at this point will probably require the use of a human curator, although several of the functions can be automated

Community can contribute to NIF Basic; human curator will be needed to migrate much of the content to NIFSTD

Page 24: NIF Vocabulary Server

Availability of NIFSTD

NIFSTD OWL file available from http://purl.org/nif/ontology/nif.owl

NIFSTD available through Bonfire (1 and 2) for programmatic access
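For readers who want to inspect an OWL file directly, the sketch below reads class labels and parents from a small inline RDF/XML sample using only the Python standard library. The sample document and its example.org IRIs are invented, the layout of the real NIFSTD file may differ (for instance through imports or other serialization choices), and a dedicated RDF library would be more robust.

```python
import xml.etree.ElementTree as ET

# Standard RDF, RDFS and OWL namespace IRIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Invented sample; not an excerpt from NIFSTD.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Neuron">
    <rdfs:label>Neuron</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/onto#PurkinjeCell">
    <rdfs:label>Purkinje Cell</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/onto#Neuron"/>
  </owl:Class>
</rdf:RDF>"""

def read_classes(rdf_xml):
    """Return {class IRI: {"label": str, "parents": [IRI, ...]}} for top-level classes."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    resource = "{%s}resource" % NS["rdf"]
    classes = {}
    for cls in root.findall("owl:Class", NS):
        iri = cls.get(about)
        if iri is None:
            continue  # skip anonymous classes
        label = cls.findtext("rdfs:label", default="", namespaces=NS)
        parents = [
            p.get(resource)
            for p in cls.findall("rdfs:subClassOf", NS)
            if p.get(resource)
        ]
        classes[iri] = {"label": label, "parents": parents}
    return classes

classes = read_classes(SAMPLE)
```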

Page 25: NIF Vocabulary Server

Bonfire

•NIF vocabularies are served by the vocabulary server built by BIRN: Bonfire
•Oracle database
•Cross mappings between different vocabularies
•Basic graph queries (neighborhood, shortest path)
•Web services were developed for NIF
•Based on the structure of UMLS
•User interface for graph visualization and queries (not planned for NIF delivery)
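The two basic graph queries named for Bonfire, neighborhood and shortest path, are both breadth-first searches. The sketch below runs them over a small invented term graph in memory; it says nothing about how Bonfire implements them on its Oracle database.

```python
from collections import deque

# Invented, undirected term relations for illustration.
TERM_EDGES = [
    ("Purkinje Cell", "Neuron"),
    ("Neuron", "Cell"),
    ("Purkinje Cell", "Cerebellum"),
    ("Cerebellum", "Brain"),
]

def adjacency(edges):
    """Build an undirected adjacency map from a list of term pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def neighborhood(graph, start, radius):
    """Terms within `radius` edges of `start`, excluding `start` itself."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand beyond the requested radius
        for nxt in graph.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return set(distance) - {start}

def shortest_path(graph, start, goal):
    """Shortest list of terms from start to goal, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

graph = adjacency(TERM_EDGES)
```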

Bonfire 2

•Optimized for NIF vocabularies
•Postgres RDBMS + ontology access functions that we have built
  •e.g., given a term, produce its ancestry graph by following the edge label (subclass-of OR part-of)
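The ancestry query can be sketched as an upward traversal that follows only edges with an allowed label. The edge list and the `ancestry_graph` function are assumptions for illustration; the actual Bonfire 2 access functions run inside Postgres and are not shown in the slides.

```python
# Invented labelled edges: (child, label, parent).
LABELLED_EDGES = [
    ("Purkinje Cell", "subclass-of", "Neuron"),
    ("Purkinje Cell", "part-of", "Cerebellar Cortex"),
    ("Cerebellar Cortex", "part-of", "Cerebellum"),
    ("Purkinje Cell", "has-neurotransmitter", "GABA"),
    ("Neuron", "subclass-of", "Cell"),
]

def ancestry_graph(term, edges, labels=("subclass-of", "part-of")):
    """Collect every edge reachable upward from term through the allowed labels."""
    result = []
    visited = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, label, parent in edges:
            if child == current and label in labels:
                result.append((child, label, parent))
                if parent not in visited:
                    visited.add(parent)
                    frontier.append(parent)
    return result

full = ancestry_graph("Purkinje Cell", LABELLED_EDGES)
only_subclass = ancestry_graph("Purkinje Cell", LABELLED_EDGES, labels=("subclass-of",))
```

Passing a narrower `labels` tuple restricts the result to a single relation, so the same function serves both the "is a" tree and the mixed subclass-of/part-of ancestry.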

Page 26: NIF Vocabulary Server

NIF Application Architecture for OntoQuest (Bonfire 2)

[Architecture diagram; component labels recovered from the slide:]

•Web Client
•Application Logic
•App. Configuration
•Fed. DB Registry
•OntoQuest
•Ontology Database
•Ontologies
•Term Mapper and Indexer
•Text Engine
•Lucene Index
•Docs
•XML NIF Registry
•Neuroscience Web sites
•External Database-1 (label repeated three times)
•External web sites

Page 27: NIF Vocabulary Server

What’s next

NIFSTD: Comprehensive “is a” hierarchy, but relations sparse - e.g., “part of”, “binds ligand”, “sequence of”, etc.

Continue to build pipeline from loosely structured to formal ontology

•Continue to add domains
•Add relationships and definitions
•Generate additional hierarchies
•Incorporate more of the semantics into the NIF search

Page 28: NIF Vocabulary Server

Evolution of Terminologies

•NIF Basic vocabulary

•Contributed by panels of experts

•Coarse granularity but broad coverage

•Loose hierarchy

•XML

•NIF STD

•Imports existing terminologies developed by other communities

•Modular design

•Normalizes structure according to Basic Formal Ontology (BFO)

•Creates single inheritance “is a” tree

•Provides mapping between NIF and NIFSTD

•Provides synonyms, abbreviations and lexical variants

•OWL/RDF

•NIF Plus

•Relates classes through “part of” and other OBO relations

•Consistent human and machine-readable definitions

NIF Phase I and II

Page 29: NIF Vocabulary Server

Current Status and Future Work

•Prototype interfaces built upon Bonfire I and II
  •NIFSTD 1.0 in Bonfire 1
  •NIFSTD 1.1 in Bonfire 2
•Will update Bonfire 1 content after this demonstration
•Implementation and testing of vocabulary services using Bonfire 2
•Better use of lexical variants, synonyms etc.
•Mapping of NIF Registry and NIF data federation with NIFSTD
•All resources registered will mark up more content
•Coverage of behavior (sensory, motor) and behavioral assessments will be added
•More lexical variants will be used in searches
•Improved access to annotation properties through Concept Mapper