The MGED Ontology: Providing Descriptors for Microarray Data Trish Whetzel Department of Genetics...


The MGED Ontology: Providing Descriptors for Microarray Data

Trish Whetzel

Department of Genetics

Center for Bioinformatics

University of Pennsylvania

Acknowledgements

• CBIL
  – Chris Stoeckert
  – Angel Pizarro
  – Elisabetta Manduchi
• EBI
  – Helen Parkinson
  – Susanna Sansone
• TIGR
  – Joe White
• Stanford
  – Cathy Ball
• NCICB
  – Gilberto Fragoso
  – Liju Fan
  – Mervi Heiskanen
• Others
  – Paul Spellman
  – John Matese
  – Helen Causton
• Ontology Mailing List

MGED Society

• International organization
• Comprised of biologists, computer scientists, and data analysts
• Aims to facilitate the sharing of functional genomics data generated by microarray and proteomics experiments
  – Establish standards for microarray data annotation
  – Create microarray databases
  – Promote sharing of high-quality, well-annotated data

www.mged.org

MGED Standardization Efforts

• MIAME
  – The formulation of the minimum information required about a microarray experiment in order to interpret and verify the results.
• MAGE
  – The establishment of a data exchange format (MAGE-ML) and an object model (MAGE-OM) for microarray experiments.
• Ontology Working Group
  – The development of an ontology to describe microarray experiments and in particular the biological material (biomaterial) used in these experiments.
• Transformations
  – The development of recommendations regarding microarray data transformations and normalization methods.

Microarray Information to be Shared

Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

MGED Ontology (MO)

• Purpose
  – Provide standard terms for the annotation of microarray experiments
• Benefits
  – Unambiguous description of how the experiment was performed
  – Structured queries can be generated
• MGED Ontology concepts derived from the MIAME guidelines/MAGE-OM

MGED Ontology development

http://mged.sourceforge.net/ontologies/MGEDontology.php

• OilEd
• File formats
  – HTML file
  – DAML file
  – NCI DTS Browser

MGED Ontology Class Hierarchy

• MGED CoreOntology
  – In synch with MAGE v.1
  – Stable class structure
• MGED ExtendedOntology
  – Classes for additional terms as the usage of MO expands for genomics technologies

Relationship of MO to MAGE-OM

• MO class hierarchy follows that of MAGE-OM
  – Association to OntologyEntry
• MO provides terms for these associations by:
  – Instances internal to MO
  – Instances from external ontologies
• Take advantage of existing ontologies
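The two ways of supplying a term can be sketched in a few lines of Python. This is a simplified illustration only, not the MAGE-OM definition of OntologyEntry: the field names `source` and `accession`, the helper `is_external`, and the example values are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyEntry:
    """Simplified stand-in for a MAGE-OM OntologyEntry (illustrative fields)."""
    category: str                    # MO class the term belongs to
    value: str                       # the term itself
    source: str = "MO"               # ontology that supplies the term (assumed field)
    accession: Optional[str] = None  # identifier in an external ontology, if any

    def is_external(self) -> bool:
        # True when the term is not an instance defined inside MO itself.
        return self.source != "MO"


# Term supplied by an instance internal to MO (example value is illustrative).
biosource_type = OntologyEntry(category="BioSourceType", value="frozen_sample")

# Term supplied by an external ontology, referenced by its accession.
organism = OntologyEntry(
    category="Organism",
    value="Mus musculus",
    source="NCBI Taxonomy",
    accession="10090",
)

for entry in (biosource_type, organism):
    origin = "external" if entry.is_external() else "internal to MO"
    print(f"{entry.category}: {entry.value} ({origin})")
```

Keeping the category separate from the value means one annotation slot can accept either kind of term without changing the structure around it.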

Relationship of MO and MAGE-OM

MO and References to External Ontologies


Desirable Microarray Queries

• Return all experiments with species X examined at developmental stage Y
  – Sort by platform type
  – Which are untreated? Treated?
    • Treated with what compound?
    • How comparable are these results?
• These questions can be asked of all experiments annotated using the MGED Ontology.
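As a rough illustration of why controlled terms make such queries possible, the sketch below filters a handful of annotation records in Python. The record layout, the field names, the function names `find_experiments` and `split_by_treatment`, and the sample data are all hypothetical; a real system would query a MAGE-based database rather than an in-memory list.

```python
# Hypothetical annotation records; every value is made up for illustration.
experiments = [
    {"id": "E1", "organism": "Mus musculus", "developmental_stage": "embryo",
     "platform": "spotted cDNA array", "compound": None},
    {"id": "E2", "organism": "Mus musculus", "developmental_stage": "embryo",
     "platform": "in situ oligonucleotide array", "compound": "dexamethasone"},
    {"id": "E3", "organism": "Homo sapiens", "developmental_stage": "adult",
     "platform": "spotted cDNA array", "compound": None},
]


def find_experiments(records, organism, stage):
    """Return experiments for one species and stage, sorted by platform type."""
    hits = [
        r for r in records
        if r["organism"] == organism and r["developmental_stage"] == stage
    ]
    return sorted(hits, key=lambda r: r["platform"])


def split_by_treatment(records):
    """Group experiment ids into treated and untreated; keep the compound used."""
    groups = {"treated": [], "untreated": []}
    for r in records:
        if r["compound"]:
            groups["treated"].append((r["id"], r["compound"]))
        else:
            groups["untreated"].append(r["id"])
    return groups


mouse_embryo = find_experiments(experiments, "Mus musculus", "embryo")
by_treatment = split_by_treatment(mouse_embryo)
print([r["id"] for r in mouse_embryo])
print(by_treatment)
```

Exact string matching works here only because every record uses the same term for the same concept, which is the consistency a shared ontology is meant to provide.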

MO and Structured Queries

Future Work

• Convert to OWL
  – W3C standard ontology language
  – Expressivity
• Add terms to describe
  – Data transformation and normalization methods
  – Protocol types used by the Protein Data Bank

Future Work cont.

• Expand the MGED Extended Ontology by adding classes and terms to describe new domains and technologies
  – Toxicogenomics, ecotoxicogenomics and pharmacogenomics …
    • A public forum for developing internationally compatible and public infrastructure for reporting array-based toxicogenomics.
  – Proteomics Standards Initiative
    • Defines community standards for data representation in proteomics to facilitate data comparison, exchange and verification.

Links

• mged.org
• http://mged.sourceforge.net/ontologies/MGEDontology.php

The Computational View of Microarray Information

Need an ontology to unambiguously represent this information.

Issues to Discuss

• Burning Issues
  – Developing MO in synch with related efforts (MAGE-OM v.2.0)
  – Use/presentation in annotation forms
  – Coverage of other technologies and biological domains
• Flame retardant structure
  – ExtendedOntology
    • Space to add new classes, terms and their relationship to one another


Microarray Information to be Shared

[Figure labels: Experiment, Sample, RNA Extract, Labeled nucleic acid, Protocols, Hybridizations, Genes, Array Design, Microarray, Gene expression data matrix, normalization, integration]
