Approaches for Integration of multiple ‘Omic’ Data Dmitry Grapov, PhD

Omic Data Integration Strategies


DESCRIPTION

Discussion of 'Omic' (e.g. genomic, transcriptomic, proteomic and metabolomic) data integration approaches, including:
• Gene ontology (GO) enrichment
• Genes + metabolites functional enrichment
• Gene + protein + metabolite network mapping


Page 1: Omic Data Integration Strategies

Approaches for Integration of multiple ‘Omic’ Data

Dmitry Grapov, PhD

Page 2: Omic Data Integration Strategies

Examples

Nature Reviews Genetics 15, 107–120 (2014) doi:10.1038/nrg3643

FBA = flux-balance analysis

• Topological enrichment can give a broad overview of impacted genes, proteins and metabolites

• Changes in biochemical domains corroborated by multi-Omic data sets can be used to identify robust candidates responsible for phenotypic variation between comparisons

• Gene-gene, protein-protein or gene-protein interaction networks can be used to deconvolute ambiguous metabolic pathways

Page 3: Omic Data Integration Strategies

Common Approaches

Nature Reviews Genetics 15, 107–120 (2014) doi:10.1038/nrg3643

Page 4: Omic Data Integration Strategies

Biochemical Domain Enrichment Analysis

• Genes/Proteins: DAVID, AmiGO, etc. → GO terms

• Genes/Proteins + Metabolites: IMPaLA, Integrated Molecular Pathway Level Analysis (http://impala.molgen.mpg.de/) → pathways

1. Classify all species into domains (e.g. biological process, pathway, etc.)

2. Calculate the probability of observing the changes in species by chance
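The two steps above can be sketched in base R. This is a minimal illustration, not the procedure used by DAVID or IMPaLA: the pathway annotations, the metabolite names and the list of changed species are all hypothetical, and the Benjamini-Hochberg adjustment at the end is an added assumption (a common follow-up when many domains are tested).

```r
# Step 1: classify species into domains.
# Hypothetical annotations: pathway name -> member metabolites.
pathways <- list(
  glycolysis = c("glucose", "G6P", "F6P", "pyruvate", "lactate"),
  tca_cycle  = c("citrate", "succinate", "fumarate", "malate", "pyruvate"),
  urea_cycle = c("ornithine", "citrulline", "arginine", "urea")
)

# Hypothetical background (all annotated species) and significantly changed subset.
background <- unique(unlist(pathways))
changed    <- c("glucose", "G6P", "pyruvate", "lactate", "urea")

# Step 2: probability of seeing at least `hits` changed species in a domain
# if the changed species were drawn at random from the background.
enrich <- function(members, changed, background) {
  members <- intersect(members, background)
  hits    <- length(intersect(members, changed))
  p <- phyper(hits - 1, length(members),
              length(background) - length(members),
              length(changed), lower.tail = FALSE)
  c(hits = hits, size = length(members), p.value = p)
}

res <- as.data.frame(t(sapply(pathways, enrich,
                              changed = changed, background = background)))
res$p.adj <- p.adjust(res$p.value, method = "BH")  # assumed multiple-testing step
res[order(res$p.value), ]
```

With these toy inputs the glycolysis pathway ranks first because four of its five members are in the changed list.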

Page 5: Omic Data Integration Strategies

IMPaLA: Gene + Metabolite pathway enrichment

Challenges:

• Removal of redundant information

• Preference of specific vs. generic pathways

• Visualization of gene + metabolite + pathway relationships

Page 6: Omic Data Integration Strategies

Determining significance of the enrichment: Hypergeometric Test

How to calculate statistics to determine enrichment?

hit.num = 51    # number of significantly changed pathway metabolites
set.num = 1455  # number of metabolites in pathway
full = 3358     # all possible metabolites in organism
q.size = 72     # number of significantly changed metabolites

phyper(hit.num-1, set.num, full-set.num, q.size, lower.tail=FALSE)
# = 1.717553e-06
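As a cross-check on the call above, the same upper-tail probability P(X >= hit.num) can be written as an explicit sum of hypergeometric point probabilities, or obtained from a one-sided Fisher's exact test on the corresponding 2x2 table. The sketch below reuses the slide's numbers; the table layout and the variable names `p.phyper`, `p.sum`, `p.fisher` and `tab` are illustrative choices, not part of the original.

```r
hit.num <- 51    # significantly changed pathway metabolites
set.num <- 1455  # metabolites in pathway
full    <- 3358  # all possible metabolites in organism
q.size  <- 72    # significantly changed metabolites

# Upper tail as on the slide: P(X >= hit.num)
p.phyper <- phyper(hit.num - 1, set.num, full - set.num, q.size,
                   lower.tail = FALSE)

# Same tail as a sum of point probabilities P(X = k), k = hit.num..q.size
p.sum <- sum(dhyper(hit.num:min(q.size, set.num),
                    set.num, full - set.num, q.size))

# Same tail from a one-sided Fisher's exact test on the 2x2 table
tab <- matrix(c(hit.num,          set.num - hit.num,
                q.size - hit.num, (full - set.num) - (q.size - hit.num)),
              nrow = 2,
              dimnames = list(c("changed", "unchanged"),
                              c("in.pathway", "not.in.pathway")))
p.fisher <- fisher.test(tab, alternative = "greater")$p.value

all.equal(p.phyper, p.sum)
all.equal(p.phyper, p.fisher)
```

The `hit.num - 1` in the `phyper` call matters: with `lower.tail = FALSE`, `phyper(q, ...)` returns P(X > q), so subtracting one includes the observed count in the tail.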

Page 7: Omic Data Integration Strategies

GO Enrichment analysis: Hierarchy of Redundancy (parents)

• GO is an ontology wherein enrichment is often shared by children and parents.

• Difficult to co-visualize term hierarchy and gene to term mapping

Page 8: Omic Data Integration Strategies

Enrichment networks: Removing the Hierarchy of Redundancy

Workflow:

1. If two nodes share all genes, drop least enriched (highest p-value)

2. Filter terms based on enrichment

3. Display term to gene/protein relationships as edges in a network

4. Map direction of change in genes/proteins to network node attributes
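The workflow above can be sketched in base R as follows. All inputs are hypothetical (term IDs, gene names, p-values, fold changes), "share all genes" is interpreted here as having identical gene sets, and the 0.05 cutoff is an assumed filter rather than a value from the slides.

```r
# Hypothetical enrichment results and term-to-gene annotations.
terms <- data.frame(term    = c("GO:A", "GO:B", "GO:C", "GO:D"),
                    p.value = c(0.001, 0.004, 0.020, 0.300),
                    stringsAsFactors = FALSE)
genes <- list("GO:A" = c("g1", "g2", "g3"),
              "GO:B" = c("g3", "g2", "g1"),  # same gene set as GO:A (e.g. its parent)
              "GO:C" = c("g3", "g4"),
              "GO:D" = c("g5"))
# Hypothetical log2 fold changes between two groups.
log2fc <- c(g1 = 1.2, g2 = -0.8, g3 = 0.5, g4 = -1.5, g5 = 0.1)

# 1. Among terms with identical gene sets, keep the most enriched (lowest p-value).
terms$key <- unname(sapply(genes[terms$term],
                           function(g) paste(sort(g), collapse = ";")))
terms <- terms[order(terms$p.value), ]
terms <- terms[!duplicated(terms$key), ]

# 2. Filter terms based on enrichment (assumed cutoff).
terms <- terms[terms$p.value <= 0.05, ]

# 3. Term-to-gene relationships as an edge list.
edges <- do.call(rbind, lapply(terms$term, function(tm) {
  data.frame(term = tm, gene = genes[[tm]], stringsAsFactors = FALSE)
}))

# 4. Direction of change as a node attribute.
nodes <- data.frame(gene = unique(edges$gene), stringsAsFactors = FALSE)
nodes$direction <- ifelse(unname(log2fc[nodes$gene]) > 0,
                          "increased", "decreased")
```

The resulting `edges` and `nodes` tables are in the form most network tools accept as edge and node attribute lists.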

Page 9: Omic Data Integration Strategies

Enrichment Network: Mapping of parents through children

GO enrichment network displays:

• gene names associated with each overrepresented term

• Fold change in protein expression between two groups (can be extended to k > 2 groups)

• Can display enrichment p-value for each term

• Can incorporate metabolites as children of genes

Page 10: Omic Data Integration Strategies

Empirical Networks

• Correlation based networks (CN) (simple, tendency to hairball)

• Gaussian graphical model (GGM) or partial correlation based networks (advanced, preference of direct over indirect relationships)

• Robustness increases with sample size

doi:10.1007/978-1-4614-1689-0_17
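The difference between the two network types can be illustrated on simulated data. The sketch below is an assumption-laden toy, not the method of the cited chapter: three hypothetical variables form a chain m1 -> m2 -> m3, and partial correlations are computed by inverting the correlation matrix, which requires more samples than variables.

```r
set.seed(1)
n <- 500
# Simulated chain: m1 and m3 are related only through m2.
m1 <- rnorm(n)
m2 <- m1 + rnorm(n, sd = 0.5)
m3 <- m2 + rnorm(n, sd = 0.5)
X  <- cbind(m1 = m1, m2 = m2, m3 = m3)

# Correlation network: every pair is strongly correlated (the "hairball").
cn <- cor(X)

# Partial correlations from the inverse correlation (precision) matrix:
# pcor_ij = -P_ij / sqrt(P_ii * P_jj)
prec <- solve(cn)
pcor <- -prec / sqrt(outer(diag(prec), diag(prec)))
diag(pcor) <- 1

round(cn, 2)
round(pcor, 2)
```

In this toy example the m1-m3 correlation is large while the m1-m3 partial correlation is close to zero, so a partial correlation network would keep only the direct edges m1-m2 and m2-m3.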

Page 11: Omic Data Integration Strategies

Topological Enrichment Networks

http://pubchem.ncbi.nlm.nih.gov//score_matrix/score_matrix.cgi

http://www.genome.jp/dbget-bin/www_bget?rn:R00975
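The first link above is to a PubChem service that returns pairwise structure similarity scores. As a hedged sketch of how such a score matrix might be turned into network edges, the example below thresholds a small made-up similarity matrix; the compound IDs, the scores and the 0.7 cutoff are all assumptions for illustration.

```r
# Hypothetical pairwise structural similarity scores (0 to 1).
ids <- c("cpd1", "cpd2", "cpd3", "cpd4")
sim <- matrix(c(1.00, 0.85, 0.30, 0.10,
                0.85, 1.00, 0.72, 0.20,
                0.30, 0.72, 1.00, 0.15,
                0.10, 0.20, 0.15, 1.00),
              nrow = 4, dimnames = list(ids, ids))

cutoff <- 0.7  # assumed similarity threshold

# Keep each unordered pair once (upper triangle) when its score passes the cutoff.
idx <- which(upper.tri(sim) & sim >= cutoff, arr.ind = TRUE)
edges <- data.frame(source     = rownames(sim)[idx[, "row"]],
                    target     = colnames(sim)[idx[, "col"]],
                    similarity = sim[idx],
                    stringsAsFactors = FALSE)
edges
```

Edges from other sources, such as substrate-product pairs from reaction databases, could be appended to the same edge list with an extra column naming the edge type.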

Page 12: Omic Data Integration Strategies

Topological Enrichment Networks: genes + proteins + metabolites

Page 13: Omic Data Integration Strategies

MetaMapR: Biological network generator

https://github.com/dgrapov/MetaMapR

Page 14: Omic Data Integration Strategies

metabolomics.ucdavis.edu

This research was supported in part by NIH 1 U24 DK097154