85
Molecular Approaches for Translational Bioinformatics Maricel G. Kann, PhD Associate Professor University of Maryland Baltimore County

Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Molecular Approaches for

Translational Bioinformatics

Maricel G. Kann, PhDAssociate Professor

University of Maryland Baltimore County

Page 2: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Outline

Introduction Translational Bioinformatics

Biology Refresher

Integration of data Example of Protein Interactions and Disease

Grand Challenges: Analysis of Human Variants Databases

Tools for Predicting Deleterious Variants

A protein-domain approach for the analysis of human variants

Page 3: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Translational Bioinformatics1. Translational bioinformatics is an emerging field addressing the computational challenges in

biomedical research and the analysis of the vast amount of clinical data generated from it. It is difficult to define such a broad field and, due to the inherently interdisciplinary nature of the research, impossible to detach translational bioinformatics from other related fields.

2. “Bio” Informatics: concerns basic biological information. What is translational? Moving discoveries and innovations from lab to bedside

3. Challenge:While bioinformatics methodologies have been used to enable biological discoveries for decades, here the end product has to be translational, or applying to human health and disease.

AMIA: The development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data, and genomic data in particular, into proactive, predictive, preventive, and participatory health. Translational bioinformatics includes research on the development of novel techniques for the integration of biological and clinical data and the evolution of clinical informatics methodology to encompass biological observations. The end product of translational bioinformatics is newly found knowledge from these integrative efforts that can be disseminated to a variety of stakeholders, including biomedical scientists, clinicians, and patients.

1. Kann, MG. Advances in translational bioinformatics: computational approaches for the hunting of disease genes. Brief Bioinf 11: 96 (2010)2. Altman, RB. Introduction to Translational Bioinformatics Collection, Ed. MG Kann & F Lewitter. PLOS Comp Biol (2012)3. Butte, AJ. Translational bioinformatics: coming of age. J Am Med Inform Assoc ;15:709 (2008)

Page 4: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

4

Page 5: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Sarkar IN. J Trans Med. 2010 Feb 26;8:22

Bench Bedside PolicyCommunity

Innovation Validation Adoption

Molecules & Cells

Tissues & Organs

Individuals Populations

T1 T2 T3

Bio-

informatics

Imaging

Informatics

Clinical

Informatics

Public Health

Informatics

Translational

Bioinformatics

Clinical Research

Informatics

Page 6: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Quick biology refresher

Image adapted from wikipedia

DNA

RNA

Protein

Interaction Networks

Transcriptomics

Genomics

Proteomics

Interactomics

Metabolomics

Page 7: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Genomics era

Transcriptomics

Proteomics

Interactomics

Metabolomics

Technology Data

Page 8: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Kilo

Mega

Giga

Tera

Peta

Exa

Zetta

Page 9: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human
Page 10: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Data Integration

What is the purpose of integration?

Will integrated data be interpretable?

How will you validate integration accuracy?

Page 11: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Why Integrate Data?

Provide complete picture

Systems view

Validate evidence through corroboration

Genotype-phenotype correlation

Standardization of Data

Machine readable (via XML)

Page 12: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

12

“Thanks to system biology we now have a

clear picture of complex diseases”

Page 13: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Image adapted from wikipedia

DNA

RNA

Protein

Interaction Networks

Transcriptomics

Genomics

Proteomics

Interactomics

Metabolomics

Page 14: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Importance of Protein Interactions

Aloy and Russell, 2006

Dobson, Nature 432, 2004

Page 15: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Experimental Methods (high-throughput)

I. Different Methods

i. Yeast two hybrid (Y2H)

ii. Mass spectroscopy (MS)

iii. Gene coexpression

iv. Synthetic lethality method

See Review by Shoemaker and Panchenko, PLOS CB, 2007

Page 16: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Protein interaction databases

HPRD: Human Protein Reference Database

One of the largest repositories for protein interactions

http://www.hprd.org/

Maricel G. Kann16

Page 17: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Protein-Protein interaction (PPI) data can be used

to map networks of interactions

A PPI network map…

Page 18: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Protein-Protein interaction (PPI) data can be used

to map networks of interactions

Protein A

Protein B

Nodes represent proteins

In PPI network graphs…

Connecting lines represent their interactions

Page 19: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Using PPI networks to study disease

• Neurodegenerative genetic disorder (enlargement of the lateral ventricle)

• Affects muscle coordination • Leads to cognitive decline and psychiatric problems

Example: Huntington’s disease

http://cueflash.com/decks/BASAL_GANGLIA_-_64

Page 20: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Using PPI networks to study disease

1872

1908

1993

2004

http://en.wikipedia.org

Consider how our understanding of Huntington’s disease (HD) has evolved

First scientific report by George

Huntington

Page 21: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Using PPI networks to study disease

1872

1908

1993

2004

http://en.wikipedia.org

Mendelianpatterns of inheritance

were documente

d

First scientific report by George

Huntington

Consider how our understanding of Huntington’s disease (HD) has evolved

Page 22: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Using PPI networks to study disease

1872

1908

1993

2004

https://www.stanford.edu/group/hopes/cgi-bin/wordpress/2010/06/the-heat-shock-response/

Mendelianpatterns of inheritance

were documente

d

First scientific report by George

Huntington

Learned HD was caused by the repeat expansion of a trinucleotide in the Huntingtin (Htt) gene causes aggregation

leads to neuronal degeneration.

Consider how our understanding of Huntington’s disease (HD) has evolved

Page 23: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Using PPI networks to study disease

1872

1908

1993

2004

Mendelianpatterns of inheritance

were documente

d

First scientific report by George

Huntington

Learned HD was caused by the repeat expansion of a trinucleotide in the Huntingtin (Htt) gene causes aggregation

leads to neuronal degeneration.

Consider how our understanding of Huntington’s disease (HD) has evolved

>100 years later key disease-causing gene in HD known. But the

mechanism for Httaggregation remained

unknown

Page 24: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

• In 2004 Goehler et al. mapped all the PPIs in HD• Found that interaction between Htt and GIT1, a GTPase-

activating protein, mediates Htt aggregation.• In 2006 GTI1 shown as a target for therapeutic strategies

against HD

Using PPI networks to study disease

Consider how our understanding of Huntington’s disease (HD) has evolved

Page 25: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Image adapted from wikipedia

DNA

RNA

Protein

Interaction Networks

Transcriptomics

Genomics

Proteomics

Interactomics

Metabolomics

Page 26: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Grand Challenges in ScienceWhat don’t we know?”

July 1st, 2005, Science Journal

Question 1.

What Is the universe made of?

(…)

Question 4.

To what extent are genetic variation and

personal health linked?

(…)

Page 27: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Analysis of Disease Variants

Collect human variants from existing data or computational methods

Case/Control studies for finding disease biomarkers

Using bioinformatics for disease diagnosis, prognosis, and treatment and provide information about molecular mechanism in disease

Page 28: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Clinvar

Goal: freely

accessible, public archive of reports of the relationships among human variations and phenotypes along with supporting evidence

Page 29: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Clinvar

Page 30: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Coverage comparison of genes, exonic

missense variants, and indels

Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human Variants.. J. Mol. Biol. (2013)

Page 31: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Predictors of Deleterious Variants

Peterson et al. JMB, (2013)

Page 32: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Analysis of Disease Variants

Collect human variants from existing data or computational methods

Case/Control studies for finding disease biomarkers

Using bioinformatics for disease diagnosis, prognosis, and treatment and provide information about molecular mechanism in disease

Page 33: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human
Page 34: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

The European Genome-Phenome

Archive

34

Page 35: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

35

The European Genome-Phenome

Archive

Page 36: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

dbGaP

Page 37: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

TCGA

Page 38: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human
Page 39: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

UK10K

Page 40: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Analysis of Disease Variants

Collect human variants from existing data or computational methods

Case/Control studies for finding disease biomarkers

Using bioinformatics for disease diagnosis, prognosis, and treatment and provide information about molecular mechanism in disease

Page 41: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Biomarkers

Definition: A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention

Can be one or more genes, proteins, metabolites, image features, etc.

Page 42: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Uses of Biomarkers Pharmaceutical research

Identify drug targets by comparing expression profile of drug-treated cells with those of cells containing mutations in genes encoding known drug targets

Disease Diagnoses and Treatments Distinguish morphologically similar cancers Therapy potential Pharmacogenomics

Prognosis Determining how aggressive to be with treatment

options

Risk Profiling Predicting risks of conditions with strong biomarker

correlation Direct to consumer testing

Page 43: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Outline

Introduction Translational Bioinformatics

Biology Refresher

Integration of data Example of Protein Interactions and Disease

Grand Challenges: Analysis of Human Variants Databases

Tools for Predicting Deleterious Variants

A protein-domain approach for the analysis of human variants

Page 44: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

44

Bioinformatics and Computational Biology Protein Domain Recognition and metrics to compare different

methods: SALTO, Kann et al. Bioinformatics 2005 & GLOBAL, Kann et al. NAR, 2007, Carol et al., Bioinformatics 2010

Systems Biology Computational approaches for predicting and reconstructing protein and protein

domain interactions networks: Jothi, Kann & Przytycka, Bioinformatics, 2005, Kann et al. Proteins, 2007, Kann Brief. in Bioinf. 2007, Kann et al. JMB 2009

Translational Bioinformatics Kann, Brief. in Bioinf. 2010, Capriotti et al.Brief. in Bioinf. 2012, Gonzalez & Kann, PLoS-CB (2012), Translational Bioinformatics, PLoS-CB, Ed. Kann (2012)

Domain-based analysis of disease mutations DMDM: Peterson et al, Bioinformatics 2010

DS-Score: Peterson et al, JAMIA 2012

Domain Network of Alzheimer’s and Parkinson: Regan et al, JAMIA 2012

Cancer domain Landscapes: Nehrt et al, BMC-Genomics 2012

Yeast & Human Domain mutations: Peterson et al, BMC-Genomics 2013

Text-mining of molecular information from biomedical literature Disease mutations: Doughty et al, Bioinformatics 2010

Pharmacogenomics: Rance et al, JBI 2012

Crowdsourcing: Burger et al, Lecture Notes in CS, Springer 2012

Protein-protein interactions: Doughty et al, in preparation

Gene-Drug relationships: Llamas et al, submitted

Research at the Kann Lab

Page 45: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

45

Page 46: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Individual Genome Sequence

46

An

aly

sis

of D

NA

varia

tion

s

All mutations in one protein

Mutations in different proteins

Is there any functional association between the mutated genes?

Drug target

Every unhappy family is unhappy in its own way. Leo Tolstoy

Patients

Page 47: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

47

Cancer Genome Landscapes

Wood, L.D. et al. (Vogelstein group), The genomic landscapes of human breast and colorectal cancers. Science, 2007. 318(5853): p. 1108-13.

APC

TP53

KRAS

Page 48: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

PI3K pathway mutations in breast and colorectal cancers

. Circled genes have somatic mutations in colorectal (red) and breast (blue) cancers.

L D Wood et al. Science 2007;318:1108-1113

Cancer Genome LandscapesIs there any functional association among mutated genes?

48

Page 49: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Doman-Centric Approach

49

We focus on mutations that result in changes in the coding proteins, in particular, we study mutations within the protein domain.

Is there any functional association between mutated proteins?

Domains are highly conserved regions of the protein that represent the structural functional units of proteins.

Why domains?

Page 50: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

50

The term protein domain (or domain) refers to a region of the protein with compact structure, usually with a hydrophobic core.

Page 51: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Protein Domains

51

Aloy and Russell, 2006

Protein Domains mediate 75% of protein-protein interactions

Most proteins are multi-domains (65% of Eukaryotic and 40% Prokaryotic), explains polyfunctionality of the gene

Most disease mutations are located inside domains

Page 52: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Different domains in the same

protein may play different rolesSPTB protein

Spectrin beta chain, erythrocytic

actin binding domains helix forming domains

Elliptocytosis mutationsSpherocytosis mutation

Zhong et al., Edgetic perturbation models of human inherited disorders,Mol. Syst. Biol. 5, 321 (2009) 52

Page 53: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Why a domain centric approach?

Current approaches are gene-centric approaches (whole-proteins)

They disregard specific location of the mutations

We know that mutations in different parts of the protein have different effect

Why treat mutations as they are identical?

53

By studying mutations at the domain level, we can account better for the polyfunctionality of the gene and cluster mutations with similar function.

Page 54: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

54

Outline

Protein Domain Approach for the Analysis of Disease Mutations:

Part I: Protein domain Landscape: a study of cancer somatic mutations

Part II: Statistics and visualization studies of known disease mutations

Page 55: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Genomic Landscapes

55

Wood, L.D. et al. (Vogelstein group), The genomic landscapes of human breast and colorectal cancers. Science, 2007. 318(5853): p. 1108-13.

11 colorectal and 11 breast cancer patient tumor genomes

Gene “Mountains” and “Hills”

Candidate genes (CAN genes) identified using local false discovery rate

Page 56: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

From Gene to Domain Landscape

56

Each point in the grid domain landscapes represents a domain, and the peaks are estimated by aggregating all mutations for all human proteins with such domain.

Nathan Nehrt Tom Peterson

Page 57: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Gene and Doman Landscapes

57

Nehrt LN, Peterson TA, Park DH and Kann MG, Domain Landscapes of somatic mutations in cancer. BMC Genomics. 13 (2012)

TCGA: 100 colon adenocarcinoma patients and 522 breast invasive carcinoma patients

Nathan Nehrt Tom Peterson

Page 58: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Selected domains highly mutated in

colon cancer tumors

58

Nehrt LN et al., Domain Landscapes of somatic mutations in cancer.BMC Genomics. 13 (2012)

FILIP1L is known to inhibit proliferation and migration and increase apoptosis in endothelial cells, it acts as a tumor suppressor and its loss of function has been implicated in ovarian cancer, head and neck squamous cell carcinoma and oligodendrogliomas.

Page 59: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

PIK3CA domains prevalence

59

Nehrt LN et al., Domain Landscapes of somatic mutations in cancer.BMC Genomics. 13 (2012)

Page 60: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Different domains in the same

protein may play different rolesSPTB protein

Spectrin beta chain, erythrocytic

actin binding domains helix forming domains

Elliptocytosis mutationsSpherocytosis mutation

Edgetic perturbation models of human inherited disordersZhong et al., Mol. Syst. Biol. 5, 321 (2009)

60

Page 61: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

PIK3CA domains prevalence

61

Nehrt LN, Peterson TA, Park DH and Kann MG, Domain Landscapes of somatic mutations in cancer.BMC Genomics. 13 (2012)

Page 62: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Summary Part I (mutations with

unknown significance to phenotype)

Domain landscape allows for the visualization of clusters of cancer somatic mutations at the domain level

Domain peaks cluster family of proteins sharing a domain mutated for that particular cancer

A gradient of mutation prevalence in cancer studies can be found across the different domains of a gene (PIK3CA)

62

Page 63: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

63

Outline

Protein Domain Approach for the Analysis of Disease Mutations:

Part I: Protein domain Landscape: a study of cancer somatic mutations

Part II: Statistics and visualization studies of known disease mutations

Page 64: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Domain Mapping of Disease Mutations

Shared Domain

Protein 2

DMDM Domain View

1 31

Protein 3

Protein 1

Mutation Count

64

Page 65: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

65

DMDM bioinf.umbc.edu/dmdm/

Page 66: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

ABCC_CFTR1 domain (nucleotide binding domain 1)

Significant hotspot at position 172

66

Page 67: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Protein Domain Disease Hotspots

Shared Domain

Protein 2

DMDM Domain View

1 31

Protein 3

Protein 1

Domain Disease Hotspot?

Mutation Count

67

Page 68: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

DS-ScoreDS-Score (domain significance score) is a statistical measure designed to identify significantly mutated domain positions

68

Cancer Non-Cancer

Number of Position-based Hotspots

986 2004

Mutations at Position-based hotspots (%)

13.7 % 6.4%

Position-based hotspots for Random Mutation set

7(±3) 10 (±3)

Peterson TA, Nehrt LN, Park DH and Kann MG, Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer. J

Am Med Inform Assoc, 19, 275-83 (2012)

Nathan Nehrt Tom Peterson

Correlation coefficient between DS-Score and conservation:0.10 (cancer) 0.19(non-cancer)

Page 69: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

DS-Score Distributions

69

Peterson TA et al. J Am Med Inform Assoc, 19, 275-83 (2012)

Gene Type Function

ALK Oncogene Receptor kinase

BRAF Oncogene Protein kinase

EGFR Oncogene Receptor kinase

FLT3 Unknown Receptor kinase

GNAI2 Unknown GTPase

GNAS Oncogene GTPase

HRAS Oncogene GTPase

KIT Oncogene Receptor kinase

KRAS Oncogene GTPase

Page 70: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

70

DS-Score for Variant ClassificationNovel Variants

Likely Deleterious Likely Neutral

*

Map to Domain

DS-Score Hotspot

DS-Score: High precision (91%) but low Sensitivity (3.3%)

Page 71: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

71

1. Increase human disease mutation data

2. Change definition of domain hotspots

3. Use information from other species

How to increase the number of

domain hotspots?

Page 72: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

1. Text-mining of disease mutations

72

Attila Kertez-Farkas Emily Doughty

EMU: Extraction of Mutations

Our methods detect Mutations in text with 98% precision.We were able to double the number mutations compared with existing databases of breast and prostate cancer mutations.

Text-mining of molecular information from biomedical literature

Disease mutations: Doughty et al, Bioinformatics 2010

Pharmacogenomics: Rance et al, JBI 2012

Crowdsourcing: Burger et al, Lecture Notes in CS, Springer 2012

Protein-protein interactions: Doughty et al, in preparation

Gene-Drug relationships: Llamas et al, submittedRajuMishra

Eduardo Llamas

Page 73: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

2. Feature-based DS-Score

73

• Distribute highest position-based DS-Score for each functional feature to all positions constituting the feature in the domain

• Assumes entire functional feature is critical to the normal function of the protein

Page 74: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

74

1. Increase human disease mutation data

2. Change definition of domain hotspots

3. Use information from other species

How to increase the number of

domain hotspots?

Page 75: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Comparing orthologous genes and

common protein domains between

yeast and human

75

Page 76: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Multi-species DS-Score

76

Peterson, T.A., et al. A protein domain-centric approach for the comparative analysis of human and yeast phenotypically relevant mutations. BMC Genomics, 2013. 14 Suppl3: p. S5 (PMID: 238 194 56)

Page 77: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Yeast Mutations Cluster at

Protein Domain Positions

77

1,490 yeast mutations with known phenotypic impact

32,653 human disease-associated mutations

Page 78: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Number of domain hotspots

78

Tom Peterson

Page 79: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Number of domain hotspots

79

Tom Peterson

103 common feature-based hotspots

Only one common position-based hotspot

Position 246 on kinase domain PKc (cd00180)

Peterson, T.A., et al. A protein domain-centric approach for the comparative analysis of human and yeast phenotypically relevant mutations. BMC Genomics, 2013. 14 Suppl3: p. S5 (PMID: 238 194 56)

Page 80: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Summary Part II (mutations with

known phenotype)

DMDM provides a unique view of disease mutations that cannot be found in gene centric analysis or disease-specific

DMDM site has been reviewed and approved by UniProt and NCBI, direct links from these databases increase site visibility to 2,000 hits/week

Disease mutations cluster at certain domain positions or hotspots.

These hotspots can be used to prioritize mutations from genomic studies

We have devised ways to increase sensitivity, due to scarceness of mutation-phenotype: text-mining of more human mutations and using other species to define hotspots

80

Page 81: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Domain Profiling to study variants from

one clinical study

Domain Profiling of variants from multiple

clinical studies

Domain Domain Interactions

Map variants into Protein Domains

INSIDE DOMAIN INSIDE DOMAIN

Do cancer somatic mutations cluster at domain level? Nehrt et al. BMC-Genomics (2012)

Identification of Protein domain disease hot spots. Peterson et al. Bioinformatics (2007);

Peterson et al. JAMIA (2012)

Co-evolution methods to predict domain-domain interactions. Kann et al Proteins

(2007); Kann et al JMB (2009)

Domain-Centric Translational Bioinformatics

81

Page 82: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Future Work

Develop new machine learning techniques based on domain landscapes to predict biomarkers for disease stage (for instance, which genes or domains are biomarkers for metastasis)

Yeast studies indicate we can increase domain hotspots and that domain analysis is advantageous since it maps a larger number of mutations than genes. Mice and rat are in our pipelines now!

82

Page 83: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Kann Lab Research Network

83

Page 84: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

Acknowledgments

84

Kann Lab: Sarah Keasey, Emily Doughty, Amy Voltz, Raju Mishra, Tom Peterson, and Eduardo Llamas

Nathan Nehrt, MS

Page 85: Molecular Approaches for Translational Bioinformaticsccbb.jnu.ac.in/IUBDDJan2015/3.pdf · Towards Precision Medicine: Advances in Computational Approaches for the Analysis of Human

85

Databases used

THANK YOU !