Phenotype Prediction by Integrative Network Analysis of SNP and Gene Expression Microarrays

Hsun-Hsien Chang (1), Michael McGeachie (1,2)

(1) Children’s Hospital Informatics Program, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School

(2) Channing Lab, Brigham and Women’s Hospital

September 3, 2011


Genetic Information Flows from DNA to RNA

• Central dogma of molecular biology.

• Research goals:
  – Decipher how genetic variants influence RNA transcript expression, leading to disease formation.
  – Create clinical tools to perform diagnosis and prognosis, design treatment strategies, etc.


Measure Genetic Variants and RNA Abundance by Microarrays

• Genetic variants are measured by single nucleotide polymorphisms (SNPs).

• Modeled by discrete (multinomial) random variables.

• Microarrays can assess 500K SNPs in parallel.

• RNA abundance is measured by transcriptional expression levels.

• Modeled by continuous (log-normal) random variables.

• Microarrays can assess 50K transcripts in parallel.
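
The two variable types above can be illustrated with a minimal sketch, assuming maximum-likelihood fitting: genotype frequencies for the multinomial SNP model, and the mean and standard deviation of log-expression for the log-normal transcript model. The function names and the toy data are hypothetical and are not taken from the slides.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def genotype_frequencies(genotypes):
    """ML estimate of multinomial parameters: relative frequency of each genotype."""
    counts = Counter(genotypes)
    total = len(genotypes)
    return {g: c / total for g, c in counts.items()}


def lognormal_parameters(expression):
    """ML estimate of a log-normal: mean and (population) std of log-expression."""
    logs = [math.log(x) for x in expression]
    return mean(logs), pstdev(logs)


# Hypothetical toy data: one SNP and one transcript measured in six samples.
snp = ["AA", "AB", "AB", "BB", "AA", "AB"]
expr = [120.0, 250.0, 310.0, 900.0, 95.0, 280.0]

print(genotype_frequencies(snp))
print(lognormal_parameters(expr))
```

Working on the log scale means the transcript can be treated as a Gaussian variable in later steps, which is one common reason for choosing a log-normal model.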


Identify SNP-Transcript Dependence

[Figure labels: high expression level, medium expression level, low expression level]

• Challenges:
  – Need an intelligent method to compare pairs of 500K SNPs and 50K transcripts.
  – Need a network analysis to capture molecular interactions between SNPs and transcripts.
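
To make the scale of the first challenge concrete, the short calculation below counts the SNP-transcript pairs implied by the array sizes quoted above. It only does the arithmetic; the variable names are illustrative.

```python
# Array sizes quoted in the slides.
n_snps = 500_000
n_transcripts = 50_000

# Every SNP paired with every transcript.
n_pairs = n_snps * n_transcripts
print(f"{n_pairs:.1e} candidate SNP-transcript pairs")  # 2.5e+10
```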


Reduce Dimensionality by Phenotypes

• SNP microarrays (discrete variables): filter by phenotypes (Bayes factor).

• Expression microarrays (continuous variables): filter by phenotypes (Bayes factor).
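
The slides do not give the exact form of the Bayes factor, so the following is only a sketch of one standard choice for the discrete (SNP) case: a Dirichlet-multinomial marginal likelihood, comparing a model in which genotype frequencies differ by phenotype against one in which they do not. The symmetric prior `alpha`, the function names, and the toy data are assumptions.

```python
from collections import Counter
from math import lgamma


def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of category counts under a symmetric Dirichlet(alpha) prior."""
    k = len(counts)
    n = sum(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out


def log_bayes_factor(genotypes, phenotypes, alpha=1.0):
    """log BF: genotype distribution depends on phenotype vs. is independent of it."""
    categories = sorted(set(genotypes))
    pooled = Counter(genotypes)
    log_independent = log_marginal([pooled[g] for g in categories], alpha)
    log_dependent = 0.0
    for p in set(phenotypes):
        sub = Counter(g for g, q in zip(genotypes, phenotypes) if q == p)
        log_dependent += log_marginal([sub[g] for g in categories], alpha)
    return log_dependent - log_independent


# Hypothetical toy data: 8 samples per phenotype class.
pheno = ["x"] * 8 + ["y"] * 8
associated = ["AA"] * 8 + ["BB"] * 8   # genotype tracks phenotype
unrelated = ["AA", "BB"] * 8           # genotype ignores phenotype

print(log_bayes_factor(associated, pheno))  # positive: keep this SNP
print(log_bayes_factor(unrelated, pheno))   # negative: filter it out
```

A filter built on this would keep only the variables whose log Bayes factor exceeds a chosen threshold (for example 0); the threshold used in the original work is not stated in the slides.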


Model SNP-Transcript Dependence

[Diagram: reduced SNP data as nodes S1, ..., SM; reduced expression data as nodes G1, ..., GN]

A SNP can be influenced by other SNPs.

A transcript can be influenced by SNPs and other transcripts.
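
One way to picture a transcript being influenced by a SNP is to compare the likelihood of its log-expression with and without conditioning on the genotype. The sketch below assumes a Gaussian model on the log scale with a separate mean and variance per genotype; the slides do not specify the conditional model, and all names and data here are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_loglik(values, min_var=1e-3):
    """Gaussian log-likelihood at the ML mean and variance (variance floored at min_var)."""
    n = len(values)
    mu = sum(values) / n
    ss = sum((v - mu) ** 2 for v in values)
    var = max(ss / n, min_var)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def transcript_loglik(expression, genotypes=None):
    """Log-likelihood of log-expression, optionally conditioned on a SNP parent."""
    logs = [math.log(x) for x in expression]
    if genotypes is None:
        return gaussian_loglik(logs)
    groups = defaultdict(list)
    for g, v in zip(genotypes, logs):
        groups[g].append(v)
    # One Gaussian per genotype group.
    return sum(gaussian_loglik(vals) for vals in groups.values())


# Hypothetical toy data: expression is roughly ten times higher for genotype BB.
snp = ["AA"] * 4 + ["BB"] * 4
expr = [100.0, 110.0, 90.0, 105.0, 1000.0, 1100.0, 900.0, 1050.0]

print(transcript_loglik(expr))       # no parent
print(transcript_loglik(expr, snp))  # SNP as parent: higher likelihood
```

Because a maximized likelihood never decreases when parameters are added, a practical score would also need a prior or a complexity penalty; that choice is left open here.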


Interplay of Phenotypes, SNPs, and Transcripts

• Network analysis is performed on the reduced data set.

• For each variable, find the set of modulating variables with the highest likelihood.

• Implement a greedy search algorithm to find the best network.

[Diagram: network including the phenotype node (Pheno)]
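
The greedy step described above can be sketched as forward selection of modulating variables (parents) for a single target. This is a generic illustration, not the authors' implementation: the scoring function is passed in, `max_parents` is an assumed cap, and the toy score table is invented purely to show the control flow. The sketch does not check that the resulting network is acyclic.

```python
def greedy_parents(target, candidates, score, max_parents=3):
    """Repeatedly add the candidate that most improves score(target, parents)."""
    parents = []
    best = score(target, parents)
    while len(parents) < max_parents:
        trials = [(score(target, parents + [c]), c)
                  for c in candidates if c not in parents]
        if not trials:
            break
        top_score, top = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no candidate improves the score: stop
        parents.append(top)
        best = top_score
    return parents, best


# Hypothetical score: fixed gain per parent minus a penalty of 1.0 per parent.
toy_gain = {"S1": 5.0, "S2": 0.5, "G2": 3.0}


def toy_score(target, parents):
    return sum(toy_gain[p] for p in parents) - 1.0 * len(parents)


print(greedy_parents("G1", ["S1", "S2", "G2"], toy_score))
# (['S1', 'G2'], 6.0)
```

In a real analysis `score` would be a likelihood-based measure such as a marginal likelihood computed from the reduced data, evaluated for each variable in turn.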


Pediatric Acute Lymphoblastic Leukemia (ALL)

• Mutation of lymphoblasts leads to acute lymphoblastic leukemia (ALL).

• Two types of ALL have different responses to chemotherapies:
  – B-cell precursor ALL (BCP-ALL)
  – common ALL (C-ALL)


A SNP-Transcript Network Distinguishes Pediatric Acute Lymphoblastic Leukemia

• Data set from GEO, accession number GSE10792.

• 28 patients; 8 with BCP-ALL and 20 with C-ALL.

• Genotyped at 100k SNPs by Affymetrix Human Mapping 100K Set microarrays.

• Expression patterns of 50k genes were profiled using Affymetrix HG-U133 Plus 2.0 platforms.

• 96% phenotype classification accuracy.


Functional Analysis of Signature Genes

SNP/Gene Symbol   Chromosome Location   Function
MAP1B             5q13                  Cell signaling, Cell morphology, Cellular assembly
C8orf84           8q21.11               Cancer, Genetic disorder
SEMA6D            15q21.1               Cellular movement
ID4               6p22-p21              Cellular growth
CDH2              18q11.2               Cell morphology, Cellular assembly, Cellular movement
CHRNA1            2q24-q32              Cell morphology
MYO3A             10p11.1               Genetic disorder
NID2              14q21-q22             Cell signaling


Conclusions

• Use phenotypes to reduce data dimensionality.

• Capture genetic information flow by modeling SNP-transcript dependence networks.

• Create phenotype-dependent SNP-transcript networks.

• Apply the analysis to pediatric acute lymphoblastic leukemia.