Phenotype Prediction by Integrative Network Analysis of SNP and Gene Expression Microarrays
Hsun-Hsien Chang1, Michael McGeachie1,2
1 Children’s Hospital Informatics Program, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School
2 Channing Lab, Brigham and Women's Hospital
September 3, 2011
Genetic Information Flows from DNA to RNA
• Central dogma of molecular biology.
• Research goals:
– Decipher how genetic variants influence RNA transcript expression, leading to disease formation.
– Create clinical tools to perform diagnosis and prognosis, design treatment strategies, etc.
Measure Genetic Variants and RNA Abundance by Microarrays
• Genetic variants are measured by single nucleotide polymorphisms (SNPs).
• Modeled by discrete (multinomial) random variables.
• Microarrays can assess 500K SNPs in parallel.
• RNA abundance is measured by transcriptional expression levels.
• Modeled by continuous (log-normal) random variables.
• Microarrays can assess 50K transcripts in parallel.
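As a minimal sketch of the two variable types named above, the following fits a multinomial distribution to one SNP and a log-normal distribution to one transcript. The function names and the sample values are hypothetical and chosen only for illustration; the slides do not specify how the parameters are estimated.

```python
import math
from collections import Counter

def fit_multinomial(genotypes):
    """Maximum-likelihood genotype frequencies for one SNP (discrete variable)."""
    counts = Counter(genotypes)
    total = len(genotypes)
    return {g: c / total for g, c in counts.items()}

def fit_log_normal(expression):
    """Mean and standard deviation of log-expression for one transcript."""
    logs = [math.log(x) for x in expression]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, math.sqrt(var)

# Hypothetical genotype calls and expression intensities for six samples.
snp = ["AA", "AB", "AB", "BB", "AA", "AB"]
expr = [120.0, 95.5, 310.2, 150.0, 88.1, 201.7]

freqs = fit_multinomial(snp)
mu, sigma = fit_log_normal(expr)
print(freqs)
print(mu, sigma)
```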
Identify SNP-Transcript Dependence
[Figure: SNP-transcript dependence, illustrated with high, medium, and low expression levels.]
• Challenges:
– Need an intelligent method to compare pairs of 500K SNPs and 50K transcripts.
– Need a network analysis to capture molecular interactions between SNPs and transcripts.
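To give a sense of the scale behind the first challenge, a quick calculation of the number of SNP-transcript pairs is shown here. It counts only one-SNP-to-one-transcript pairs, which is an assumption about what "pairs" means; SNP-SNP and transcript-transcript pairs would add more.

```python
n_snps = 500_000        # SNPs assessed per microarray
n_transcripts = 50_000  # transcripts assessed per microarray

# Every SNP paired with every transcript.
n_pairs = n_snps * n_transcripts
print(f"{n_pairs:.1e} SNP-transcript pairs")
```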
Reduce Dimensionality by Phenotypes
• SNP microarrays (discrete variables): filter by phenotypes (Bayes factor).
• Expression microarrays (continuous variables): filter by phenotypes (Bayes factor).
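One way such a filter could work for a discrete SNP is sketched below. The slides do not give the Bayes factor formula, so this assumes a Dirichlet-multinomial marginal likelihood with a symmetric prior `alpha`. It compares a model in which genotype frequencies differ by phenotype against one in which they do not. The function names and toy data are hypothetical.

```python
from math import lgamma, exp
from collections import Counter

def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of category counts under a symmetric Dirichlet prior."""
    k = len(counts)
    n = sum(counts)
    return (lgamma(k * alpha) - lgamma(k * alpha + n)
            + sum(lgamma(alpha + c) - lgamma(alpha) for c in counts))

def log_bayes_factor(genotypes, phenotypes, alpha=1.0):
    """Log Bayes factor: genotype depends on phenotype vs. genotype independent of it."""
    levels = sorted(set(genotypes))
    pooled = Counter(genotypes)
    log_independent = log_marginal([pooled[g] for g in levels], alpha)
    log_dependent = 0.0
    for cls in set(phenotypes):
        sub = Counter(g for g, p in zip(genotypes, phenotypes) if p == cls)
        log_dependent += log_marginal([sub[g] for g in levels], alpha)
    return log_dependent - log_independent

# Hypothetical data: two phenotype classes of six samples each.
phenotypes = ["x"] * 6 + ["y"] * 6
informative = ["AA"] * 6 + ["BB"] * 6   # genotype tracks phenotype exactly
uninformative = ["AA", "BB"] * 6        # same genotype mix in both classes

lbf_informative = log_bayes_factor(informative, phenotypes)
lbf_uninformative = log_bayes_factor(uninformative, phenotypes)
print(lbf_informative, lbf_uninformative)
```

A SNP would be kept when its log Bayes factor exceeds some threshold (for example 0). The threshold used in the original analysis is not stated in the slides.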
Model SNP-Transcript Dependence
[Figure: reduced SNP data (nodes S1 to SM) and reduced expression data (nodes G1 to GN).]
• A SNP can be influenced by other SNPs.
• A transcript can be influenced by SNPs and other transcripts.
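The two dependence rules can be expressed as a constraint on which edges a network search may consider. The sketch below reads the two statements as the complete list of permitted edges, which is an interpretation rather than something the slides state outright. The node names and helper functions are hypothetical.

```python
SNP, TRANSCRIPT = "snp", "transcript"

def allowed_edge(parent_type, child_type):
    """True if an edge parent -> child respects the stated dependence rules."""
    if child_type == SNP:
        return parent_type == SNP                  # SNPs: influenced by other SNPs only
    if child_type == TRANSCRIPT:
        return parent_type in (SNP, TRANSCRIPT)    # transcripts: SNPs or transcripts
    raise ValueError(f"unknown node type: {child_type}")

def candidate_parents(child, node_types):
    """All nodes that may act as a parent of `child`."""
    return [n for n in node_types
            if n != child and allowed_edge(node_types[n], node_types[child])]

# Hypothetical reduced data set with two SNPs and two transcripts.
node_types = {"S1": SNP, "S2": SNP, "G1": TRANSCRIPT, "G2": TRANSCRIPT}

print(candidate_parents("S1", node_types))
print(candidate_parents("G1", node_types))
```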
Interplay of Phenotypes, SNPs, and Transcripts
• Network analysis is performed on the reduced data set.
• For each variable, find the set of modulating variables with the highest likelihood.
• Implement a greedy search algorithm to search for the best network.
[Figure: network containing a phenotype node (Pheno).]
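A greedy search of this kind might look like the following. This is a K2-style forward selection over a fixed variable ordering, swapped in as one plausible reading since the slides do not name the exact algorithm. The `score` callback stands in for a likelihood-based score. The toy score, the variable ordering, and the placement of the phenotype first in that ordering are all assumptions for illustration.

```python
def greedy_parents(child, candidates, score, max_parents=3):
    """Add, one at a time, the parent that most improves the score; stop when none does."""
    parents = []
    best = score(child, parents)
    while len(parents) < max_parents:
        trials = [(score(child, parents + [c]), c)
                  for c in candidates if c not in parents]
        if not trials:
            break
        top_score, top = max(trials)
        if top_score <= best:
            break
        parents.append(top)
        best = top_score
    return parents, best

def greedy_network(order, score, max_parents=3):
    """Choose parents for each variable from those earlier in `order` (keeps the graph acyclic)."""
    network = {}
    for i, child in enumerate(order):
        network[child], _ = greedy_parents(child, order[:i], score, max_parents)
    return network

# Hypothetical toy score: rewards the "true" parents, penalizes spurious ones.
# A real analysis would use a (marginal) likelihood computed from the data.
true_parents = {"Pheno": set(), "S1": {"Pheno"}, "S2": set(), "G1": {"S1", "Pheno"}}

def toy_score(child, parents):
    chosen = set(parents)
    return len(chosen & true_parents[child]) - len(chosen - true_parents[child])

network = greedy_network(["Pheno", "S1", "S2", "G1"], toy_score)
print(network)
```

Restricting candidates to variables earlier in the ordering is a design choice that guarantees an acyclic result without an explicit cycle check. Other greedy schemes, such as hill climbing over edge additions and removals, are equally consistent with the slide text.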
Pediatric Acute Lymphoblastic Leukemia (ALL)
• Mutation of lymphoblasts leads to acute lymphoblastic leukemia (ALL).
• Two types of ALL have different responses to chemotherapies:
– B-cell precursor ALL (BCP-ALL)
– common ALL (C-ALL)
A SNP-Transcript Network Distinguishes Pediatric Acute Lymphoblastic Leukemia
• Data set from GEO, accession number GSE10792.
• 28 patients; 8 with BCP-ALL and 20 with C-ALL.
• Genotyped at 100k SNPs by Affymetrix Human Mapping 100K Set microarrays.
• Expression patterns of 50k genes were profiled using Affymetrix HG-U133 Plus 2.0 platforms.
• 96% phenotype classification accuracy.
Functional Analysis of Signature Genes
SNP/Gene Symbol | Chromosome Location | Function
MAP1B | 5q13 | Cell signaling, Cell morphology, Cellular assembly
C8orf84 | 8q21.11 | Cancer, Genetic disorder
SEMA6D | 15q21.1 | Cellular movement
ID4 | 6p22-p21 | Cellular growth
CDH2 | 18q11.2 | Cell morphology, Cellular assembly, Cellular movement
CHRNA1 | 2q24-q32 | Cell morphology
MYO3A | 10p11.1 | Genetic disorder
NID2 | 14q21-q22 | Cell signaling
Conclusions
• Use phenotypes to reduce data dimensionality.
• Capture genetic information flow by modeling SNP-transcript dependence networks.
• Create phenotype-dependent SNP-transcript networks.
• Apply the analysis to pediatric acute lymphoblastic leukemia.