Associations to Quantitative Trait Network and Analysis of Asthma Data
Seyoung Kim and Eric P. Xing, {sssykim, epxing}@cs.cmu.edu
Machine Learning Dept., Carnegie Mellon University
10/30/2009, Genome Informatics 2009 @ Cold Spring Harbor Lab


Page 1: Associations to Quantitative Trait Network and Analysis of Asthma Data

Associations to Quantitative Trait Network and Analysis of Asthma Data

Seyoung Kim and Eric P. Xing

{sssykim, epxing}@cs.cmu.edu

Machine Learning Dept.

Carnegie Mellon University

10/30/2009, Genome Informatics 2009 @ Cold Spring Harbor Lab

Page 2: Associations to Quantitative Trait Network and Analysis of Asthma Data

Association Analysis of Single Trait

A univariate phenotype, e.g., disease/control status or gene expression level

Causal SNP

Page 3: Associations to Quantitative Trait Network and Analysis of Asthma Data

Association Analysis of Quantitative Trait Network

A univariate phenotype, e.g., disease/control status or gene expression level

Causal SNP

Page 4: Associations to Quantitative Trait Network and Analysis of Asthma Data

Genetic Association for Asthma Clinical Traits

[Figure: genotype sequence TCGACGTTTTACTGTACAATT; trait network with subnetworks for lung physiology and a subnetwork for quality of life]

Page 5: Associations to Quantitative Trait Network and Analysis of Asthma Data

Expression QTL Mapping

[Figure: genotype sequence TCGACGTTTTACTGTACAATT; microarray experiments; gene correlation network with gene modules]

Page 6: Associations to Quantitative Trait Network and Analysis of Asthma Data

Motivation: Multiple-Trait Association

Traditional approach: analyze one phenotype at a time

Our approach: consider multiple related phenotypes jointly and incorporate the correlation structure in the phenotypes

Graph-guided fused lasso (Kim & Xing, PLoS Genetics, 2009)

Page 7: Associations to Quantitative Trait Network and Analysis of Asthma Data

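Multivariate Regression for Single-Trait Association Analysis

$y = X\beta$

Genotype $X$, allergy symptom $y$ (e.g., 2.1), association strength $\beta$.

[Figure: genotypes (T/G, A/A, C/C, A/T, G/A, A/G, ...) combined with association strengths to give the trait value]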

Page 8: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multivariate Regression for Single-Trait Association Analysis

$\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)'(y - X\beta)$

Genotype $X$, allergy symptom $y$ (e.g., 2.1), association strength $\beta$.

Many non-zero associations: how to pick the threshold?
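The least-squares fit on this slide can be illustrated with a minimal sketch. The simulated genotypes, the single causal SNP, the effect size, and the function name `ols_association` are all assumptions made for illustration; none of them come from the talk.

```python
import numpy as np

def ols_association(X, y):
    """Least-squares estimate of beta, minimizing (y - X beta)'(y - X beta)."""
    # lstsq solves the least-squares problem and also copes with rank-deficient X.
    beta, _residuals, _rank, _singular_values = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n_samples, n_snps = 200, 10

# Hypothetical genotypes coded as minor-allele counts 0, 1, 2.
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
X -= X.mean(axis=0)                      # center each SNP column

true_beta = np.zeros(n_snps)
true_beta[3] = 1.5                       # one causal SNP (assumed for the demo)
y = X @ true_beta + rng.normal(scale=1.0, size=n_samples)
y -= y.mean()

beta_hat = ols_association(X, y)
n_nonzero = int(np.sum(np.abs(beta_hat) > 1e-8))
print("estimated association strengths:", np.round(beta_hat, 3))
print("non-zero estimates:", n_nonzero, "of", n_snps)
```

Even though only one SNP is causal in this toy setup, every estimated coefficient is non-zero, which is the thresholding problem the slide raises.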

Page 9: Associations to Quantitative Trait Network and Analysis of Asthma Data

Lasso for Reducing False Positives (Tibshirani, 1996)

$\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)'(y - X\beta) + \lambda \sum_{j=1}^{J} |\beta_j|$

The second term is the lasso penalty for sparsity. Genotype $X$, allergy symptom $y$ (e.g., 2.1), association strength $\beta$.

Many zero associations (sparse results), but what if there are multiple related traits?
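One common way to minimize the lasso objective above is cyclic coordinate descent with soft-thresholding. The sketch below is an illustration under assumptions, not the solver used in the talk: the simulated data, the value of `lam`, and the helper names `soft_threshold` and `lasso_cd` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0 when |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (y - X b)'(y - X b) + lam * sum_j |b_j|."""
    n_snps = X.shape[1]
    beta = np.zeros(n_snps)
    col_sq = np.sum(X ** 2, axis=0)          # x_j' x_j for each SNP
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(n_snps):
            if col_sq[j] == 0.0:
                continue                     # skip monomorphic SNPs
            residual += X[:, j] * beta[j]    # remove SNP j's current contribution
            rho = X[:, j] @ residual
            # With an unscaled squared loss the threshold is lam / 2.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
            residual -= X[:, j] * beta[j]    # add the updated contribution back
    return beta

rng = np.random.default_rng(0)
n_samples, n_snps = 200, 10
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
X -= X.mean(axis=0)
true_beta = np.zeros(n_snps)
true_beta[3] = 1.5                           # one causal SNP (assumed)
y = X @ true_beta + rng.normal(size=n_samples)
y -= y.mean()

beta_lasso = lasso_cd(X, y, lam=120.0)
selected = np.flatnonzero(beta_lasso).tolist()
print("selected SNPs:", selected)
print("estimates:", np.round(beta_lasso, 3))
```

The penalty sets most coefficients exactly to zero and shrinks the surviving estimate toward zero, so the choice of `lam` trades false positives against bias.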

Page 10: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multivariate Regression for Multiple-Trait Association Analysis

Genotype $X$; traits $y$, e.g., (3.4, 1.5, 2.1, 0.9, 1.8):
  • Allergy: allergy for cats, allergy for roaches, allergy in spring
  • Lung physiology: FEV, FEF

$\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)'(y - X\beta) + \lambda \sum_{j=1}^{J} |\beta_j|$

Association strength $\beta$.

How to combine information across multiple traits to increase the power?

Page 11: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multivariate Regression for Multiple-Trait Association Analysis

Genotype $X$; traits $y$, e.g., (3.4, 1.5, 2.1, 0.9, 1.8):
  • Allergy: allergy for cats, allergy for roaches, allergy in spring
  • Lung physiology: FEV, FEF

$\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)'(y - X\beta) + \lambda \sum_{j=1}^{J} |\beta_j| + \text{(graph-guided fusion penalty)}$

We introduce the graph-guided fusion penalty.

Page 12: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multivariate Regression for Multiple-Trait Association Analysis

Genotype $X$; traits $y$, e.g., (3.4, 1.5, 2.1, 0.9, 1.8):
  • Allergy: allergy for cats, allergy for roaches, allergy in spring
  • Lung physiology: FEV, FEF

$\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)'(y - X\beta) + \lambda \sum_{j=1}^{J} |\beta_j| + \text{(fusion penalty)}$

Page 13: Associations to Quantitative Trait Network and Analysis of Asthma Data

Fusion Penalty

Fusion penalty: $|\beta_{jk} - \beta_{jm}|$

If two traits are correlated (connected in the trait network), they are likely to share a similar association strength.

[Figure: SNP j in sequence ACGTTTTACTGTACAATT linked to Trait k and Trait m]

Association strength between SNP j and Trait k: $\beta_{jk}$

Association strength between SNP j and Trait m: $\beta_{jm}$
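As a small worked example of the fusion penalty, the sketch below sums $|\beta_{jk} - \beta_{jm}|$ over all SNPs $j$ for each connected trait pair $(k, m)$. The coefficient matrix `B`, the edge lists, and the function name `fusion_penalty` are made up for illustration.

```python
import numpy as np

def fusion_penalty(B, edges):
    """Sum of |B[j, k] - B[j, m]| over all SNPs j and all trait edges (k, m)."""
    return float(sum(np.abs(B[:, k] - B[:, m]).sum() for k, m in edges))

# Rows are SNPs, columns are traits (hypothetical association strengths).
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.5, 0.0]])

# Traits 0 and 1 are connected: |1.0 - 1.0| + |0.0 - 0.5| = 0.5
penalty_01 = fusion_penalty(B, [(0, 1)])

# Traits 0 and 2 are connected: |1.0 - 0.0| + |0.0 - 0.0| = 1.0
penalty_02 = fusion_penalty(B, [(0, 2)])

print(penalty_01, penalty_02)
```

The penalty is small when connected traits have similar association strengths for each SNP, which is what pushes correlated traits toward shared associations.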

Page 14: Associations to Quantitative Trait Network and Analysis of Asthma Data

Graph-Constrained Fused Lasso

Fusion effect propagates to the entire network

Association between SNPs and subnetworks of traits

[Figure: sequence ACGTTTTACTGTACAATT and the trait network, showing the overall effect]

Page 15: Associations to Quantitative Trait Network and Analysis of Asthma Data

Graph-Weighted Fused Lasso

Subnetwork structure is embedded as densely connected nodes with large edge weights

Edges with small weights are effectively ignored

[Figure: sequence ACGTTTTACTGTACAATT and the weighted trait network, showing the overall effect]

Page 16: Associations to Quantitative Trait Network and Analysis of Asthma Data

Previous Works vs. Our Approach

PCA-based approach (Weller et al., 1996; Mangin et al., 1998)
  Previous approach: implicit representation of trait correlations; hard to interpret the derived traits
  Our approach: explicit representation of trait correlations

Extension of module network for eQTL study (Lee et al., 2009)
  Previous approach: average traits within each trait cluster; loss of information
  Our approach: original data for traits are used

Network-based approach (Chen et al., 2008; Emilsson et al., 2008)
  Previous approach: separate association analysis for each trait (no information sharing); single-trait associations are combined in light of trait network modules
  Our approach: joint association analysis of multiple traits

Page 17: Associations to Quantitative Trait Network and Analysis of Asthma Data

Asthma Dataset

543 severe asthma patients from the Severe Asthma Research Program (SARP)

Genotypes: 34 SNPs in the IL-4R gene
  • 40kb region of chromosome 16
  • Missing genotypes imputed with PHASE (Li and Stephens, 2003)

Traits: 53 asthma-related clinical traits
  • Quality of life: emotion, environment, activity, symptom
  • Family history: number of siblings with allergy, does the father have asthma?
  • Asthma symptoms: chest tightness, wheeziness

Page 18: Associations to Quantitative Trait Network and Analysis of Asthma Data

Asthma Trait Network

Trait Correlation Structure

Traits are reordered according to hierarchical clustering results

Threshold at 0.7

Trait Network
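Building a trait network by thresholding the correlation matrix can be sketched as follows. Only the 0.7 cutoff comes from the slide; thresholding on the absolute correlation, the simulated traits, and the function name `trait_network` are assumptions.

```python
import numpy as np

def trait_network(Y, threshold=0.7):
    """Return the trait correlation matrix and the edges with |r| above threshold."""
    R = np.corrcoef(Y, rowvar=False)         # columns of Y are traits
    n_traits = R.shape[0]
    edges = [(k, m, float(R[k, m]))
             for k in range(n_traits)
             for m in range(k + 1, n_traits)
             if abs(R[k, m]) > threshold]
    return R, edges

rng = np.random.default_rng(1)
n_samples = 300
latent = rng.normal(size=n_samples)

# Traits 0 and 1 share a latent factor; trait 2 is independent (toy data).
Y = np.column_stack([
    latent + 0.3 * rng.normal(size=n_samples),
    latent + 0.3 * rng.normal(size=n_samples),
    rng.normal(size=n_samples),
])

R, edges = trait_network(Y, threshold=0.7)
edge_pairs = [(k, m) for k, m, _r in edges]
print("edges:", edge_pairs)
```

Each edge keeps its correlation value, so the same list can serve both an unweighted graph and a weighted one.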

Page 19: Associations to Quantitative Trait Network and Analysis of Asthma Data

Asthma Trait Network

Subnetwork for quality of life

Subnetwork for lung physiology

Phenotype Correlation Structure

Subnetwork for Asthma symptoms

Page 20: Associations to Quantitative Trait Network and Analysis of Asthma Data

Results from Single-SNP/Trait Test

Lung physiology-related traits I
  • Baseline FEV1 predicted value: MPVLung
  • Pre FEF 25-75 predicted value
  • Average nitric oxide value: online
  • Body Mass Index
  • Postbronchodilation FEV1, liters: Spirometry
  • Baseline FEV1 % predicted: Spirometry
  • Baseline predrug FEV1, % predicted
  • Baseline predrug FEV1, % predicted

Q551R SNP
  • Codes for amino-acid changes in the intracellular signaling portion of the receptor
  • Exon 12

[Figure panels: Trait Network; Trait Correlation Matrix (phenotypes by phenotypes); Single-Marker Single-Trait Test (SNPs by phenotypes), shown for permutation test α = 0.05 and permutation test α = 0.01]
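A single-marker, single-trait permutation test of the kind named on this slide might look like the sketch below. The test statistic (absolute Pearson correlation), the number of permutations, and the simulated data are assumptions; the slide does not say how the test was actually set up.

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Permutation p-value for association between one SNP x and one trait y."""
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(y)        # break any SNP-trait link
        if abs(np.corrcoef(x, permuted)[0, 1]) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)

rng = np.random.default_rng(2)
n_samples = 200
x = rng.integers(0, 3, size=n_samples).astype(float)   # toy genotype
y_assoc = 0.8 * x + rng.normal(size=n_samples)          # associated trait
y_null = rng.normal(size=n_samples)                     # unrelated trait

p_assoc = permutation_pvalue(x, y_assoc)
p_null = permutation_pvalue(x, y_null)
print("associated trait p-value:", p_assoc)
print("unrelated trait p-value:", p_null)
```

A SNP-trait pair would then be reported when its p-value falls below the chosen level, such as 0.05 or 0.01.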

Page 21: Associations to Quantitative Trait Network and Analysis of Asthma Data

Comparison of GFlasso with Others

[Figure panels: Trait Network; Trait Correlation Matrix (phenotypes by phenotypes); SNPs by phenotypes results for Single-Marker Single-Trait Test, Lasso, Graph-constrained Fused Lasso, and Graph-weighted Fused Lasso]

Lung physiology-related traits I
  • Baseline FEV1 predicted value: MPVLung
  • Pre FEF 25-75 predicted value
  • Average nitric oxide value: online
  • Body Mass Index
  • Postbronchodilation FEV1, liters: Spirometry
  • Baseline FEV1 % predicted: Spirometry
  • Baseline predrug FEV1, % predicted
  • Baseline predrug FEV1, % predicted

Q551R SNP
  • Codes for amino-acid changes in the intracellular signaling portion of the receptor
  • Exon 12

Page 22: Associations to Quantitative Trait Network and Analysis of Asthma Data

Software for Genome-Phenome Association

[Screenshot labels: SNP File; Gene Expression File; Chromosome 12, positions 10600000 to 10700000; Gene module 1]

Page 23: Associations to Quantitative Trait Network and Analysis of Asthma Data

Future Work: Correlated Genome-Transcriptome-Phenome Association Analysis

Genome structure: linkage disequilibrium, population structure

Transcriptome structure: gene modules

Phenome structure: clinical traits

Methods shown in the diagram:
  • Bi-clustering
  • GFlasso
  • Tree lasso
  • Population lasso

Three-way association!

Page 24: Associations to Quantitative Trait Network and Analysis of Asthma Data

Thanks!

Software is available at http://sailing.cs.cmu.edu/gflasso

Acknowledgements: Ross Curtis, Kyung-Ah Sohn, Sally Wenzel

Page 25: Associations to Quantitative Trait Network and Analysis of Asthma Data

References

Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58:267–288.

Weller J, Wiggans G, Vanraden P, Ron M (1996) Application of a canonical transformation to detection of quantitative trait loci with the aid of genetic markers in a multi-trait experiment. Theoretical and Applied Genetics 92:998–1002.

Mangin B, Thoquet B, Grimsley N (1998) Pleiotropic QTL analysis. Biometrics 54:89–99.

Chen Y, Zhu J, Lum P, Yang X, Pinto S, et al. (2008) Variations in DNA elucidate molecular networks that cause disease. Nature 452:429–35.

Lee SI, Dudley A, Drubin D, Silver P, Krogan N, et al. (2009) Learning a prior on regulatory potential from eQTL data. PLoS Genetics 5:e1000358.

Emilsson V, Thorleifsson G, Zhang B, Leonardson A, Zink F, et al. (2008) Genetics of gene expression and its effect on disease. Nature 452:423–28.

Page 26: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multiple-Trait Association: Dependencies in Phenome

Traditional approach: association of a causal SNP with a univariate phenotype, e.g., disease/control status or gene expression level

Association with phenome: association of a causal SNP with a multivariate complex syndrome (e.g., asthma; traits such as age at onset and history of eczema) or with a genome-wide expression profile

[Figure: sequence ACGTTTTACTGTACAATT with a causal SNP, shown for both cases]

Page 27: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multiple-trait Association: Graph-Constrained Fused Lasso

Step 1: Thresholded correlation graph of phenotypes

Step 2: Graph-constrained fused lasso

[Figure: sequence ACGTTTTACTGTACAATT and the thresholded trait graph]

Terms labeled in the objective: lasso penalty; graph-constrained fusion penalty (fusion)
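The slide's own equation is not recoverable from the extracted text, so the following is a sketch of how the labeled terms could combine into one objective, not the authors' exact formula. It assumes $K$ traits with per-trait response $y_k$ and coefficient vector $\beta_k$, an edge set $E$ from the thresholded correlation graph, the correlation $r_{km}$ between traits $k$ and $m$, and a fusion weight $\gamma$; none of these symbols are defined in the slides.

```latex
\hat{B} = \arg\min_{B} \;
  \sum_{k=1}^{K} (y_k - X\beta_k)'(y_k - X\beta_k)
  + \lambda \sum_{k=1}^{K} \sum_{j=1}^{J} |\beta_{jk}|
  + \gamma \sum_{(k,m) \in E} \sum_{j=1}^{J}
      \left| \beta_{jk} - \operatorname{sign}(r_{km})\, \beta_{jm} \right|
```

For positively correlated traits the last term reduces to the $|\beta_{jk} - \beta_{jm}|$ form of the fusion penalty; the sign factor is an assumed extension for negatively correlated traits.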

Page 28: Associations to Quantitative Trait Network and Analysis of Asthma Data

Multiple-trait Association: Graph-Weighted Fused Lasso

Step 1: Thresholded correlation graph of phenotypes with weights

Step 2: Graph-weighted fused lasso

[Figure: sequence ACGTTTTACTGTACAATT and the weighted trait graph]

Terms labeled in the objective: lasso penalty; graph-weighted fusion penalty (weighted fusion)
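To make the difference between the constrained and weighted variants concrete, here is a sketch that evaluates a fused-lasso objective for a given coefficient matrix. The edge weight function (absolute correlation for the weighted variant, a constant 1 for the constrained one), the sign handling for negative correlations, and the names `gflasso_objective`, `lam`, and `gamma` are assumptions, since the slide's equation was not preserved.

```python
import numpy as np

def gflasso_objective(X, Y, B, edges, lam, gamma, weight=abs):
    """Squared loss + lasso penalty + (weighted) fusion penalty.

    edges is a list of (k, m, r_km) with r_km the correlation of traits k and m.
    weight maps r_km to an edge weight; use `lambda r: 1.0` for the
    graph-constrained variant (all edges weighted equally).
    """
    residual = Y - X @ B
    loss = np.sum(residual ** 2)
    lasso = lam * np.sum(np.abs(B))
    fusion = sum(
        weight(r) * np.sum(np.abs(B[:, k] - np.sign(r) * B[:, m]))
        for k, m, r in edges
    )
    return float(loss + lasso + gamma * fusion)

# Tiny hand-checkable example: 2 samples, 2 SNPs, 2 traits.
X = np.eye(2)
Y = np.array([[1.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.5],
              [0.0, 0.0]])
edges = [(0, 1, 0.8)]

# loss = 0.25, lasso = 1.5, weighted fusion = 0.8 * 0.5 = 0.4, times gamma = 0.8
obj_weighted = gflasso_objective(X, Y, B, edges, lam=1.0, gamma=2.0)

# Same, but every edge has weight 1: fusion = 0.5, times gamma = 1.0
obj_constrained = gflasso_objective(X, Y, B, edges, lam=1.0, gamma=2.0,
                                    weight=lambda r: 1.0)
print(obj_weighted, obj_constrained)
```

With weights tied to the correlation, weakly correlated edges contribute little to the penalty, which matches the remark that edges with small weights are effectively ignored.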

Page 29: Associations to Quantitative Trait Network and Analysis of Asthma Data

Estimating Parameters (Association Strength)

Quadratic programming formulation
  • Graph-constrained fused lasso
  • Graph-weighted fused lasso

Many publicly available software packages for solving convex optimization problems can be used

Page 30: Associations to Quantitative Trait Network and Analysis of Asthma Data

Simulation Results

50 SNPs taken from HapMap chromosome 7, CEU population

10 traits

[Figure panels: Trait Correlation Matrix; Thresholded Trait Correlation Network; SNPs by phenotypes results for True Regression Coefficients, Single SNP-Single Trait Test, Ridge Regression, Lasso, Graph-constrained Fused Lasso, and Graph-weighted Fused Lasso]

Page 31: Associations to Quantitative Trait Network and Analysis of Asthma Data

Results from Association

[Figure panels: Trait Network; Trait Correlation Matrix (phenotypes by phenotypes); SNPs by phenotypes results for Single-Marker Single-Trait Test, Lasso, Graph-constrained Fused Lasso, and Graph-weighted Fused Lasso]

Lung physiology-related traits II
  • Percent difference in FEV1: Spirometry
  • Post FEF 25-75 value
  • Postbronchodilation FEV1, % pred: Spirometry
  • Baseline FEV1, liters: Spirometry
  • Baseline predrug FEV1, liters
  • Maximum FEV1, liters: MPVLung
  • Baseline predrug FEV1, liters

Page 32: Associations to Quantitative Trait Network and Analysis of Asthma Data

Linkage Disequilibrium Structure in IL-4R gene

SNPs: Q551R, rs3024660, rs3024622

Pairwise LD values shown: $r^2 = 0.64$ and $r^2 = 0.07$
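A pairwise LD value like those above can be approximated from unphased genotypes as the squared Pearson correlation of allele counts. This is a common shortcut and an assumption here; the slide does not state how its values were computed, and the genotype vectors below are made up.

```python
import numpy as np

def ld_r2(g1, g2):
    """Squared Pearson correlation between two SNPs' allele counts (0, 1, 2)."""
    r = np.corrcoef(g1, g2)[0, 1]
    return float(r ** 2)

# Hypothetical allele counts for two SNPs across eight individuals.
g_a = np.array([0, 1, 2, 1, 0, 2, 1, 0], dtype=float)
g_b = np.array([0, 1, 2, 1, 0, 2, 0, 1], dtype=float)

r2_ab = ld_r2(g_a, g_b)
r2_aa = ld_r2(g_a, g_a)
print("r^2 between the two SNPs:", round(r2_ab, 3))
print("r^2 of a SNP with itself:", round(r2_aa, 3))
```

Values near 1 mean the two SNPs carry nearly the same information, so an association at one of them is hard to distinguish from an association at the other.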

Page 33: Associations to Quantitative Trait Network and Analysis of Asthma Data

Computation Time

Page 34: Associations to Quantitative Trait Network and Analysis of Asthma Data

Conclusions

Summary
  • Dependencies in phenome: the graph-guided fused lasso framework incorporates correlation information among traits to detect pleiotropic effects of genotypic variations.
  • Analysis of the asthma dataset suggests the effectiveness of the method.

Future Work
  • Dependencies in genome?: Poster Q06 (this evening)
  • Dependencies in both genome and phenome
  • Learn the trait correlation network and association strengths jointly

Availability: http://www.sailing.cs.cmu.edu/