10
Whole genome association studies Introduction and practical Boulder, March 2009

Whole genome association studies Introduction and practical Boulder, March 2009

Embed Size (px)

Citation preview

Page 1: Whole genome association studies Introduction and practical Boulder, March 2009

Whole genome association studies

Introduction and practical

Boulder, March 2009

Page 2: Whole genome association studies Introduction and practical Boulder, March 2009

100K+ SNPs, CNVs →100K+ SNPs, CNVs →

← ~

1000

s in

divi

dual

s←

~10

00s

indi

vidu

als

WholeGenome

AssociationStudy

WholeGenome

AssociationStudy

Page 3: Whole genome association studies Introduction and practical Boulder, March 2009

…?

Control

Control

Scz

Scz

Associating phenotypic and genotypic variationAssociating phenotypic and genotypic variation

Page 4: Whole genome association studies Introduction and practical Boulder, March 2009

Indirect association (linkage disequilibrium)

2 2 1 2 1 2 2 1 2 2 2 2 1 2

1 0 1 1 1 1 1 0 1 2 1 2 1 1

MDS PCA

Reference haplotypes(HapMap)

Observed genotypes

Imputed genotypes

Direct test of single SNP effect

Imputation of ungenotyped SNPs1) Increase coverage2) Facilitate meta-analysis across platforms3) Quality control (drop SNP/re-impute)

Empirical assessmentof ancestry1) Detect outliers, substructure2) Clusters or continuous indices3) Batch effects, relatedness, sample swaps, contamination, etc

Analytic tools to perform, validate and enhance basic single SNP WGAS, e.g.:

Page 5: Whole genome association studies Introduction and practical Boulder, March 2009

Age-related macular degenerationAge-related macular degeneration

Page 6: Whole genome association studies Introduction and practical Boulder, March 2009

Progress in type 2 diabetes and Crohn’s diseaseProgress in type 2 diabetes and Crohn’s disease

KCNJ11

PPARG

2000 2001 2002 2003 2004 2005 2006 2007 2008

TCF2WSF1CDKN2B/AIGF2BP2CDKAL1HHEXSLC30A8TCF7L2

NOD2

5q31

5p1310q213p21PTPN2IRGMIL12BNKX2-3

T2D - confirmed associated loci

Crohn’s - confirmed associated loci

TNFSF15 IL23RATG16L1

JAZF1CDC123ADAMTS9THADANOTCH2TSPAN8

PTPN22ITLN11q241q32CDKAL1MHC6q21CCR67p128q24

JAK210p1111q1312q1213q14ORMDL3STAT319p1321q21ICOSLG

Slide courtesy of Mark Daly

Page 7: Whole genome association studies Introduction and practical Boulder, March 2009

5 x 10-8

CACNA1CAnkryin-G (ANK3)

Bipolar WGAS of 10,648 samplesBipolar WGAS of 10,648 samples

Sample Cases Controls P-valueSTEP 7.4% 5.8% 0.0013WTCCC 7.6% 5.9% 0.0008EXT 7.3% 4.7% 0.0002Total 7.5% 5.6% 9.1×10-9

Sample Case Controls P-valueSTEP 35.7% 32.4% 0.0015WTCCC 35.7% 31.5% 0.0003EXT 35.3% 33.7% 0.0108Total 35.6% 32.4% 7×10-8

X

>1.7 million genotyped and (high confidence) imputed SNPs

Ferreira et al (Nature Genetics, 2008)

Page 8: Whole genome association studies Introduction and practical Boulder, March 2009

Main focus of many association studies:

additive effects of single common SNPs on disease

Main focus of many association studies:

additive effects of single common SNPs on disease

Interactionsdominant,recessive

Joint tests of aggregate effect, genes, pathways

Variants <1% frequency

Structural variants

Subtypes,endophenotypes

Page 9: Whole genome association studies Introduction and practical Boulder, March 2009

Altshuler, Daly & Lander (2008) Science

Manolio, Brooks & Collins (2008) JCI

Further reading on association mapping and interpretation of GWAS findings

Maher (2008) Nature

Page 10: Whole genome association studies Introduction and practical Boulder, March 2009

Practical session• Data are in ~pshaun/prac2/

• Software required: PLINK and Haploview

• PDF with instructions is ~pshaun/instruct.pdf

• Work through until section “Empirical assessment of population stratification”

• Use PLINK website for help (http://pngu.mgh.harvard.edu/purcell/plink/)