SNP Selection

SNP Selection. University of Louisville, Center for Genetics and Molecular Medicine, January 10, 2008. Dana Crawford, PhD, Vanderbilt University, Center for Human Genetics Research.

Page 1: SNP Selection

SNP Selection

University of Louisville
Center for Genetics and Molecular Medicine

January 10, 2008

Dana Crawford, PhD
Vanderbilt University
Center for Human Genetics Research

Page 2: SNP Selection

Outline of Tutorial

• Concepts of tagSNPs

• LD and haplotype definitions

• Haplotype blocks and definitions

• Tools to identify tagSNPs

Page 3: SNP Selection

Why Do We Need tagSNPs?

Whole Genome:

• 15,000,000 SNPs

• 6,000,000 SNPs > 5% MAF

Too Many SNPs to Genotype!

Ex: E2F2

Average Gene:

• 26.5 kb

• 130 SNPs

• 44 SNPs ≥5% MAF

Page 4: SNP Selection

SNP Genotypes Are Correlated (aka linkage disequilibrium)

“the nonindependence of alleles at different sites.” Pritchard and Przeworski 2001

Genotype at one site can predict genotype at another site

Proportion of genotypes are correlated

Page 5: SNP Selection

Measuring Pair-wise SNP Correlations

• SNP genotype correlation described by linkage disequilibrium (LD)

• Pair-wise measures of LD: D′ and r²

$D = p_{AB} - p_A p_B$; $D' = D / D_{\max}$ (related to recombination)

$r^2 = \dfrac{D^2}{f(A_1)\, f(A_2)\, f(B_1)\, f(B_2)}$ (related to power)
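As a rough illustration of the two pair-wise measures, the sketch below computes D, |D′| and r² from one haplotype frequency and two allele frequencies. It assumes two biallelic SNPs, takes f(A1), f(A2) to be the two allele frequencies at the first SNP (p_A and 1 − p_A) and likewise for the second SNP, and uses the usual sign-dependent definition of Dmax. The function name `pairwise_ld` and the example frequencies are made up for illustration.

```python
def pairwise_ld(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D (standard convention, assumed here).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # Denominator of r^2: product of all four allele frequencies.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: allele A (30%) always travels with allele B (50%).
d, d_prime, r2 = pairwise_ld(p_ab=0.3, p_a=0.3, p_b=0.5)
print(f"D={d:.3f}  |D'|={d_prime:.2f}  r2={r2:.3f}")
```

In this made-up example |D′| is 1 while r² is well below 1, which is why the two statistics answer different questions.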

Page 6: SNP Selection

LD Statistics: Practical Uses

• r² determines power: the sample size needed to keep the same power scales as 1/r² (“effective sample size”)

r² = 1.0: 1,000 cases, 1,000 controls

r² = 0.80: 1,250 cases, 1,250 controls

• D′ is related to recombination history

D′ = 1: no recombination

D′ < 1: historical recombination
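The 1/r² rule on this slide is simple arithmetic, sketched below. The helper name `required_sample_size` is hypothetical; it only rescales a sample size that would suffice if the causal SNP were typed directly.

```python
import math


def required_sample_size(n_direct: int, r2: float) -> int:
    """Sample size needed when testing a tagSNP in LD (r2) with the causal SNP.

    n_direct: sample size that would suffice if the causal SNP were genotyped.
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    # Small epsilon guards against floating-point noise before rounding up.
    return math.ceil(n_direct / r2 - 1e-9)


for r2 in (1.0, 0.8, 0.5):
    print(r2, required_sample_size(1000, r2))
```

With 1,000 cases at r² = 1.0, this gives 1,250 cases at r² = 0.80, matching the slide.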

Page 7: SNP Selection

Where to Find Population LD Statistics

For your gene or region of interest, search

• HapMap www.hapmap.org

• Perlegen genome.perlegen.com

• SeattleSNPs PGA pga.gs.washington.edu

• NIEHS SNPs egp.gs.washington.edu

Page 9: SNP Selection

Visualizing Pair-wise LD

Page 12: SNP Selection

Where to Find Population LD Statistics

For your gene or region of interest, search

• HapMap www.hapmap.org

• Perlegen genome.perlegen.com

• SeattleSNPs PGA pga.gs.washington.edu

• NIEHS SNPs egp.gs.washington.edu

Genome Variation Server

Page 13: SNP Selection

Visualizing Pair-wise LD

Page 22: SNP Selection

Multi-SNP Genotype Correlations (aka Haplotypes)

“…a unique combination of genetic markers present in a chromosome.” pg 57 in Hartl & Clark, 1997

Page 23: SNP Selection

Constructing Haplotypes

• Collect pedigrees

• Somatic cell hybrids (human × rodent hybrid)

• Allele-specific PCR

[Figure: example for SNP 1 (C/T) and SNP 2 (A/G), showing pedigree genotypes and the resolved haplotypes]

Page 24: SNP Selection

Constructing Haplotypes

Examples of Haplotype Inference Software:

EM Algorithm

• Haploview http://www.broad.mit.edu/mpg/haploview/index.php

• Arlequin http://lgb.unige.ch/arlequin/

PHASE v2.1 http://www.stat.washington.edu/stephens/software.html

HAPLOTYPER http://www.people.fas.harvard.edu/~junliu/Haplo/docMain.htm
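To give a feel for what EM-based haplotype inference does, here is a minimal textbook sketch for two biallelic SNPs. It is not the implementation used by Haploview, Arlequin, or any other package listed here. Genotypes are assumed to be coded as counts (0, 1, 2) of alleles A and B, and only double heterozygotes have ambiguous phase. The function name `em_haplotype_freqs` and the toy data are hypothetical.

```python
from collections import Counter

# Haplotype pair implied by each genotype with unambiguous phase.
# Genotype = (copies of allele A at SNP 1, copies of allele B at SNP 2).
UNAMBIGUOUS = {
    (2, 2): ("AB", "AB"), (2, 1): ("AB", "Ab"), (2, 0): ("Ab", "Ab"),
    (1, 2): ("AB", "aB"), (1, 0): ("Ab", "ab"),
    (0, 2): ("aB", "aB"), (0, 1): ("aB", "ab"), (0, 0): ("ab", "ab"),
}


def em_haplotype_freqs(genotypes, n_iter=50):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM."""
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    freqs = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
    for _ in range(n_iter):
        expected = dict.fromkeys(freqs, 0.0)
        for geno, n in counts.items():
            if geno == (1, 1):
                # E-step: a double heterozygote is either AB/ab or Ab/aB.
                cis = freqs["AB"] * freqs["ab"]
                trans = freqs["Ab"] * freqs["aB"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                expected["AB"] += n * w
                expected["ab"] += n * w
                expected["Ab"] += n * (1 - w)
                expected["aB"] += n * (1 - w)
            else:
                for hap in UNAMBIGUOUS[geno]:
                    expected[hap] += n
        # M-step: expected haplotype counts become the new frequencies.
        freqs = {hap: c / n_chrom for hap, c in expected.items()}
    return freqs


# Toy data: 5 AA/BB, 5 aa/bb, and 10 double heterozygotes.
toy = [(2, 2)] * 5 + [(0, 0)] * 5 + [(1, 1)] * 10
print(em_haplotype_freqs(toy))
```

Because the unambiguous individuals carry only AB and ab, the EM iterations assign the double heterozygotes almost entirely to the AB/ab phase.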

Page 25: SNP Selection

Haplotypes in NIEHS SNPs

• >625 genes re-sequenced: cell cycle, DNA repair/replication, apoptosis

• 2 DNA panels

1: Polymorphism Discovery Resource (PDR90)

2: Europeans, Africans, Hispanics, and Asians

• PHASE v2.0 results posted on website

• Interactive tool (VH1) to visualize and sort haplotypes

http://egp.gs.washington.edu

Page 38: SNP Selection

Using LD and Haplotypes to Pick tagSNPs

• r² determines power: the sample size needed to keep the same power scales as 1/r² (“effective sample size”)

r² = 1.0: 1,000 cases, 1,000 controls

r² = 0.80: 1,250 cases, 1,250 controls

Example: Tagger and LDSelect

• D′ is related to recombination history

D′ = 1: no recombination

D′ < 1: historical recombination

Example: Haplotype “blocks”

Page 39: SNP Selection

Using LD and Haplotypes to Pick tagSNPs

• r² determines power: the sample size needed to keep the same power scales as 1/r² (“effective sample size”)

r² = 1.0: 1,000 cases, 1,000 controls

r² = 0.80: 1,250 cases, 1,250 controls

Example: Tagger and LDSelect

Discovery genotype data → pair-wise LD → pick tagSNPs
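The middle step of this workflow, going from discovery genotypes to pair-wise LD, can be approximated without phasing by squaring the correlation between genotype codes. The sketch below assumes genotypes coded 0, 1, 2 (copies of one allele) with no missing data. The name `genotype_r2` is hypothetical, and this is an illustration of the idea rather than the exact calculation used by any particular tool.

```python
def genotype_r2(x: list[int], y: list[int]) -> float:
    """Squared Pearson correlation between two SNPs' genotype codes (0/1/2)."""
    if len(x) != len(y) or not x:
        raise ValueError("need two equal-length, non-empty genotype vectors")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)


# Toy genotypes for three hypothetical SNPs in four samples.
snp1 = [0, 1, 2, 1]
snp2 = [0, 1, 2, 1]   # perfectly correlated with snp1
snp3 = [1, 2, 1, 0]   # uncorrelated with snp1 in this toy data
print(genotype_r2(snp1, snp2), genotype_r2(snp1, snp3))
```

The resulting pair-wise r² values are the input to the tagSNP selection step.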

Page 40: SNP Selection

LDSelect: Using LD to Pick tagSNPs

• Uses SNP discovery data (not haplotypes)

• Finds all correlated SNP genotypes to minimize the total number of SNPs to genotype

• Maintains genetic diversity of locus

Carlson et al. AJHG (2004)
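The bullets above describe a greedy grouping of correlated SNPs. The following is a simplified sketch of that idea, not the LDSelect program itself. It assumes pair-wise r² values are already available, uses an r² threshold of 0.8 as an example, and reports one tagSNP per bin. The function name `greedy_tag_snps` and the SNP labels are hypothetical.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedily bin SNPs by pair-wise r2; return a list of (tagSNP, tagged SNPs).

    snps: list of SNP labels
    r2: dict mapping (snp_a, snp_b) to r2; missing pairs are treated as 0
    """
    def ld(a, b):
        return r2.get((a, b), r2.get((b, a), 0.0))

    remaining = list(snps)
    bins = []
    while remaining:
        # Pick the SNP that exceeds the threshold with the most remaining SNPs.
        best = max(
            remaining,
            key=lambda s: sum(ld(s, t) >= threshold for t in remaining if t != s),
        )
        tagged = [t for t in remaining if t != best and ld(best, t) >= threshold]
        bins.append((best, tagged))
        remaining = [t for t in remaining if t != best and t not in tagged]
    return bins


# Toy r2 values for four hypothetical SNPs.
toy_r2 = {("snp1", "snp2"): 0.90, ("snp2", "snp3"): 0.85, ("snp1", "snp3"): 0.60}
for tag, tagged in greedy_tag_snps(["snp1", "snp2", "snp3", "snp4"], toy_r2):
    print(tag, "tags", tagged)
```

In the toy data, snp2 captures snp1 and snp3, and snp4 sits alone in its own bin, so two SNPs are genotyped instead of four.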

Page 41: SNP Selection

TagSNPs Are Population Specific

European-descent (BLM)

African-descent (BLM)

Page 42: SNP Selection

SNP Selection: tagSNP Data BLM

Page 43: SNP Selection

Side Note: Categorizing tagSNPs

• SNP context: Nonrepetitive > repetitive

• Location of SNP: Coding > noncoding

• Function: Nonsynonymous > synonymous

Page 44: SNP Selection

Categorizing tagSNPs

LPO

Page 45: SNP Selection

Haplotypes in Genetic Association Studies

Two main approaches with haplotypes:

Haplotypes → Pick tagSNPs → Genotype samples

Pick tagSNPs → Infer haplotypes → Test for association

Page 46: SNP Selection

Haplotypes in Genetic Association Studies

Two main approaches with haplotypes:

Haplotypes → Pick tagSNPs → Genotype samples

Pick tagSNPs → Infer haplotypes → Test for association

• Recombination

• Natural selection

• Population history

• Population demography

Haplotype block definition

Page 47: SNP Selection

Haplotype “Blocks”

Strong LD → Few haplotypes → Represent most chromosomes

Daly et al. Nat. Genet. (2001)

Page 48: SNP Selection

Block Definitions

Daly et al. Nat. Genet. (2001)

D′ [Gabriel et al. Science (2002)]

Page 49: SNP Selection

Block Definitions

Four-gamete test:

Haplotypes observed: A B, a b → <4 haplotypes, D′ = 1 → block

Haplotypes observed: A B, a b, A b, a B → 4 haplotypes, D′ < 1 → boundary
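The four-gamete test reduces to counting distinct two-SNP haplotypes, as in the minimal sketch below. It assumes phased haplotypes for two biallelic SNPs with no missing data. The function name `four_gamete_test` is hypothetical, and real block-finding software applies further thresholds.

```python
def four_gamete_test(haplotypes: list[tuple[str, str]]) -> str:
    """Classify a SNP pair as 'block' or 'boundary' by the four-gamete test.

    haplotypes: one (allele at SNP 1, allele at SNP 2) tuple per chromosome.
    """
    distinct = set(haplotypes)
    # All four allele combinations present implies historical recombination
    # (or recurrent mutation) between the two sites.
    return "boundary" if len(distinct) == 4 else "block"


no_recombination = [("A", "B"), ("a", "b"), ("A", "B"), ("a", "b")]
recombination = [("A", "B"), ("a", "b"), ("A", "b"), ("a", "B")]
print(four_gamete_test(no_recombination))  # block
print(four_gamete_test(recombination))     # boundary
```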

Page 50: SNP Selection

Haplotype Blocks and tagSNPs

Identifying blocks and tagSNPs:

• Manually: Visual haplotype

• Algorithms: HapMap and Haploview

Page 51: SNP Selection

Haplotype Blocks and tagSNPs

LTA: 16 SNPs (MAF >10%)

6 “common” haplotypes

tagSNPs

Page 52: SNP Selection

Haplotype Blocks and tagSNPs

Identifying blocks and tagSNPs:

• Manually: Visual haplotype

• Algorithms: HapMap and Haploview

Page 53: SNP Selection

HapMap Data and Haploview

www.hapmap.org

Page 54: SNP Selection
Page 55: SNP Selection
Page 56: SNP Selection

HapMap Data and Haploview

Page 57: SNP Selection
Page 58: SNP Selection
Page 59: SNP Selection

http://www.broad.mit.edu/mpg/haploview/

Page 60: SNP Selection

Import HapMap Data into Haploview

Page 61: SNP Selection
Page 62: SNP Selection
Page 63: SNP Selection
Page 64: SNP Selection
Page 65: SNP Selection
Page 66: SNP Selection
Page 67: SNP Selection
Page 68: SNP Selection

Note: HapMap is not complete variation data

Page 69: SNP Selection

Variation data, LD, and tagSNPs for ANAPC10 in European-Americans

HapMap: 5 tagSNPs

NIEHS SNPs: 12 tagSNPs

Page 70: SNP Selection

tagSNPs and Genome Variation Server

Page 71: SNP Selection
Page 72: SNP Selection

Note: Tagger is essentially the same as LDSelect

Page 73: SNP Selection
Page 74: SNP Selection

Haplotypes, TagSNPs, and Caveats

• Haplotypes are inferred

• Block-like structure assumed for some software

• Different block definitions

• Block boundaries sensitive to marker density

• Genotype savings may not be great (recombination)

tagSNPs based on LD more popular than htSNPs

Page 75: SNP Selection

SNP Selection Summary

• Resources available for pair-wise LD and haplotypes

• Software for tagSNP selection available

• Be aware of the limitations of the approach you choose

• Be aware that some SNP datasets may not represent all common variation of a gene or gene region

• Be aware that a fraction of tagSNPs do not convert into a successful genotyping assay