Haplotypes and imputed genotypes in diverse human populations Noah Rosenberg April 29, 2009


  • Haplotypes and imputed genotypes in diverse human populations. Noah Rosenberg. April 29, 2009

  • Human Genome Diversity Cell Line Panel: 525,910 single-nucleotide polymorphisms in 29 populations. M Jakobsson et al. (2008) Nature 451:998-1003

  • Overview
    -How do we measure and compare haplotype diversity across populations?
    -Imputation in diverse populations

  • Which populations and genomic sites have more haplotype diversity? [Figure: haplotypes in Population 1 and Population 2]

    X0XX0X000X00X000000000000

    X0XXX00XX0X00000X00000000

    00000000000000000XX00X0XX

    000X0XX000000XXX000XX0000

    0X00X00XX0X00000X0000X0XX

    0X000X000X00X000000000000

    000X0X000X00X000000000000

    X00XX00XX0X00000X00000000

    0X000XX000000XXX000XX0000

    0000X00XX0X00000X0000X0XX

    0X00X00XX0X00000X0000X0XX

    0X0X0000000000000XX000000

    0X000000000000000XX000000

    00000XX000000XXX000XX0000

    0X000XX000000XXX000XX0000

    X0XX0X000X0XX000000000X00

    0000X00XX0X00000X00000000

    0X0X0X000X00X00000000X0XX

    0X00000XX0X00000X0000X0XX

    0X0X0XX000000XXX000XXX0XX


  • Which populations and genomic sites have more haplotype diversity? [Figure: haplotypes in Population 1 and Population 2] P Scheet, M Stephens (2006) AJHG 78:629-644

  • Which populations and genomic sites have more haplotype diversity? Population 1: number of haplotypes in each cluster at sites 1-25

    Blue    1112566666666666666663333
    Green   3332000000000000000000000
    Orange  0000000000000000000006666
    Pink    4444322222222222222221111
    Yellow  2222222222222222222220000

  • Which populations and genomic sites have more haplotype diversity?

    [Chart: heterozygosity by site number, Population 1 and Population 2]

    Haplotype cluster counts at sites 1-25 (the counts total 10 at every site)

    Population 1
    Blue    1112566666666666666663333
    Green   3332000000000000000000000
    Orange  0000000000000000000006666
    Pink    4444322222222222222221111
    Yellow  2222222222222222222220000

    Population 2
    Blue    2333111111111111111111111
    Green   4443000000000000000003333
    Orange  0001333333333333333333333
    Pink    2111333333333333333332222
    Yellow  2222333333333333333331111

    Heterozygosity
    Site number   Population 1   Population 2
    1             0.70           0.72
    2-3           0.70           0.70
    4             0.72           0.76
    5             0.62           0.72
    6-21          0.56           0.72
    22-25         0.54           0.76
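The slide does not state how heterozygosity is computed, but the plotted values are consistent with the usual expected-heterozygosity formula, 1 minus the sum of squared cluster frequencies at each site. The following is a minimal sketch under that assumption; the function name `cluster_heterozygosity` and the dictionary layout are illustrative choices, not taken from the talk. The counts are the per-site cluster counts from the slide's data sheet.

```python
# Per-site cluster counts (25 sites, 10 haplotypes per population),
# transcribed from the slide's data sheet. One digit per site.
POP1 = {
    "Blue":   "11125" + "6" * 16 + "3333",
    "Green":  "3332" + "0" * 21,
    "Orange": "0" * 21 + "6666",
    "Pink":   "44443" + "2" * 16 + "1111",
    "Yellow": "2" * 21 + "0000",
}
POP2 = {
    "Blue":   "2333" + "1" * 21,
    "Green":  "4443" + "0" * 17 + "3333",
    "Orange": "0001" + "3" * 21,
    "Pink":   "2111" + "3" * 17 + "2222",
    "Yellow": "2222" + "3" * 17 + "1111",
}


def cluster_heterozygosity(counts_by_cluster):
    """Return per-site heterozygosity, 1 - sum(p_k^2), rounded to 2 places.

    Assumption: heterozygosity is computed from cluster frequencies
    p_k = count_k / total at each site (the slide does not give the formula).
    """
    rows = [[int(ch) for ch in digits] for digits in counts_by_cluster.values()]
    values = []
    for site_counts in zip(*rows):  # iterate over sites (columns)
        total = sum(site_counts)
        homozygosity = sum(c * c for c in site_counts) / (total * total)
        values.append(round(1 - homozygosity, 2))
    return values


if __name__ == "__main__":
    for site, (h1, h2) in enumerate(
        zip(cluster_heterozygosity(POP1), cluster_heterozygosity(POP2)), start=1
    ):
        print(site, h1, h2)
```

Under this reading, Population 2 has equal or higher heterozygosity than Population 1 at every site, which matches the "less diversity" and "more diversity" labels on the next slide.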

  • Which populations and genomic sites have more haplotype diversity? Population 1: less diversity. Population 2: more diversity.

  • Haplotype cluster frequencies for a typical genomic region. M Jakobsson et al. (2008) Nature 451:998-1003

  • More haplotype diversity in Africa. [Figure panels: Africa, Europe, Middle East, East Asia, Oceania, America, C Asia] M Jakobsson et al. (2008) Nature 451:998-1003

  • Less haplotype homozygosity and more haplotype diversity in Africa. M Jakobsson et al. (2008) Nature 451:998-1003

  • Genetic diversity declines with distance from Africa. [Figure axis: haplotype heterozygosity]

  • Haplotype clusters recover population structure. [Figure labels: Africa, Middle East, Europe, Central/South Asia, Oceania, America, East Asia] M Jakobsson et al. (2008) Nature 451:998-1003

  • Haplotype clusters recover population structure. M Jakobsson et al. (2008) Nature 451:998-1003

  • Low haplotype diversity in the lactase region in Europe. [Figure panels: Africa, Europe, Middle East, East Asia, Oceania, America, C Asia] M Jakobsson et al. (2008) Nature 451:998-1003

  • Haplotype cluster homozygosity as a test for selection. [Figure panel: random region] M Jakobsson et al. (2008) Nature 451:998-1003

  • Haplotype diversity summary
    -Haplotype clusters can be used to encode haplotypes pointwise for measurement of diversity
    -Haplotype cluster diversity is greatest in Africa
    -Low haplotype cluster diversity can potentially be used to detect selection

  • Overview
    -Measuring haplotype diversity using haplotype clusters
    -Imputation in diverse populations

  • Genotypes can be imputed using a reference panel but imperfectly. Imputed genotypes can be tested for disease association. [Figure labels: study sample, genotyped positions]

  • Evaluating imputation accuracy in worldwide populations
    -443 individuals in 29 populations from the Human Genome Diversity Panel
    -Genotypes at >500,000 SNPs (Jakobsson et al. Nature 451:998-1003, 2008)
    -420 HapMap reference haplotypes of ~2,000,000 SNPs, omitting offspring in trios
    -Randomly hide 15% of genotypes in HGDP individuals and impute with MACH
    -Measure the proportion of alleles imputed correctly
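The hide-and-impute evaluation described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: genotypes are assumed to be coded as 0, 1, or 2 copies of an allele, "alleles imputed correctly" is read as 2 minus the absolute difference between true and imputed genotype, and `impute_by_mode` is a hypothetical stand-in for MACH that fills each hidden genotype with the most common observed genotype at that SNP. All function names are illustrative.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction=0.15, seed=0):
    """Hide a random fraction of genotypes; return (masked copy, hidden positions)."""
    rng = random.Random(seed)
    positions = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    hidden = rng.sample(positions, round(fraction * len(positions)))
    masked = [list(row) for row in genotypes]
    for i, j in hidden:
        masked[i][j] = None  # None marks a hidden genotype
    return masked, hidden


def impute_by_mode(masked):
    """Hypothetical stand-in for MACH: fill each gap with the SNP's modal genotype."""
    imputed = [list(row) for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in imputed:
            if row[j] is None:
                row[j] = fill
    return imputed


def allele_accuracy(truth, imputed, hidden):
    """Proportion of hidden alleles imputed correctly (assumed 0/1/2 coding)."""
    correct = sum(2 - abs(truth[i][j] - imputed[i][j]) for i, j in hidden)
    return correct / (2 * len(hidden))


if __name__ == "__main__":
    # Toy data: 4 individuals x 5 SNPs, made up for illustration only.
    truth = [
        [0, 1, 2, 0, 1],
        [0, 1, 2, 1, 1],
        [0, 2, 2, 0, 1],
        [1, 1, 2, 0, 0],
    ]
    masked, hidden = mask_genotypes(truth)
    imputed = impute_by_mode(masked)
    print(allele_accuracy(truth, imputed, hidden))
```

Counting alleles instead of whole genotypes gives partial credit when a heterozygote is imputed as a homozygote, which is one natural reading of "proportion of alleles imputed correctly".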

  • Imputation accuracy is predicted by haplotype diversity. [Figure axis: imputation accuracy] L Huang et al. (2009) AJHG 84:235-250

  • Imputation accuracy is greatest with a close reference panel. L Huang et al. (2009) AJHG 84:235-250

  • Highest-accuracy reference panels match geographic locations. [Figure panels: Africa; Europe/W Asia; E Asia/Oceania/Americas] L Huang et al. (2009) AJHG 84:235-250

  • Imputation accuracy can be increased using HapMap mixtures
    -Instead of imputing based on separate HapMap panels, impute from mixtures
    -Choose mixtures to have optimal size given specified ratios
    L Huang et al. (2009) AJHG 84:235-250
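One way to read "optimal size given specified ratios" is: for a chosen mixing ratio, take as many reference haplotypes as the available panels allow while keeping that ratio. The sketch below works under that assumption only; the slide does not spell out the procedure. The panel names and the 120/120/180 split of the 420 reference haplotypes are assumptions made for illustration, and `largest_mixture` is a hypothetical helper name.

```python
from fractions import Fraction
from math import floor


def largest_mixture(available, ratios):
    """Return per-panel haplotype counts for the largest mixture with the given ratios.

    available: haplotypes on hand per panel, e.g. {"CEU": 120, ...}
    ratios:    non-negative integer weights per panel, e.g. {"CEU": 2, "YRI": 1, ...}
    Assumption: "optimal size" means the largest total that respects the ratios.
    """
    total_weight = sum(ratios.values())
    shares = {panel: Fraction(w, total_weight) for panel, w in ratios.items()}
    # The panel that runs out first limits the total mixture size.
    max_total = min(
        Fraction(available[panel]) / share
        for panel, share in shares.items()
        if share > 0
    )
    return {panel: floor(share * max_total) for panel, share in shares.items()}


if __name__ == "__main__":
    # Assumed split of the 420 reference haplotypes (illustrative only).
    available = {"CEU": 120, "YRI": 120, "CHB+JPT": 180}
    print(largest_mixture(available, {"CEU": 1, "YRI": 1, "CHB+JPT": 1}))
    print(largest_mixture(available, {"CEU": 2, "YRI": 1, "CHB+JPT": 0}))
```

Exact fractions are used instead of floats so that a share such as 1/3 does not round a count down by one.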

  • Imputation accuracy can be increased using HapMap mixtures. L Huang et al. (2009) AJHG 84:235-250

  • Strategies to improve imputation studies

    -Increased sample size

    -Improved imputation algorithms

    -Improved use of reference panels

    -Development of additional reference panels

    -Improved haplotyping

    -Use of additional data from relatives

    Summary imputation accuracy

  • Imputation summary
    -Imputation error and sample size inflation are greatest in Africa
    -Several strategies may be available for improving imputation, including use of mixtures

  • Rosenberg lab: James Degnan, Mike DeGiorgio, Lucy Huang, Mattias Jakobsson, Trevor Pemberton, Paul Scheet, Zach Szpiech, Jenna VanLiere, Chaolong Wang
    Collaborators: Goncalo Abecasis (Michigan), Raph Gibbs (NIA), John Hardy (UCL), Yun Li (Michigan), Sonja Scholz (NIA), Andy Singleton (NIA)
    Funding: Alfred P. Sloan Foundation; Burroughs Wellcome Fund; National Institutes of Health; U of M Rackham Graduate School [M DeGiorgio]; U of M Center for Genetics in Health and Medicine [M Jakobsson]