Haplotypes and imputed genotypes in diverse human populations
Noah Rosenberg
April 29, 2009
Human Genome Diversity Cell Line Panel
-525,910 single-nucleotide polymorphisms in 29 populations
M Jakobsson et al. (2008) Nature 451:998-1003
Overview
-How do we measure and compare haplotype diversity across populations?
-Imputation in diverse populations
Which populations and genomic sites have more haplotype diversity?
[Figure: haplotypes from Population 1 and Population 2, shown as rows of X/0 alleles across 25 sites]
[Figure: haplotypes in Population 1 and Population 2]
P Scheet, M Stephens (2006) AJHG 78:629-644
[Figure: for Population 1, the number of haplotypes in each cluster (Blue, Green, Orange, Pink, Yellow) at each site, added one cluster at a time]
[Chart: heterozygosity (y-axis) by site number (x-axis) for Population 1 and Population 2]
Haplotype counts per cluster and heterozygosity (Het.) by site, 10 haplotypes per population:
         Population 1                              Population 2
Sites    Blue  Green  Orange  Pink  Yellow  Het.   Blue  Green  Orange  Pink  Yellow  Het.
1        1     3      0       4     2       0.70   2     4      0       2     2       0.72
2-3      1     3      0       4     2       0.70   3     4      0       1     2       0.70
4        2     2      0       4     2       0.72   3     3      1       1     2       0.76
5        5     0      0       3     2       0.62   1     0      3       3     3       0.72
6-21     6     0      0       2     2       0.56   1     0      3       3     3       0.72
22-25    3     0      6       1     0       0.54   1     3      3       2     1       0.76
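
The heterozygosity values listed for the two populations are consistent with computing one minus the sum of squared cluster frequencies at each site. The slides do not state this formula, so treat it as an assumption checked against the listed numbers. The function name below is illustrative; the counts are copied from the slide data for sites 1, 5, and 22.

```python
def cluster_heterozygosity(counts):
    """Return 1 - sum(p_k^2), where p_k is the frequency of cluster k at one site.

    Assumption: this is the heterozygosity plotted in the slides; the formula
    is not stated there, but it reproduces the listed values.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no haplotypes at this site")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Counts per cluster in the order (Blue, Green, Orange, Pink, Yellow).
population_1 = {1: (1, 3, 0, 4, 2), 5: (5, 0, 0, 3, 2), 22: (3, 0, 6, 1, 0)}
population_2 = {1: (2, 4, 0, 2, 2), 5: (1, 0, 3, 3, 3), 22: (1, 3, 3, 2, 1)}

for site in sorted(population_1):
    h1 = cluster_heterozygosity(population_1[site])
    h2 = cluster_heterozygosity(population_2[site])
    print(f"site {site}: Population 1 = {h1:.2f}, Population 2 = {h2:.2f}")
```

Because each haplotype is assigned to a cluster at every site, the measure can be evaluated site by site, which is what allows a per-site comparison of the two populations.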
Population 1: less diversity
Population 2: more diversity
Haplotype cluster frequencies for a typical genomic region
M Jakobsson et al. (2008) Nature 451:998-1003
More haplotype diversity in Africa
[Figure panels: Africa, Europe, Middle East, East Asia, Oceania, America, C Asia]
M Jakobsson et al. (2008) Nature 451:998-1003
Less haplotype homozygosity and more haplotype diversity in Africa
M Jakobsson et al. (2008) Nature 451:998-1003
Genetic diversity declines with distance from Africa
[Plot axis: haplotype heterozygosity]
Haplotype clusters recover population structure
[Figure labels: Africa, Middle East, Europe, Central/South Asia, Oceania, America, East Asia]
M Jakobsson et al. (2008) Nature 451:998-1003
Low haplotype diversity in the lactase region in Europe
[Figure panels: Africa, Europe, Middle East, East Asia, Oceania, America, C Asia]
M Jakobsson et al. (2008) Nature 451:998-1003
Haplotype cluster homozygosity as a test for selection
[Figure label: random region]
M Jakobsson et al. (2008) Nature 451:998-1003
Haplotype diversity summary
-Haplotype clusters can be used to encode haplotypes pointwise for measurement of diversity
-Haplotype cluster diversity is greatest in Africa
-Low haplotype cluster diversity can potentially be used to detect selection
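
The idea of using low cluster diversity to flag candidate regions can be sketched as an outlier comparison: compute haplotype cluster homozygosity (the sum of squared cluster frequencies) for a candidate region and ask how often randomly chosen regions score at least as high. This is only an illustration of the general idea, not the procedure used in the cited paper; the function names and all counts below are hypothetical.

```python
def cluster_homozygosity(counts):
    """Return sum(p_k^2), where p_k is the frequency of cluster k."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no haplotypes in this region")
    return sum((c / total) ** 2 for c in counts)


def empirical_tail_fraction(value, background):
    """Fraction of background scores that are at least as large as value."""
    if not background:
        raise ValueError("background must not be empty")
    return sum(1 for b in background if b >= value) / len(background)


# Hypothetical cluster counts: one candidate region dominated by a single
# cluster, and a handful of random regions used as the background.
candidate = (9, 1, 0, 0, 0)
random_regions = [
    (2, 2, 2, 2, 2),
    (3, 3, 2, 1, 1),
    (4, 2, 2, 1, 1),
    (5, 2, 1, 1, 1),
    (3, 2, 2, 2, 1),
]

background = [cluster_homozygosity(c) for c in random_regions]
score = cluster_homozygosity(candidate)
tail = empirical_tail_fraction(score, background)
print(f"candidate homozygosity = {score:.2f}, tail fraction = {tail:.2f}")
```

A small tail fraction means few random regions are as homozygous as the candidate. With real data the background would need many more regions than this toy list.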
Overview
-Measuring haplotype diversity using haplotype clusters
-Imputation in diverse populations
Genotypes can be imputed using a reference panel but imperfectly
[Figure labels: study sample, genotyped positions]
-Imputed genotypes can be tested for disease association
Evaluating imputation accuracy in worldwide populations
-443 individuals in 29 populations from the Human Genome Diversity Panel
-Genotypes at >500,000 SNPs (Jakobsson et al. Nature 451:998-1003, 2008)
-420 HapMap reference haplotypes of ~2,000,000 SNPs, omitting offspring in trios
-Randomly hide 15% of genotypes in HGDP individuals and impute with MACH
-Measure the proportion of alleles imputed correctly
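
The hide-and-score steps above can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded as 0, 1, or 2 copies of one allele, a hidden genotype counts as 2 - |true - imputed| correct alleles out of 2, and the imputer is a trivial placeholder that does not call MACH or use a reference panel. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def hide_genotypes(genotypes, fraction=0.15, seed=0):
    """Choose a random subset of (individual, SNP) cells to hide."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    return set(rng.sample(cells, round(fraction * len(cells))))


def impute_most_common(genotypes, hidden):
    """Placeholder imputer, NOT MACH: fill each hidden cell with the most
    common genotype still observed at that SNP (0 if none is observed)."""
    imputed = {}
    for i, j in sorted(hidden):
        observed = [row[j] for k, row in enumerate(genotypes) if (k, j) not in hidden]
        imputed[(i, j)] = Counter(observed).most_common(1)[0][0] if observed else 0
    return imputed


def allele_accuracy(genotypes, imputed):
    """Proportion of alleles imputed correctly over the hidden genotypes."""
    if not imputed:
        raise ValueError("no hidden genotypes to score")
    correct = sum(2 - abs(genotypes[i][j] - g) for (i, j), g in imputed.items())
    return correct / (2 * len(imputed))


# Toy data (hypothetical): 8 individuals x 10 SNPs.
rng = random.Random(1)
genotypes = [[rng.choice([0, 1, 2]) for _ in range(10)] for _ in range(8)]
hidden = hide_genotypes(genotypes)
imputed = impute_most_common(genotypes, hidden)
print(f"hidden genotypes: {len(hidden)}")
print(f"allele accuracy: {allele_accuracy(genotypes, imputed):.3f}")
```

Scoring only the hidden cells keeps the accuracy estimate independent of the genotypes the imputer was allowed to see.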
Imputation accuracy is predicted by haplotype diversity
[Plot axis: imputation accuracy]
L Huang et al. (2008) AJHG 84:235-250
Imputation accuracy is greatest with a close reference panel
L Huang et al. (2008) AJHG 84:235-250
Highest-accuracy reference panels match geographic locations
[Figure labels: Africa, Europe/W Asia, E Asia/Oceania/Americas]
L Huang et al. (2008) AJHG 84:235-250
Imputation accuracy can be increased using HapMap mixtures
-Instead of imputing based on separate HapMap panels, impute from mixtures
-Choose mixtures to have optimal size given specified ratios
L Huang et al. (2008) AJHG 84:235-250
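
One plausible reading of "optimal size given specified ratios" is the largest mixture that keeps the requested ratio without exceeding what any panel can supply. The sketch below takes that reading as an assumption; it is not confirmed by the slides. The panel sizes (120 CEU, 120 YRI, and 180 CHB+JPT haplotypes, totaling the 420 mentioned earlier) are also assumed, and the function name is illustrative.

```python
def largest_mixture(available, ratios):
    """Return per-panel haplotype counts in the given integer ratios,
    scaled up as far as the available panel sizes allow.

    available: panel name -> number of haplotypes on hand
    ratios:    panel name -> non-negative integer weight (0 or missing = unused)
    """
    used = {panel: r for panel, r in ratios.items() if r > 0}
    if not used:
        raise ValueError("at least one panel needs a positive ratio")
    # The panel that runs out first limits the common scale factor.
    scale = min(available[panel] // r for panel, r in used.items())
    return {panel: ratios.get(panel, 0) * scale for panel in available}


# Assumed panel sizes in haplotypes (not stated per panel in the slides).
hapmap = {"CEU": 120, "YRI": 120, "CHB+JPT": 180}

print(largest_mixture(hapmap, {"CEU": 1, "YRI": 1, "CHB+JPT": 1}))
print(largest_mixture(hapmap, {"CEU": 1, "YRI": 3}))
```

Using integer weights keeps the resulting counts exactly in the requested ratio, at the cost of occasionally leaving a few haplotypes unused.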
Strategies to improve imputation studies
-Increased sample size
-Improved imputation algorithms
-Improved use of reference panels
-Development of additional reference panels
-Improved haplotyping
-Use of additional data from relatives
Summary imputation accuracy
Imputation summary
-Imputation error and sample size inflation are greatest in Africa
-Several strategies may be available for improving imputation, including use of mixtures
Rosenberg lab: James Degnan, Mike DeGiorgio, Lucy Huang, Mattias Jakobsson, Trevor Pemberton, Paul Scheet, Zach Szpiech, Jenna VanLiere, Chaolong Wang
Collaborators: Goncalo Abecasis (Michigan), Raph Gibbs (NIA), John Hardy (UCL), Yun Li (Michigan), Sonja Scholz (NIA), Andy Singleton (NIA)
Funding: Alfred P. Sloan Foundation, Burroughs Wellcome Fund, National Institutes of Health, U of M Rackham Graduate School [M DeGiorgio], U of M Center for Genetics in Health and Medicine [M Jakobsson]