Sequencing 128 Ashkenazi Genomes: Implications for Medical Genetics and History
Shai Carmi
Department of Computer Science, Columbia University
Itsik Pe’er’s lab
UCLA, October 2014
Outline
• Ashkenazi Jewish Genetics: Background
• The Ashkenazi Genome Sequencing Project
• Segment Sharing and Population History
• Opportunities and Future Directions
Ashkenazi Jewish (AJ) Genetics: Significance
Medical genetics
• Large founder population
• Mendelian disorders
• Complex diseases
o Breast cancer, Parkinson’s, Crohn’s
Population genetics
• Debated origins
• Genetics of a founder event
mtDNA: Behar et al., 2004; Behar et al., 2006
Y chr: Behar et al., 2003; Behar et al., 2004
Disease genes: Risch et al., 2003; Slatkin, 2004
SNP arrays: Gusev et al., 2012; Palamara et al., 2012
Review: Ostrer and Skorecki, 2013
Founder Populations: Opportunities
Recent successes
• Greece
o Tachmazidou et al., 2013; HDL
• Finland
o Kurki et al., 2014; aneurysm
• Iceland
o Many papers; most recently Steinthorsdottir et al., 2014; T2D
• Ashkenazi Jews
o Hui et al., in preparation; Crohn’s
See also:
• Hatzikotoulas et al., 2014
• Zuk et al., 2014
[Figure: population size over time, up to the present, for a founder population (with a bottleneck) and a non-founder population; disease alleles are marked.]
Problem: Common genotyping platforms do not include alleles rare outside the founder population
Opportunities: Reduced Haplotype Diversity
[Figure: chromosomes in the sample. Observed data: full sequence for some chromosomes, partial sequence (SNP array, low-coverage sequence) for the others. After imputation: a nearly-complete inferred sequence.]
Problem: The Ashkenazi population is missing a reference panel of complete sequences
Opportunities: Personal Genomics in AJ
Personal clinical genomics is here
But genomes are hard to interpret
Problem: The Ashkenazi population is missing a reference panel of complete sequences
The Documented Ashkenazi History
• Ca. 1000: Small communities in Northern France, Rhineland
• Migration east
• Expansion
• Migration to US and Israel
• Origin?
• Founder event?
• European gene flow:
o Where?
o When?
o How much?
• Relation to other Jews?
Whole-genomes?
The Ashkenazi Genome Consortium
NY area labs interested in specific diseases
Quantify utility in medical genetics
Learn about population history
Phase I: 128 whole genomes (Completed*)
Phase II: ≈500 whole genomes (NYGC; under way)
Large cohorts of AJ cases
Impute
* Carmi et al., Nat Commun, 2014
Technical Details
Property                  Genome (exome)
Coverage                  ≈56x
Fraction called           96.7±0.3% (98.1%)
Concordance with arrays   99.67±0.25%
Ti/Tv ratio               2.14±0.004 (3.05)
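For readers unfamiliar with the Ti/Tv row, here is a minimal sketch of how a transition/transversion ratio can be computed from a list of single-nucleotide variants. The function name and the `(ref, alt)` input format are illustrative assumptions, not part of the project's pipeline.

```python
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
# Every other single-base change is a transversion.
BASES = set("ACGT")
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return (#transitions / #transversions) for an iterable of (ref, alt) pairs.

    Hypothetical helper: pairs that are not clean single-base substitutions
    are skipped rather than counted.
    """
    ti = 0
    tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # not a single-base substitution
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions: the ratio is undefined")
    return ti / tv
```

A genome-wide value near 2 and a higher value in exomes is the usual sanity check, which matches the numbers in the table.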
• Ashkenazi ancestry verified
• Some phenotypes exist
• Sequencing by Complete Genomics in three batches
o Uniform QC measures
• Error rate estimates
o Using runs-of-homozygosity and a duplicate
o SNVs: ≈10-40k errors per genome (FDR: 0.3-1.3%)
o Indels: ≈10-30k errors per genome (FDR: 2-6%)
• QC: Remove indels, poly-allelic variants, Hardy-Weinberg violations, low call rate
• Errors after QC: ≈5k per genome
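The QC bullet mentions removing Hardy-Weinberg violations. The slides do not say which test or threshold was used, so the following is only a sketch of one common choice: a one-degree-of-freedom chi-square test on genotype counts. The function name is hypothetical, and an exact test is usually preferred for rare variants.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic site.

    Takes observed genotype counts (hom-ref, het, hom-alt) and returns
    (chi2, p_value). Assumption: 1 degree of freedom, no continuity correction.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes called at this site")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic site: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A site would then be dropped when `p_value` falls below a chosen cutoff; the cutoff itself is a study-specific decision that the slides do not state.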
[Figure: heterozygous calls (hets) inside a run of homozygosity (roh).]
Comparison to Europeans
Comparison panels:
• 26 Flemish from Belgium (platform-matched)
• 87 North-West Europeans [CEU (1000 Genomes)]
[Figure panels: Fraction novel (%) (dbSNP135); Population-specific variants (25x25 genomes)]
An Ashkenazi reference panel filters more benign variants than a European panel.
AJ Clinical Genomics
AJ Medical Genetics: Imputation
An Ashkenazi reference panel improves imputation accuracy of AJ SNP arrays compared to the standard European panel.
[Figure: correlation between imputed and real data]
Rare variants (≤1%) accuracy: 87% vs 65%
Using Impute2
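Imputation accuracy here is reported as a correlation between imputed and real data. As a minimal sketch of that metric, the following computes the Pearson correlation between true genotypes (0/1/2 alternate-allele counts) and imputed dosages for one variant. The function name and input layout are assumptions for illustration; the slides do not specify whether r or r² was reported.

```python
import math


def pearson_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes and imputed dosages.

    Both inputs are equal-length sequences over the same individuals.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 entries")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_t * var_d)
```

For rare variants most individuals carry genotype 0, so the correlation is driven by the few carriers; that is why a population-matched panel can matter most in the ≤1% frequency range.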
AJ Medical Genetics: Applications
• Our consortium:
o An expanded carrier screening panel
o Pharmacogenetically-important alleles
o Low-frequency deletions in tumors
o Association studies: schizophrenia, Parkinson’s, Crohn’s, longevity, cancer
• Others:
o Frequency lookups (clinical/pedigrees)
o Association studies: Epilepsy, Autism, …
Principal Component Analysis (PCA)
Price et al., 2008; Olshen et al., 2008; Need et al., 2009; Kopelman et al., 2009; Atzmon et al., 2010; Behar et al., 2010; Bray et al., 2010; Guha et al., 2012; Behar et al., 2014
[Figure: PCA plot. Labeled groups: Ashkenazi Jews; Sephardi Jews (Italy, Turkey); Middle-East: Druze, Palestinians, Bedouins; Europe: Sardinians, Tuscans, Italians, Basque, French, Flemish.]
The Documented Ashkenazi History
• Origin?
• Founder event?
• European gene flow:
o Where?
o When?
o How much?
• Relation to other Jews?
Identical-by-Descent (IBD) Shared Segment
Formal definition: A contiguous segment inherited from a single, recent common ancestor.
[Figure: an IBD segment shared by two chromosomes through a common ancestor g generations ago. After Browning & Browning, 2012.]
What’s “recent”?
Identical-by-Descent (IBD) Shared Segment
Practical definition: A contiguous segment nearly identical over a sequence length longer than a cutoff.
• Requires strong genetic drift
• Segments are rare but long
o Probability of a site to be shared
o Segment length
• Current methods can detect segments of about 1cM or longer
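The practical definition above can be illustrated with a stripped-down sketch: scan two phased haplotypes and report runs of identical alleles whose genetic length reaches a cutoff. This is an assumption-laden toy, not any published detector: it requires exact identity (real tools tolerate some mismatches to get "nearly identical"), it ignores phasing errors, and the function and parameter names are hypothetical.

```python
def shared_segments(hap_a, hap_b, pos_cm, min_len_cm=1.0):
    """Return (start_cM, end_cM) runs where hap_a and hap_b are identical.

    hap_a, hap_b: allele sequences at the same ordered sites.
    pos_cm: genetic position of each site in centimorgans.
    Only runs spanning at least min_len_cm are kept.
    """
    if not (len(hap_a) == len(hap_b) == len(pos_cm)):
        raise ValueError("haplotypes and positions must have equal length")

    segments = []
    start = None  # index of the first site of the current identical run

    def close_run(end):
        # Keep the run [start, end] if it is long enough in cM.
        if pos_cm[end] - pos_cm[start] >= min_len_cm:
            segments.append((pos_cm[start], pos_cm[end]))

    for i in range(len(pos_cm)):
        if hap_a[i] == hap_b[i]:
            if start is None:
                start = i
        elif start is not None:
            close_run(i - 1)
            start = None
    if start is not None:
        close_run(len(pos_cm) - 1)
    return segments
```

Raising `min_len_cm` trades sensitivity for specificity: short identical runs are common by chance, while long ones are more likely to reflect a recent common ancestor.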
Applications
• A segment indicates recent co-ancestry:
o Disease mapping
o Pedigree reconstruction
o Detecting natural selection
o Demographic (historical) inference
o Estimating mutation rates
• Identical sequence across individuals:
o Resolving haplotypes (phasing)
o Imputation
o Estimating heritability
o Estimating genotyping error rate
Eskin’s lab
IBD Sharing Theory
• Model:
o A population with a constant effective size N
o Two chromosomes of length L (Morgans)
o A minimal segment length m (Morgans)
• The number of shared segments n_m?
• The fraction of the chromosome in shared segments f_m?
[Figure: a chromosome of length L with segments ℓ1, ℓ2, ℓ3 and the minimal length m.]
Results overview
• Under the Sequentially Markov Coalescent (SMC):
• The number of shared segments: [equation not recovered from the slide]
• The fraction of the chromosome in shared segments: [equation not recovered from the slide]
• Results for a more realistic coalescent model (SMC’)
• Implicit expressions for the distributions
• All results generalizable to variable population size
Palamara et al., 2012; Carmi et al., Genetics, 2013; Carmi et al., Theor Popul Biol, 2014
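To give a feel for this kind of result, here is a worked sketch for the mean shared fraction only, under assumptions that are mine and not necessarily those of the slide: the two chromosomes coalesce at a site after an exponentially distributed time with mean N generations (N counting haploid chromosomes; a diploid convention would rescale N), the segment around a site with a common ancestor t generations ago has an Erlang-2 length with rate 2t per Morgan, and chromosome-end effects are ignored. Under these assumptions the mean fraction in segments longer than m is (1+4mN)/(1+2mN)², which the code checks by numerical integration. This is not claimed to be the expression shown in the talk.

```python
import math


def survival_given_t(m, t):
    """P(segment around a site is longer than m Morgans | ancestor t generations ago).

    Assumption: Erlang-2 length with rate 2t (an exponential piece on each side).
    """
    return (1.0 + 2.0 * t * m) * math.exp(-2.0 * t * m)


def mean_fraction_closed_form(m, n):
    """Closed form of the integral below for constant size n."""
    x = 2.0 * m * n
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2


def mean_fraction_numeric(m, n, steps=200_000, t_max_factor=50.0):
    """Midpoint-rule integral of Exp(mean n) density times survival_given_t."""
    t_max = t_max_factor * n
    dt = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += math.exp(-t / n) / n * survival_given_t(m, t) * dt
    return total


if __name__ == "__main__":
    m, n = 0.01, 1000  # 1 cM cutoff, hypothetical population size
    print(mean_fraction_closed_form(m, n), mean_fraction_numeric(m, n))
```

The closed form makes the qualitative point of the section visible: for mN much larger than 1 the shared fraction falls off roughly as 1/(mN), so smaller populations share far more of the genome in long segments.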
Demographic Inference: Maximum Likelihood
Carmi et al., Theor Popul Biol, 2014
Use the distribution of the number of shared segments
Demographic Inference: A Practical Approach
Palamara et al., 2012
• Historical size N(t)=N0 ν(t).
• Mean fraction of the genome in segments of length ℓ1<ℓ<ℓ2: Eq. (1) [equation not recovered from the slide]
Method:
• Record IBD segments in each length bin
• Using Eq. (1), find the history N(t) that fits best
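The two-step method can be sketched as follows. Eq. (1) itself is not reproduced here; the code substitutes a stand-in expectation built on stated assumptions (discrete generations, pairwise coalescence probability 1/N(t) per generation with N counting haploid chromosomes, Erlang-2 segment lengths with rate 2t per Morgan, a truncated sum over generations). The function names, the least-squares criterion, and the restriction to constant-size candidates are all illustrative choices, not the published procedure.

```python
import math


def survival(length_m, t):
    """P(segment length > length_m Morgans | common ancestor t generations ago)."""
    return (1.0 + 2.0 * t * length_m) * math.exp(-2.0 * t * length_m)


def expected_fraction(l1, l2, size_at, max_gen=2000):
    """Expected genome fraction in segments with l1 < length < l2 (Morgans).

    size_at(t) gives the population size t generations ago (hypothetical callback).
    The sum stops at max_gen, which is adequate when l1 is about 1 cM or more.
    """
    not_coalesced = 1.0
    total = 0.0
    for t in range(1, max_gen + 1):
        n_t = size_at(t)
        p_coal = not_coalesced / n_t  # first coalescence exactly at generation t
        total += p_coal * (survival(l1, t) - survival(l2, t))
        not_coalesced *= 1.0 - 1.0 / n_t
    return total


def fit_constant_size(bins, observed, candidates):
    """Return the candidate constant size with the smallest squared error."""
    best_n, best_err = None, float("inf")
    for n in candidates:
        predicted = [expected_fraction(l1, l2, lambda t, n=n: n) for l1, l2 in bins]
        err = sum((p - o) ** 2 for p, o in zip(predicted, observed))
        if err < best_err:
            best_n, best_err = n, err
    return best_n


if __name__ == "__main__":
    # Synthetic check: generate "observations" from a known size, then refit.
    bins = [(0.01, 0.02), (0.02, 0.04), (0.04, 0.08)]
    observed = [expected_fraction(l1, l2, lambda t: 2000) for l1, l2 in bins]
    print(fit_constant_size(bins, observed, [500, 1000, 2000, 4000]))
```

Long bins are dominated by recent generations and short bins by older ones, which is why binning by length lets the fit distinguish, say, a bottleneck followed by growth from a constant size. A realistic fit would search over a parametric N(t) rather than a list of constants.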
Hypothetical example
IBD Sharing in Ashkenazi Jews
Gusev et al., 2012
A pair of AJ individuals shares ≈50cM in ≈15 long segments (>3cM)
Atzmon et al., 2010
Bray et al., 2010
[Figure: IBD sharing in AJ compared with EU]
Inferring the Bottleneck Size and Time
Carmi et al., Nat. Commun., 2014; Palamara et al., 2012
[Figure: inferred history; x-axis: Time (years)]
Caveats
• Phasing and sequencing errors; IBD detection errors
• Reasonable power only for 10-50 generations ago
• Model specification (e.g. prolonged bottleneck, admixture)
Parameter                      95% confidence interval
Ancestral size                 3654-5856
Bottleneck size                249-419
Growth rate (per generation)   16-53%
Bottleneck time (years)        625-800
• A bottleneck 700 years ago confirmed by an independent method: lengths of haplotypes around rare variants
o Mathieson and McVean, 2014
The Documented Ashkenazi History
• Origin?
• Founder event?
• European gene flow:
o Where?
o When?
o How much?
• Relation to other Jews?
Coverage by Shared Segments
What fraction of the genome can we cover with shared segments?
[Figure: a sequenced reference panel (full sequence) and a partly sequenced genome (partial sequence); imputation from the panel yields a nearly-complete inferred sequence.]
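The coverage question can be made concrete with a small sketch: given the segments that a target genome shares with any panel member, the covered fraction is the length of their union divided by the genome length. The function name and the `(start, end)` interval format are assumptions for illustration; units only need to be consistent (for example cM).

```python
def covered_fraction(segments, genome_length):
    """Fraction of [0, genome_length] covered by the union of (start, end) segments."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0.0
    cur_start = cur_end = None  # the merged interval currently being extended
    for start, end in sorted(segments):
        # Clip each segment to the genome before merging.
        start = max(start, 0.0)
        end = min(end, genome_length)
        if end <= start:
            continue
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # overlapping: extend the current interval
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length
```

Overlapping segments from different panel members count once, so coverage saturates as the panel grows; that saturation is what makes the question of panel size interesting.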
The Era of Near-Complete Coverage
[Figure annotations: Now; Phase II; Mine public data?; Other studies?]
Opportunities:
• Interpret personal genomes
o Time-stamp rare mutations
• Cost-effective large-scale association studies
o Resolve haplotypes
o Impute SNP arrays or low-coverage sequences
o Mapping rare variants/haplotypes
See Carmi et al., Genetics, 2013 for a theoretical analysis
The Era of Near-Complete Coverage
New algorithms needed!
[Figure: an IBD segment with a common ancestor g generations ago, used to time-stamp rare mutations.]
Ashkenazi History
• Origin?
• Founder event?
• European gene flow:
o Where?
o When?
o How much?
• Relation to other Jews?
The Place of European Gene Flow
“Most of these theories … are myths or speculation … based on some vague or misunderstood references. … It will probably be impossible to say definitely where the hundreds or thousands of Jews in Poland in the 13th to 14th centuries came from.”
B. Weinryb, The Jews of Poland, 1972
Approach
Johnson et al., 2011; Moreno-Estrada et al., 2013
[Figure: an Ashkenazi (AJ) genome with segments labeled EU and ME, shown alongside EU and ME reference genomes; PCA panels with axes PC1 and PC2.]
Preliminary Results
• Origin in the Levant
• Gene flow mostly from Western Europe, about 30 generations ago
• Sex-imbalanced history?
Summary
• It is important to study Ashkenazi genetics
• We sequenced 128 whole genomes
• Useful for personal clinical genomics and imputation
• Segment sharing reveals a founder event and suggests opportunities
My research statement
Acknowledgements
Funding: Human Frontier Science Program
Itsik Pe’er’s lab: James Xue, Ethan Kochav, Shuo Yang, Pier Palamara, Vladimir Vacic
TAGC consortium members:
Todd Lencz, Semanti Mukherjee (LIJMC)
Lorraine Clark, Xinmin Liu (CUMC)
Gil Atzmon, Harry Ostrer, Danny Ben-Avraham (AECOM)
Inga Peter, Judy Cho (ISMMS)
Ariel Darvasi (HUJI)
Joseph Vijai (MSKCC)
Ken Hui (Yale)
VIB Ghent, Belgium
Harvard University: Peter Wilton, John Wakeley
Sheba Medical Center: Eitan Friedman
Thank you for your attention!