Genome-Wide SNP Genotyping in Grape – What is Next?
Part of National Genetic Trait Index Project
CRIS# 1907-21000-030-00D
USDA-ARS
Geneva, Cornell, Davis, Cold Spring Harbor
Acknowledgement: Sean Myles/Doreen Ware
Team
• Edward Buckler and Sean Myles – Genomics and statistical analysis
• Doreen Ware, Jer-Ming Chia, Bonnie Hurwitz – Bioinformatics
• Charles Simon, Gan-Yuan Zhong, Mallikarjuna Aradhya, Bernard Prins – Germplasm
Genus Vitis
• Contains over 60 species mostly found in temperate regions of the northern hemisphere (both Old and New World distributions) – ~3500 accessions
• Vitis vinifera is the most important domesticated species cultivated for table grapes and wine making (~1300 Accessions)
• The wild grape Vitis sylvestris is considered the progenitor of the domesticated grape
• Highly heterozygous and low LD (~200bp)
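The low-LD bullet can be made concrete with a small calculation. The sketch below estimates pairwise LD between two SNPs as the squared Pearson correlation of genotype dosages (0, 1, 2 copies of the alternate allele), a common approximation for unphased diploid data. It is an illustration only, not the analysis used in this project; the function name `genotype_r2` and the toy genotypes are hypothetical.

```python
from statistics import mean


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    g1 and g2 are equal-length lists of dosages (0, 1 or 2) for the
    same samples in the same order. Hypothetical helper for illustration.
    """
    if len(g1) != len(g2) or len(g1) < 2:
        raise ValueError("need two equal-length genotype vectors")
    m1, m2 = mean(g1), mean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    var1 = sum((a - m1) ** 2 for a in g1)
    var2 = sum((b - m2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var1 * var2)


# Toy genotypes for eight samples (made-up values, not project data).
snp_a = [0, 1, 2, 1, 0, 2, 1, 0]
snp_b = [0, 1, 2, 1, 0, 2, 1, 0]  # identical to snp_a: complete LD
snp_c = [2, 0, 1, 1, 0, 2, 0, 1]  # unrelated pattern: weaker LD

print(genotype_r2(snp_a, snp_b))
print(genotype_r2(snp_a, snp_c))
```

If r² between markers decays within roughly 200 bp, as the slide states, a marker tags only its immediate neighborhood, which is why association mapping in grape calls for dense genome-wide SNP coverage.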
Genetic Diversity in the Domesticated Grape
[Figure panels: Cluster Density, Cluster Size, Berry Size, Berry Shape]
Objectives
1. Grape as a Model Crop for National Genetic Trait Index (NGTI)
2. Characterization of Molecular Diversity – Functional Variability
3. Genome-wide Association Mapping
4. Identify Markers Associated With Economic Traits
5. Develop Strategies for Marker Assisted Breeding – Juvenile Selection in Perennial/Tree Crops
Steps
• Step 1: SNP Discovery - Next-Generation Sequencing to sample diversity
– DNA preparation, sequencing method and analysis of sequencing reads for variation
– Characterization of SNPs: position, allele support, and coverage
– 10k SNP array development
• Step 2: Genotype and Assemble Data for Analysis
• Step 3: Phenotyping
Step 1: Discovery of genetic variants (SNPs)
Diverse Samples: 10 cultivated Vitis varieties (Vitis vinifera), 6 wild Vitis species
Genome complexity reduction: Digestion with HpaII restriction enzyme
Illumina/Solexa sequencing: Sequencing by synthesis
60 million sequences (total: 2 billion base pairs of sequence)
Discovery of >1 million SNPs
Make data available: Integrate SNP data into public grape genome browser
SNP Discovery Panel
• Goal: Capture recent variation in the genus Vitis
• RRLs constructed from 10 domesticated cultivars and 6 wild species
1. Ehrenfelser
2. French Colombard
3. Gewurztraminer
4. Kadarka
5. Malvasia
6. Muscat of Alexandria
7. Pinot Noir
8. Plavac Mali
9. Thompson Seedless
10. White Riesling
11. Vitis amurensis
12. Vitis cinerea
13. Vitis labrusca
14. Vitis palmata
15. Vitis rotundifolia
16. Vitis sylvestris
17. Inbred Pinot Noir (Reference Genome)
Library Construction Protocol: Reducing the complexity of the Genome
1. DNA Extraction
2. Whole Genome Amplification*
3. Genome Complexity Reduction: Restriction enzyme digest
4. Size Selection from Gel: 100-600bp
5. Addition of ‘A’ Base to 3' ends
6. Ligation of Solexa Adaptors
7. Solexa Genome Analyzer
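The complexity-reduction and size-selection steps of the protocol can be sketched in silico. The example below cuts a sequence at HpaII recognition sites (CCGG, cut after the first C) and keeps fragments in the 100-600 bp window named on the slide. The function names and the toy sequence are hypothetical. The sketch also ignores that HpaII is blocked by CpG methylation, so a real digest would release fewer fragments than this predicts.

```python
import re

HPAII_SITE = "CCGG"  # HpaII recognition sequence
CUT_OFFSET = 1       # HpaII cuts between the first C and CGG


def digest(sequence, site=HPAII_SITE, cut_offset=CUT_OFFSET):
    """Return the fragments produced by cutting at every occurrence of site."""
    seq = sequence.upper()
    cuts = [m.start() + cut_offset for m in re.finditer(re.escape(site), seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len=100, max_len=600):
    """Keep fragments inside the gel size-selection window (inclusive)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence with two HpaII sites (made up for illustration).
toy_sequence = "A" * 150 + "CCGG" + "T" * 300 + "CCGG" + "G" * 50
fragments = digest(toy_sequence)
selected = size_select(fragments)

print([len(f) for f in fragments])  # → [151, 304, 53]
print([len(f) for f in selected])   # → [151, 304]
```

Only the fragments that fall inside the window would go on to adaptor ligation and sequencing, which is what reduces the fraction of the genome being sampled.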
Next-Generation Sequence Analysis Workflow
[Workflow diagram. Recoverable stage labels:]
• Image files from Solexa GA
• Base Calling (Firecrest, Bustard)
• Sequence and Base Quality
• Data Storage
• Read Mapping: Ungapped Alignment, Gapped Alignment; decision point "Mapped to genome?" (YES/NO)
• Alignments
• Aln Consensus & Quality
• Variation Discovery
• Filters
• Called SNPs
• Variation
• Data Accessibility
Overview of the Solexa SNP pipeline
1. 56 Million reads (1.8 billion bp) are aligned to the reference genome
– The divergence within V. vinifera and with other Vitis is so great we need to develop other algorithms to map the reads
2. 1.1 Million regions of the genome have potential SNPs, which are statistically evaluated for genotypic basis.
3. 50,000 high probability SNPs are identified
4. Empirically validating a small subset of the data.
5. With improved algorithms and increased knowledge of grape diversity, we may be able to extract 100,000s of SNPs.
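The narrowing from candidate regions to high-probability SNPs can be illustrated with a simple filter on coverage and allele support, the two properties listed under SNP characterization. This is a minimal sketch with hard thresholds, not the statistical evaluation the pipeline applies. The class name, field names, threshold values, and candidate records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateSNP:
    chrom: str
    pos: int
    ref_count: int  # reads supporting the reference allele
    alt_count: int  # reads supporting the alternate allele

    @property
    def coverage(self):
        return self.ref_count + self.alt_count


def passes_filters(snp, min_coverage=8, min_alt_reads=3, min_alt_fraction=0.2):
    """Apply illustrative coverage and allele-support thresholds."""
    if snp.coverage < min_coverage:
        return False  # too few reads to judge the site
    if snp.alt_count < min_alt_reads:
        return False  # alternate allele could be a sequencing error
    return snp.alt_count / snp.coverage >= min_alt_fraction


# Made-up candidates, not project data.
candidates = [
    CandidateSNP("chr1", 1042, ref_count=10, alt_count=9),  # well supported
    CandidateSNP("chr1", 5310, ref_count=3, alt_count=1),   # low coverage
    CandidateSNP("chr2", 77, ref_count=30, alt_count=2),    # weak alt support
    CandidateSNP("chr3", 900, ref_count=40, alt_count=5),   # low alt fraction
]
called = [snp for snp in candidates if passes_filters(snp)]

print([(snp.chrom, snp.pos) for snp in called])  # → [('chr1', 1042)]
```

Tightening or loosening the thresholds trades the number of called SNPs against the false-positive rate, which is the quantity the empirical validation step is meant to measure.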
10K SNPs Consequence within Genomic Sequence
• SNP consequences were determined by integrating the SNP calls with the genome annotation through Ensembl
• Selected 10K SNPs enriched for genic SNPs.
• In contrast, the genome as a whole is 46% genic space and 41% repetitive/transposable elements
Step 2: Genotyping the grape germplasm repository
SNP selection: Choose 10,000 high quality SNPs from the 500,000 Solexa SNPs
10K SNP chip: Production of custom 10,000 (8898) SNP genotyping array
Genotype the germplasm repository
- 1200 cultivated accessions (Vitis vinifera)
- 1000 wild accessions
21 million genotypes
Analyses
- Establish core germplasm collection
- Identify synonyms and homonyms
- Association mapping
- Estimate population genetic parameters
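One of the analyses listed, identifying synonyms (the same genotype held under different names), can be sketched as a pairwise comparison of array genotypes. The example below flags accession pairs whose genotype calls agree at nearly every shared SNP. It is an assumption-laden illustration, not the project's method: the function names, the 0.99 threshold, and the accession labels are hypothetical, and the genotypes are made up.

```python
from itertools import combinations


def genotype_identity(g1, g2):
    """Fraction of SNPs with identical calls, skipping missing calls (None)."""
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        return 0.0
    return sum(a == b for a, b in shared) / len(shared)


def find_putative_synonyms(genotypes, threshold=0.99):
    """Return name pairs whose genotype identity meets the threshold."""
    pairs = []
    for (n1, g1), (n2, g2) in combinations(sorted(genotypes.items()), 2):
        if genotype_identity(g1, g2) >= threshold:
            pairs.append((n1, n2))
    return pairs


# Toy genotypes at ten SNPs (0, 1, 2 = alternate allele count; None = no call).
toy_genotypes = {
    "accession_A": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "accession_B": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "accession_C": [2, 1, 0, 1, 1, 2, 0, 1, 0, 0],
    "accession_D": [0, 1, None, 1, 0, 2, 1, 1, 0, 2],
}

print(find_putative_synonyms(toy_genotypes))
```

With about 2,200 accessions there are roughly 2.4 million pairs to compare, which is still tractable for an all-against-all scan. Homonyms show up as the opposite pattern: accessions sharing a name but with low genotype identity.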
[Figure: three HPLC-DAD chromatograms of the same run (data file 05230008.D), detected at 525 nm (DAD1 E), 365 nm (DAD1 C), and 280 nm (DAD1 A). x-axis: time, 0-50 min; y-axis: absorbance, mAU.]
Profiling anthocyanins (525 nm) and other phenolics in grapes (HPLC-DAD chromatograms)
Phenotyping Economic Traits/ Key Secondary Metabolites of Grapes
Phenotyping the USDA-ARS Vitis collections will be the next critical step for maximizing the value of the current genotyping effort
A pilot project has been initiated for phenotyping key secondary metabolites of the Vitis collections from both Davis, CA and Geneva, NY
About 400 V. vinifera and 200 North American collections will be phenotyped for 50 different phenolics, including anthocyanins