Agilent Technologies SureSelect™ Target Enrichment Platform
Providing focus for next-generation sequencing workflows
David Willmot, PhD, Sr. Applications Scientist
Enabling Products for the Next-Generation Sequencing Workflow
Agilent’s SureSelect Platform: New Options
SureSelect Target Enrichment System (in solution)
Developed in collaboration with the Broad Institute, Dr. Chad Nusbaum et al.
SureSelect DNA Capture Array (on array)
Developed in collaboration with Cold Spring Harbor, Dr. Greg Hannon et al.
Agilent 60mer Array
1-5 µg gDNA
1-5 µg gDNA, or 20 µg gDNA
A Choice in Agilent Target Enrichment Options for Specific Project Needs
SureSelect Target Enrichment System
SureSelect DNA Capture Array
(244k array)
Throughput High Low
Study Sizes 10-1,000s samples 1-10 samples
DNA Input 1-3 µg 1-20 µg*
Capture of Target DNA
3.3+ Mb Custom ~1 Mb
Format Kit Array plus application note/protocol
or
SureSelect™ Target Enrichment System: Workflow
SureSelect™ Target Enrichment System
Illumina GA Kit Available End of February
Agilent Online Custom Solutions: Flexibility in Target Enrichment
244K
eArray
Target Enrichment Design
DNA Capture Design
Custom and Catalog Designs for CGH, Gene Expression, etc.
researcher
Manufacturing Process Development
Bioinformatics
SureSelect DNA Capture Array
SureSelect Target Enrichment System
User Sequences
Basis of SureSelect Target Enrichment System: publ. by Broad Inst. Feb. 2009 Nature Biotechnology
SureSelect™ Target Enrichment System: Design and Order Process
1. Design & Order: Select a custom genome partitioning set using eArray, or select a catalog oligo set (eArray Web Portal).
2. Kit Production: 55K unique 120-mer oligos are synthesized on one wafer, the oligos are released, and oligo IVT yields RNA-biotin.
3. Single Tube Kit Delivery: The Agilent genome partitioning kit is shipped to the customer. The kit includes:
– Biotinylated cRNA
– Buffers
– Protocol
SureSelect™ Target Enrichment System
Quality Control on the Bioanalyzer
• Simplified Experimental Workflow
Target Enrichment
Santa Clara Manufacturing Facility
- Industrial manufacturing – Class 10,000 clean-room
- Wired directly into eArray, allowing direct customer access to fully customizable products
- High-performance inkjet printing enables long oligo manufacturing
Agilent’s Microarray Platform
• Reliable inkjet printing
• Sensitivity of probes
• Flexibility of microarrays
• Ease of implementation
• Probe fidelity: oligos of up to 200 mer can be synthesized
Agilent’s Strength in Ultra-long Oligo Synthesis
Synthesis cycle: 1) Coupling, 2) Oxidation, 3) Deblock; repeat n times. Depurination is a side reaction.
[Figure: Full Length (%) versus Oligo Length (mer, 0 to 200). Curves: 98% cycle yield, no depurination; 99.5% cycle yield, no depurination; 98% cycle yield, with depurination; 99.5% cycle yield, with depurination. Labels: Agilent; Conventional Processes.]
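The cycle-yield curves on this slide follow from simple compounding: if every synthesis cycle succeeds with probability y, roughly y^n of the product reaches full length after n cycles. The sketch below illustrates that relationship only. It assumes an n-mer needs n cycles and does not model depurination, which would lower the result further; the function name is hypothetical.

```python
def full_length_fraction(cycle_yield: float, length: int) -> float:
    """Fraction of oligos reaching full length, assuming each of the
    `length` cycles succeeds independently with probability `cycle_yield`.
    Depurination losses are NOT modeled here."""
    if not 0.0 <= cycle_yield <= 1.0:
        raise ValueError("cycle_yield must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return cycle_yield ** length


if __name__ == "__main__":
    # Compare the two cycle yields named on the slide at a few oligo lengths.
    for n in (60, 120, 200):
        low = full_length_fraction(0.98, n)
        high = full_length_fraction(0.995, n)
        print(f"{n:>3}-mer: 98% cycle yield -> {low:.1%}, 99.5% -> {high:.1%}")
```

Under these assumptions a 120-mer comes out near 9% full length at 98% per cycle and near 55% at 99.5% per cycle, which is why small gains in cycle yield matter so much for long oligos.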
SureSelect™ Target Enrichment System: Bringing Cost-Efficiency to Next-gen Sequencing Workflows
The Cost of DNA Sequence Information: Illumina GA
Typical Illumina Sequence Experiment Costs
Bases/Run: 2,000,000,000
Lanes/Run: 8
Samples/Run: 7
Bases/Lane (Sample): 250,000,000
Cost/Run (Sequencing Chemistry/Flow Cell): $7,000
Cost/Lane (Sample): $1,000
Cost/Megabase Read: $4.00
How much of this is useful information? What is the cost of the useful information?
SureSelect™ Target Enrichment: Cost Benefit
Typical Target Experiment
Size of Interesting Area of Genome (Mb) 3
Non-targeted Experiment
% On Target Reads: 0.10%
# of Useful Bases/Sequencing Lane: 250,000
Cost per Useful Mb Sequenced: $4,000
SureSelect
% On Target Reads: 40%
# of Useful Bases/Sequencing Lane: 100,000,000
Cost of Targeting Enrichment per 3.3 Mb: $1,100
Cost of Target Enrichment: $1,100
Cost per Useful Mb Sequenced: $21
Cost Savings with SureSelect Per Mb Targeted: $3,979
Substantial Cost Differential
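The cost figures on this slide can be reproduced with a few lines of arithmetic. The sketch below is one way to do it, assuming the enrichment cost is charged once per lane (sample) and added to the lane's sequencing cost; the function and constant names are hypothetical.

```python
def cost_per_useful_mb(bases_per_lane: int,
                       on_target_fraction: float,
                       lane_cost: float,
                       enrichment_cost: float = 0.0) -> float:
    """Cost of one on-target megabase for a single sequencing lane."""
    useful_mb = bases_per_lane * on_target_fraction / 1_000_000
    if useful_mb <= 0:
        raise ValueError("no on-target bases: cost per useful Mb is undefined")
    return (lane_cost + enrichment_cost) / useful_mb


# Figures taken from the slides: 250 Mb per lane at $1,000 per lane.
BASES_PER_LANE = 250_000_000
LANE_COST = 1_000

untargeted = cost_per_useful_mb(BASES_PER_LANE, 0.001, LANE_COST)
enriched = cost_per_useful_mb(BASES_PER_LANE, 0.40, LANE_COST,
                              enrichment_cost=1_100)

if __name__ == "__main__":
    print(f"Non-targeted: ${untargeted:,.0f} per useful Mb")
    print(f"SureSelect:   ${enriched:,.0f} per useful Mb")
    print(f"Savings:      ${untargeted - enriched:,.0f} per useful Mb")
```

The on-target fractions (0.10% and 40%) drive almost the entire difference; the $1,100 enrichment cost roughly doubles the per-lane spend but is divided across 400 times more useful bases.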
SureSelect™ Target Enrichment System Using Illumina GA: Sample Data from an Exon Capture Experiment
[Figure: Depth distribution. X axis: Read Depth (0 to 1080). Y axes: % per Bin (0% to 10%) and Cumulative % (0% to 100%).]
Target: 9,107 exons, 3+ Mb
Sample: 3 µg genomic DNA; Covaris shearing; Illumina library prep; end-sequencing, 35 bp
Genomic Space View (Zoomed-in View of 7 Exons)
• 1st exon covered by 102 tiled baits. Read depth maxes out at >269x. Average read depth is 100x.
• 2nd exon covered by 1 bait (because it was right next to a repeat region). Read depth ~10x.
• 3rd to 5th exons covered by 2-4 baits. Read depth ~40x.
SureSelect™ Target Enrichment Reproducibility
[Figure: Average read depth of technical replicates. X axis: average read depth per target interval (AGT1). Y axis: average read depth per target interval (AGT7). R² = 0.9616.]
SureSelect™ Target Enrichment System: Strong Allele Balance Indicates Negligible Bias
[Figure: Heterozygous SNP calls. Bar chart comparing the number of non-reference bases with the number of reference bases (axis 0 to 25,000).]
• Baits are designed to wildtype sequence
• Long oligos lead to unbiased SNP capture and calling
SureSelect™ Target Enrichment System: Efficient SNP Validation and Discovery
SNP ID       Chr   Basepair    Reference (reads)   SNP (reads)
rs11204545   1     246126076   47                  42
rs11204546   1     246126085   52                  40
rs831043     2     169811332   44                  52
rs6886628    5     112927351   52                  41
Novel        9     103381434   36                  44
rs944682     1     150959212   0                   91
Sequence context of each call (variant positions shown as allele/allele):
rs11204545 and rs11204546: GACCAGGGCA A/T GTTCCTCA T/C GCTCTTCTAC
rs6886628: TTGCTTGCAC A/G CTAGCTTATC
rs944682: ACACAGGATCGTCTGGCTGCT
Novel Chr9:103381434: GAAAGAGCCC A/C GAGAAGTGTT
rs831043: AACCACCCAC A/G GAGCAGTGTG
• Allelic balance: no bias
• Efficient identification of SNPs
• Illumina readout confirmed with CE
• Efficiently captures mutations
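Allelic balance can be checked directly from the read counts in the SNP table on this slide: for an unbiased heterozygous call, roughly half of the reads should carry the non-reference allele. The sketch below computes that fraction per site. The 0.3 to 0.7 heterozygous window is an illustrative assumption, not a threshold taken from the slides, and the function names are hypothetical.

```python
# (reference reads, SNP reads) per site, copied from the SNP table on this slide.
READ_COUNTS = {
    "rs11204545": (47, 42),
    "rs11204546": (52, 40),
    "rs831043": (44, 52),
    "rs6886628": (52, 41),
    "Novel": (36, 44),
    "rs944682": (0, 91),
}


def non_reference_fraction(ref_reads: int, snp_reads: int) -> float:
    """Share of reads at a site that carry the non-reference allele."""
    total = ref_reads + snp_reads
    if total == 0:
        raise ValueError("site has no reads")
    return snp_reads / total


def classify(fraction: float, het_window=(0.3, 0.7)) -> str:
    """Rough genotype label; the window bounds are assumed, not sourced."""
    low, high = het_window
    if fraction < low:
        return "homozygous reference"
    if fraction > high:
        return "homozygous non-reference"
    return "heterozygous"


FRACTIONS = {snp: non_reference_fraction(*counts)
             for snp, counts in READ_COUNTS.items()}

if __name__ == "__main__":
    for snp, frac in FRACTIONS.items():
        print(f"{snp:<12} {frac:.2f}  {classify(frac)}")
```

With these counts the five sites that show two alleles fall between 0.43 and 0.55, while rs944682 has no reference reads at all, consistent with a homozygous non-reference call.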
Efficient Capture of a 5 bp Deletion on the X Chromosome: Menkes Syndrome
SureSelect™ Target Enrichment Kit Efficiently Captures 5 bp Mutant; Readout on Illumina GA
hg18_ChrX_77131408_77131467_+ : Wildtype Bait Design
CTATTGTTTATCAACCTCATCTTATCTCAGTAGAGGAAATGAAAAAGCAGATTGAAGCT
Reads aligned to the wildtype sequence (the blank run marks the 5 bp deletion):
CTATTGTTTATCAACCTCATCTT     AGTAGAGGAAATGAAAA
  ATTGTTTATCAACCTCATCTT     AGTAGAGGAAATGAAAAAG
   TTGTTTATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGC
     GTTTATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAG
        TATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATT
         ATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATTG
         ATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATTG
         ATCAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATTG
           CAACCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATTGAA
              CCTCATCTT     AGTAGAGGAAATGAAAAAGCAGATTGAAGCT
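The deleted bases can be recovered by locating the two halves of a gapped read on the wildtype bait sequence and taking whatever lies between them. The sketch below does this with plain string search on the first read shown on this slide. It is an illustration only, not the alignment method used for the original analysis, and the function name is hypothetical.

```python
WILDTYPE = "CTATTGTTTATCAACCTCATCTTATCTCAGTAGAGGAAATGAAAAAGCAGATTGAAGCT"


def deleted_segment(reference: str, left: str, right: str) -> str:
    """Return the reference bases lying between the two halves of a gapped read.

    `left` and `right` are the read segments on either side of the gap.
    """
    left_start = reference.find(left)
    if left_start < 0:
        raise ValueError("left segment not found in reference")
    left_end = left_start + len(left)
    # Search for the right segment only downstream of the left segment.
    right_start = reference.find(right, left_end)
    if right_start < 0:
        raise ValueError("right segment not found downstream of left segment")
    return reference[left_end:right_start]


# First read from the slide, split at the gap.
DELETION = deleted_segment(WILDTYPE,
                           "CTATTGTTTATCAACCTCATCTT",
                           "AGTAGAGGAAATGAAAA")

if __name__ == "__main__":
    print(f"Deleted bases: {DELETION} ({len(DELETION)} bp)")
```

Every read in the pileup gives the same answer, because each one ends its left half at CATCTT and resumes at AGTAGAGG.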
Summary
The Agilent SureSelect platform
• Flexible custom designs with Agilent’s eArray portal, free of charge
• Efficiently captures mutations, with 1/10th the gDNA vs. competing products
• Scalable solutions for small scale to large population studies and automation
• Array-based target enrichment application coming shortly, as well as SOLiD and 454 in-solution protocols
• Target Enrichment lowers:
– Reagent usage
– DNA input
– Labor
– Data handling
For more information http://www.opengenomics.com/SureSelect
See Protocol: G3360-90010
Contact [email protected] 1-800-227-9770 options 2x2x1
A QuickStart Guide for the Creation and Ordering of SureSelect Target Enrichment Oligo Sets
Using eArray, a free web-based design tool
Summary of Steps
• Getting Started
– Access eArray
– Confirm Application Type
– Select Method
• Terms and Definitions
• Initiate the Wizard
• Create Library by Bait Tiling
– Library Options and Target Details
– Design Strategy
– Centered versus Justified
– Formats for Describing Target Intervals
– Upload Message
– View the Design
– Continue to Create a Library
– Define Library
– Layout Baits
– Save and Submit Library
– Final Steps
– Download Library
– Provide Quote Details
– Review and Submit Quote
Getting Started: Access eArray
• Access eArray at: https://earray.chem.agilent.com/earray/
• Log in to eArray. If this is your first time visiting, click Request for Registration.
• The Login Name is the e-mail address used to register.
Getting Started: Confirm Application Type
• Confirm that the Application Type is Target Enrichment
• If necessary, select Switch Application Type to select Target Enrichment
Getting Started: Select Method
• Choose between two methods for creating a SureSelect Library
Following the Wizard
Benefits:
• Easy, streamlined method
• The Wizard takes the user from initiation of design to submission
Caveats:
• The user will only be able to download the library details after the library is completed
• The user will not have access to a "fate" file
When to Use: When a simple design is planned
Independent of the Wizard*
Benefits:
• User can download additional files, including a Fate file that lists the # of baits for each target interval
• User can fine-tune the design, creating and combining multiple bait groups with different parameter settings
• User can track the success of bait tiling for individual targets
Caveats:
• User will not be guided by a Wizard
When to Use: When iterations may be desired for an optimal design
Terms and Definitions
• Bait:
– A single oligo sequence of pre-determined length (120 bp) that complements a targeted region of the genome
• Bait Group:
– Consists of a group of Baits designed to complement a single or set of targeted intervals
– May be formed from baits generated within eArray, baits uploaded into eArray, or bait search results within eArray
• Library:
– Consists of one or more Bait Groups
– Represents the set of oligos that will be produced for the kit
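The three terms nest: a Library holds Bait Groups, and a Bait Group holds Baits. A minimal sketch of that hierarchy as Python data classes is shown below. The class and field names are hypothetical and do not reflect how eArray stores designs; only the 120 bp bait length and the "one or more Bait Groups" rule come from the definitions above.

```python
from dataclasses import dataclass, field
from typing import List

BAIT_LENGTH = 120  # bp, the pre-determined bait length from the definition


@dataclass(frozen=True)
class Bait:
    chrom: str
    start: int      # hypothetical coordinate field
    sequence: str

    def __post_init__(self) -> None:
        if len(self.sequence) != BAIT_LENGTH:
            raise ValueError(f"a bait must be {BAIT_LENGTH} bp long")


@dataclass
class BaitGroup:
    name: str
    baits: List[Bait] = field(default_factory=list)


@dataclass
class Library:
    name: str
    bait_groups: List[BaitGroup] = field(default_factory=list)

    def bait_count(self) -> int:
        """Total number of baits (oligos) across all bait groups."""
        if not self.bait_groups:
            raise ValueError("a library needs at least one bait group")
        return sum(len(group.baits) for group in self.bait_groups)


# Placeholder sequences only; real baits complement the targeted region.
exons = BaitGroup("exons", [Bait("chrX", 100003816, "A" * BAIT_LENGTH),
                            Bait("chrX", 100003876, "C" * BAIT_LENGTH)])
controls = BaitGroup("controls", [Bait("chr1", 5000, "G" * BAIT_LENGTH)])
library = Library("demo_library", [exons, controls])
```

Keeping Bait Groups as separate objects mirrors the workflow described later, where several groups built with different parameters are combined into one Library.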
Initiate the Wizard
• Select the method for Library creation in the Library Wizards quadrant, and click Next
Create Library by Bait Tiling: Use this option to upload the genomic intervals of your targets and allow eArray’s algorithm to design baits for these targets. This is the option outlined in this tutorial.
Create Library from Existing Bait Group(s): If you have already created the bait group(s) that will be used for this library, use this option.
Create Library from Bait Upload: If you have already designed the baits that will recognize your desired target regions, use this option.
Create Library by Bait Tiling, Step 1: Library Options and Target Details
1. Enter a name for the design job.
2. Design Strategy: To use the parameters previously optimized for general bait tiling, leave the checkmark on this option. To change the parameters, uncheck this option. The next slide describes the parameter changes that can be applied.
3. Species: Select the species for which the target intervals were designed. The Genome Build will then automatically be populated by eArray.
4. Genomic Target Intervals: Either type in or upload the genomic intervals for the targets to be enriched. Examples are provided on a later slide.
5. Genomic Avoid Intervals:
• Choose to avoid the standard repeat masked regions by leaving a checkmark next to this option (based on the UCSC RepeatMasker track)
• Add additional intervals to avoid with the baits by typing those intervals in or uploading them in the same format as the target intervals.
6. Submit: Select Submit when Options and Details are completed.
Create Library by Bait Tiling: Design Strategy
1. Change the parameters: To be able to change these parameters, first remove the checkmark from the “Use Optimized Parameters” option.
2. Centered versus Justified: (see next slide for visuals)
3. Bait Length: Currently, 120 bp is the only bait length option. All baits will be designed to be 120 bp in length.
4. Bait Tiling Frequency: Options include 2X, 3X, 4X, and 5X and indicate the amount of bait overlap. Tiling frequency is not enforced at target edges. Increasing the frequency reduces the number or size of regions that a library can cover.
5. Allowed overlap into avoid regions: Centered baits may overlap with regions adjacent to the target. If the targets are adjacent to Avoid Regions, enter the acceptable amount of overlap with these in bp. To ensure that there is no overlap with any Avoid Region, select ‘0 bp’.
Centered: Baits are centered, and evenly distributed, across each target region. Baits may overlap regions outside of the target.
Why use this? This is the method tested most extensively.
Justified: Baits are first tiled across the target. If baits extend past the target interval, all baits are shifted inward so that there will be no overlap with adjacent but un-targeted genomic regions.
Why use this? Use it to avoid placing baits in any region adjacent to targets, for example when sequencing cDNA.
[Figure: Target and Baits shown at 2X, 3X, 4X, and 5X tiling.]
Create a Bait Group: Centered versus Justified
[Figure: Centered and Justified layouts for three cases: (a) target region is large (example of 2x tiling); (b) target region is 2 times the bait length; (c) target region is shorter than the bait length.]
• Where the layouts differ: Centered baits extend past interval boundaries, but have even coverage across the entire region. Justified baits do not extend past boundaries, but may have uneven coverage (e.g., the circled area in the figure, where regions have both 2x and 3x tiling).
• In the other two cases shown, the design is the same for Centered and Justified.
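To make the Centered and Justified descriptions concrete, the sketch below places baits on a single target under simplified rules. It is not eArray's algorithm: it assumes that NX tiling means a new bait starts every bait_length / N bases, that Centered spreads any overhang equally on both sides, and that Justified clamps only the last bait so it ends at the target boundary. Coordinates are half-open, and all names are hypothetical.

```python
import math
from typing import List, Tuple


def tile_baits(start: int, end: int, bait_len: int = 120,
               frequency: int = 2, mode: str = "centered") -> List[Tuple[int, int]]:
    """Return (bait_start, bait_end) pairs for one target interval [start, end)."""
    if mode not in ("centered", "justified"):
        raise ValueError("mode must be 'centered' or 'justified'")
    target_len = end - start
    step = bait_len // frequency          # assumed spacing for NX tiling
    if target_len <= bait_len:
        # A single bait centered on the target; same result for both modes.
        first = start - (bait_len - target_len) // 2
        return [(first, first + bait_len)]
    n = math.ceil((target_len - bait_len) / step) + 1
    span = (n - 1) * step + bait_len      # total footprint of the evenly spaced baits
    if mode == "centered":
        first = start - (span - target_len) // 2
        return [(first + i * step, first + i * step + bait_len) for i in range(n)]
    # Justified: tile from the target start, then pull any bait that would
    # overrun the target back inside it.
    baits = []
    for i in range(n):
        s = min(start + i * step, end - bait_len)
        baits.append((s, s + bait_len))
    return baits


if __name__ == "__main__":
    for mode in ("centered", "justified"):
        print(mode, tile_baits(1000, 1330, mode=mode))
```

For a hypothetical 330 bp target at 2X, the centered layout overhangs each edge by 15 bp with even spacing, while the justified layout stays inside the target and ends with two baits that overlap more heavily, which is the uneven-coverage effect the slide describes.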
Create Library by Bait Tiling: Formats for Describing Target Intervals
Option 1: Type in the genomic intervals to be targeted. The format for typing in the intervals is as follows:
chrX:100003816-100003948|chrX:100004037-100004218|chrX:100004314-100004465|chrX:100004329-100004465|chrX:100004804-100004888
Each interval should be separated by the | character. For a long list of intervals, Option 2 is the better choice.
Option 2: Upload a file that includes the genomic intervals to be targeted. Each interval should be presented on a separate line, and the file should be saved as a text file.
When using this option, select Upload, Browse to find the saved file, and then select Upload File.
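A long pipe-separated list is easy to mistype, so it can help to validate intervals and convert them to the one-per-line form before uploading. The sketch below parses the Option 1 format shown on this slide. It assumes the upload file uses the same chr:start-end notation on each line, which the slide text does not spell out; the function names are hypothetical.

```python
import re
from typing import List, Tuple

# chrom:start-end, e.g. chrX:100003816-100003948
INTERVAL_RE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")


def parse_intervals(text: str) -> List[Tuple[str, int, int]]:
    """Parse '|'-separated intervals; line breaks inside the text are ignored."""
    intervals = []
    for token in text.replace("\n", "").split("|"):
        token = token.strip()
        if not token:
            continue  # tolerate a trailing '|'
        match = INTERVAL_RE.match(token)
        if match is None:
            raise ValueError(f"malformed interval: {token!r}")
        chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
        if start > end:
            raise ValueError(f"start is greater than end: {token!r}")
        intervals.append((chrom, start, end))
    return intervals


def to_upload_text(intervals: List[Tuple[str, int, int]]) -> str:
    """One interval per line (assumed upload layout)."""
    return "\n".join(f"{chrom}:{start}-{end}" for chrom, start, end in intervals)


EXAMPLE = ("chrX:100003816-100003948|chrX:100004037-100004218|"
           "chrX:100004314-100004465|\n"
           "chrX:100004329-100004465|chrX:100004804-100004888")

PARSED = parse_intervals(EXAMPLE)

if __name__ == "__main__":
    print(to_upload_text(PARSED))
```

Writing the result of `to_upload_text` to a plain text file gives a starting point for Option 2.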
Create Library by Bait Tiling: Upload Message
• Once you’ve clicked on Submit, you will receive an Upload Message
• You have now submitted a design for the creation of a Bait Group!
• Select Exit, and return to the Home page on eArray
• You can now monitor the status of your submission in the Library Wizards quadrant
– Click on Refresh to view the most current status
– You will receive an e-mail alert when the submission is complete
Create Library by Bait Tiling: View the Design (I)
• Once the Bait submission is complete and Baits are uploaded, return to the eArray Home page and select View Design
Create Library by Bait Tiling: View the Design (II)
• Toggle between Design Summary, Design Details, Target Fate, and Bed File to view the details of the bait design
• Only the first 100 lines are displayed
Create Library by Bait Tiling: Continue
• If satisfied with the creation of this Bait Group, return to the eArray Home page and select Continue
Create Library by Bait Tiling: Define Library
• Provide a name for the library: Previously in the Wizard, a Bait Group was defined and created. Now we are creating a Library, which might consist of one or more Bait Groups. Therefore, it also requires a name.
• Provide a description for the library: Here, it is possible to include a description, keywords, and comments to help characterize this library.
• When completed, select Next
• A control grid is always, and automatically, included in a library
Create Library by Bait Tiling: Layout Baits
1. Describe the Control Type: Leave this empty if the Bait Group is not a control.
2. If desired, add additional, pre-made Bait Groups to this library: Select Add, Search for the Bait by name, Add to the box on the right, and click Done.
3. Determine the # of Replicates per Bait Group: When a Library is not full, it is possible to create replicates of one or more Bait Groups.
4. Check Library Statistics: The Percentage Filled value should be less than or equal to 100%. If there are a number of features still available, or the Percentage Filled is low, it is possible to add replicates or additional, existing Bait Groups.
5. When completed, select Next.
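The Percentage Filled check is a capacity calculation: the number of baits, multiplied by their replicates, divided by the number of features a library can hold. The sketch below assumes a capacity of 55,000, taken from the "55K unique 120 mer oligos" figure given earlier in the deck, and ignores whatever space the automatic control grid occupies. eArray reports the authoritative value; the names here are hypothetical.

```python
from typing import Dict, Optional

ASSUMED_CAPACITY = 55_000  # assumption based on the "55K unique oligos" figure


def percentage_filled(group_sizes: Dict[str, int],
                      replicates: Optional[Dict[str, int]] = None,
                      capacity: int = ASSUMED_CAPACITY) -> float:
    """Percent of library features used by the listed bait groups.

    group_sizes maps bait group name -> number of baits;
    replicates maps bait group name -> replicate count (default 1).
    """
    replicates = replicates or {}
    used = sum(size * replicates.get(name, 1) for name, size in group_sizes.items())
    return 100.0 * used / capacity


def max_replicates(group_size: int, capacity: int = ASSUMED_CAPACITY) -> int:
    """Largest whole number of copies of one bait group that still fits."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return capacity // group_size


if __name__ == "__main__":
    groups = {"exons": 20_000}  # hypothetical bait group size
    print(f"1 replicate:  {percentage_filled(groups):.1f}% filled")
    print(f"2 replicates: {percentage_filled(groups, {'exons': 2}):.1f}% filled")
    print(f"max replicates that fit: {max_replicates(20_000)}")
```

A result above 100% means the combination of bait groups and replicates does not fit and something has to be removed before the library can be saved.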
Create Library by Bait Tiling: Save and Submit Library
• Save the new Library in the format of choice
– To prepare the Library for manufacturing, it is necessary to choose Submit
– When choosing Submit, click on Design check list and complete the checklist before choosing Save
• After Submitting a Library, it is ready for ordering, but you must still generate the Quote
– The library is not ordered at this step; it is only submitted
Create Library by Bait Tiling: Final Steps
• Once a Library has been submitted, it is available for generating a Quote
• The Library can be found in the My Libraries quadrant in the eArray home page
– If the library is not listed there, select Refresh
– If the library is still not listed there, it may have been saved but not submitted
– The library can also be found using the Library Search feature in the Search quadrant of eArray
• Quote: Select this option to generate a Quote and initiate the manufacture of this library
• Download: Select this option to download the BED file (for viewing of baits in the UCSC browser, for instance), or a TDT (tab delimited table) summarizing the baits in this library
Create Library by Bait Tiling: Download Library
• The following window results from having selected Download:
1. Place a checkmark next to the file types desired (the slide shows an example of a BED file and an example of a TDT file)
2. Click Download: If the files do not download, hold down the Ctrl key while clicking Download, and keep it held until the window appears asking whether you want to Save or Open the file
Create Library by Bait Tiling: Provide Quote Details
• The following window results from having selected Quote:
1. Determine the number of Libraries desired
2. Determine the number of reactions per library: A reaction size of 50 means that a single Library can be used for capture with 50 DNA samples
3. Determine the sequencing technology and protocol
4. Click Next
Create Library by Bait Tiling: Review and Submit Quote
• Review the Quote details. If satisfied, select Submit.
Some Differences in Workflow for the Non-Wizard Approach
Create a Bait Group: Design Baits
• To create a Bait Group, go to the Baits page, and select Bait Tiling (click on Baits, then click on Bait Tiling)
• This will bring you to the page where you can define design parameters for a Bait Group
Workflow Summary
1. Create a Bait Group by tiling Baits across target intervals
2. Examine the Bait Group; determine which targets were avoided and how many baits were tiled
3. If desired, create additional Bait Groups, using additional target intervals or alternate design parameters
4. Combine desired Bait Groups into a Library (minimum of 1)
5. Save and Submit Library
6. Generate Quote
Create a Bait Group: Download the Design (II)
There are four files included in the downloaded zip (the slide shows an example of each):
• BED file
• TDT (tab delimited table) file
• Fate file
• Summary file
Create Bait Group
• Once satisfied with the Bait Group Design, select Create Bait Group
• Check the status of the Bait Group creation in the Pending Jobs quadrant of the Home page
• While waiting for the Bait Group to be created, design additional Bait Groups, if desired
Examine Bait Group: Look for Missed Targets
• Open the BaitTiling_fate file in Excel
• Sort the file by Status to identify invalid (failed) genomic coordinates
– These are targets for which the genomic coordinates could not be correctly assigned
• Sort the file by Baits Generated to identify targets for which no baits were assigned
– In this example, according to the BaitTiling_sum file: 8423 - 8081 = 342 targets were not assigned baits; these can be viewed in the BaitTiling_fate file
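The same check can be scripted instead of sorted by hand in Excel. The sketch below reads a tab-delimited fate table and lists the targets that received no baits. The slides name the Status and Baits Generated columns but do not show the exact file layout, so the "Target" column name, the status values, and the sample rows here are all hypothetical placeholders.

```python
import csv
import io
from typing import List, TextIO


def targets_without_baits(handle: TextIO,
                          target_col: str = "Target",
                          baits_col: str = "Baits Generated") -> List[str]:
    """Return the targets whose bait count is zero in a tab-delimited fate table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[target_col] for row in reader if int(row[baits_col]) == 0]


# Hypothetical stand-in for a BaitTiling_fate file.
SAMPLE = (
    "Target\tStatus\tBaits Generated\n"
    "chrX:100003816-100003948\tvalid\t2\n"
    "chrX:100004037-100004218\tvalid\t0\n"
    "chrX:100004314-100004465\tvalid\t3\n"
)

MISSED = targets_without_baits(io.StringIO(SAMPLE))

if __name__ == "__main__":
    print(f"{len(MISSED)} target(s) received no baits:")
    for target in MISSED:
        print(" ", target)
```

Applied to the example on this slide, such a count should match the 342 targets implied by the summary file (8423 targets submitted, 8081 with baits). To use it on a real download, open the file with `open(path, newline="")` and adjust the column names to match the actual header.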
Examine Bait Group: View Design in UCSC Browser
• The BED file can be uploaded into the UCSC browser as an easy method to visualize the probes tiled across targets of interest
[Figure: Segment of chrX with baits tiled across exons (target intervals), showing the Baits, Genes, and RepeatMasker tracks.]
[Figure: Zoomed-in view of baits tiled across a single exon, showing the Baits, Genes, and RepeatMasker tracks.]
Examine Bait Group: Identify Targets for Potential Re-Tiling
[Figure: UCSC browser view showing the Baits, Genes, and RepeatMasker tracks.]
1. Select targets not assigned baits in the Fate file.
2. Upload the BED file to the UCSC browser, and zoom in on the target to identify the reason that it was provided zero baits.
3. In this example, the exon on the right is covered by a section identified as "repeats" by the RepeatMasker track, and design parameters had specified to avoid these regions. Therefore, this target received zero baits.
If the target is still desired, it is possible to create a new, additional Bait Group for this target with new design parameters that allow repeat regions. Both Bait Groups can later be included in the final library.
For more information:
• www.agilent.com/chem/eArray
• www.OpenGenomics.com/sureselect
• 1-800-227-9770