DESCRIPTION
Exome sequencing has emerged as an economical way of focusing DNA sequencing efforts on the most functionally understood regions of the genome. Pre-capture pooling, where one bait library is used to pull down the exonic regions of several pooled samples simultaneously, is a further cost saving. However, rare alleles in the pool might not attract baits at the same rate as reference-conforming sequences and may hence be underrepresented. We investigated this potential issue by sequencing a HapMap family (4 individuals) using the pre-capture protocols from Illumina and Nimblegen. We did not observe clear evidence that heterozygous variants are missed, but noted a trend for indels to be imbalanced. Our findings therefore cannot rule out that allelic imbalance or bias affects research findings. This may be especially critical for low-cellularity cancer tissue, where rare alleles are more common.
Assessment of allelic bias in pre-capture platforms for exome sequencing
CMIS
Back to the future?
Denis Bauer | Research Scientist
28 March 2012
Part 1: My Background and a selection of bioinformatics tools developed in Brisbane in the Bailey Group and Bodén Group
Exon Capture Comparison | [email protected] | Page 2
My Background
Berlin, Neustadt, Brisbane
IMB Institute for Molecular Bioscience
QBI Queensland Brain Institute
NorahDesk
Sumoylation Predictor
Timothy Bailey
Mikael Bodén
Fabian Buske
Chikako Ragan
http://meme.sdsc.edu/meme/intro.html
Bauer, D.C., Buske, F.A., Bailey, T.L., “Dual-functioning transcription factors in the developmental gene network of Drosophila melanogaster”; BMC Bioinformatics 11 (1), 366; PMID: 20594356. Cited: 4
Bauer, D.C., Bailey, T.L., “Optimizing static thermodynamic models of transcriptional regulation.”, Bioinformatics, 2009, 25, 1640-1646. PMID:19398449. Cited: 5
Bauer, D.C., Bailey, T.L., “STREAM: Static Thermodynamic REgulAtory Model of transcription.”, Bioinformatics 2008 24: 2544-2545. PMID:18776194. Cited: 1
Bauer, D.C., Bailey T.L., “Studying the functional conservation of cis-regulatory modules and their transcriptional output.”, BMC Bioinformatics, Apr 29;9(1):220. PMID: 18442418. Cited: 10
STREAM: Quantitative model of transcriptional regulation
http://www.bioinformatics.org.au/stream/
Fabian A. Buske et al., "Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data", Genome Research 2012, accepted
Fabian A. Buske et al., "Potential in vivo roles of nucleic acid triple-helices", RNA biology, 2011, PMID: 21525785
Triplexator: Search/design tool for nucleic acid triple helices
Sneak preview ... coming soon to
http://www.bioinformatics.org.au/triplexator/
NORAHDESK: Detecting ncRNA in sequencing data
Ragan, C., Mowry, B.J. and Bauer, D.C. “Hybridization based reconstruction of small non-coding RNA transcripts from deep sequencing data”, NAR, 2012, review received.
http://www.bioinformatics.org.au/norahdesk/
Specifically useful for miRNA- and piRNA-clusters that are transcribed together
Part 2: Back to the future
Exon capture is an economical way to perform an unbiased genome-wide analysis.
However, extensive sample manipulation can introduce biases that we might not be aware of.
Is less sophistication safer?
Pre-capture pooling for exome capture: The business side
Economical way of focusing 2GS efforts on the most functionally understood regions:
• Whole DNA sample
• Sonicate
• Pull out fragments corresponding to the sequence of known “exons”
However, with sequencing costs going down, the capture reaction becomes the bottleneck.
Solution: “Pre-capture pooling”: apply the bait library to more than one sample
Clark MJ, et al., Nat Biotechnol. 2011 PMID: 21947028.
Pre-capture pooling for exome capture: The technical side
Bait library design
• NG: empirically optimized
• AG: overlapping RNA baits
• IL: gapped tiles
What is an “exon”?
Everything that is known to be transcribed/has function … trust company
Now AG: 72Mb
Clark MJ, et al., Nat Biotechnol. 2011 PMID: 21947028.
Oddities: targeted exons do not follow the same length distribution as RefSeq exons
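One way to look at this kind of oddity is to summarise interval lengths from the vendor target file and from a RefSeq exon file. The sketch below assumes both are available as BED-style lines (0-based start, end-exclusive); the helper names and the toy intervals are made up for illustration and are not the data shown in the talk.

```python
from statistics import median


def interval_lengths(bed_lines):
    """Return the length of every BED-style interval (0-based start, end-exclusive)."""
    lengths = []
    for line in bed_lines:
        # Skip blank lines and common BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return lengths


def summarize(lengths):
    """Return count, minimum, median and maximum of a non-empty list of lengths."""
    ordered = sorted(lengths)
    return {
        "n": len(ordered),
        "min": ordered[0],
        "median": median(ordered),
        "max": ordered[-1],
    }


# Hypothetical toy intervals, NOT real capture targets or RefSeq exons.
targets = ["chr1\t100\t220", "chr1\t500\t620", "chr2\t50\t170"]
refseq = ["chr1\t100\t180", "chr1\t500\t1400", "chr2\t50\t95"]

target_summary = summarize(interval_lengths(targets))
refseq_summary = summarize(interval_lengths(refseq))
print(target_summary)
print(refseq_summary)
```

In this toy input the target intervals all have the same tile-like length while the reference exons vary widely, which is the kind of difference a length histogram would make visible.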
Oddities (cont'd): Theoretical vs. actual capture efficiency of the longest exon
Pre-capture pooling for exome capture: The potential problem
Potential issue: “Allelic bias”/“Allelic imbalance”?
Sequence HapMap family (4 individuals) with
• AG: Post-capture
• Ill: Pre-capture
• NG: Pre-capture
[Figure labels: Bait; Het sample 4; Potentially underrepresented allele; Reference-conforming + hom bar-coded samples 1-3]
• If het variants are not captured reliably in pre-capture, the het/hom ratio would be lower and they would overlap less with known-variant databases (DBs)
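The two checks named here can be sketched as below. This is a minimal illustration, assuming variant calls keyed by (chromosome, position, ref, alt) with diploid genotypes where 0 is the reference allele. The function names and toy calls are hypothetical and are not the pipeline used for the talk.

```python
def het_hom_ratio(genotypes):
    """Ratio of heterozygous to homozygous non-reference calls.

    genotypes: iterable of (allele1, allele2) tuples, 0 = reference allele.
    """
    genotypes = list(genotypes)
    het = sum(1 for a, b in genotypes if a != b)
    hom_alt = sum(1 for a, b in genotypes if a == b and a != 0)
    if hom_alt == 0:
        raise ValueError("no homozygous non-reference calls")
    return het / hom_alt


def overlap_fraction(called_sites, known_sites):
    """Fraction of called variant sites that are also present in a known-variant set."""
    called = set(called_sites)
    if not called:
        raise ValueError("no called sites")
    return len(called & set(known_sites)) / len(called)


# Hypothetical toy calls for one sample, NOT real data.
calls = {
    ("chr1", 1000, "A", "G"): (0, 1),
    ("chr1", 2000, "C", "T"): (1, 1),
    ("chr2", 3000, "G", "A"): (0, 1),
    ("chr2", 4000, "T", "C"): (0, 1),
}
# Hypothetical stand-in for a database such as HapMap or 1000G.
known = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 3000, "G", "A"),
}

ratio = het_hom_ratio(calls.values())
frac = overlap_fraction(calls.keys(), known)
print(ratio, frac)
```

Comparing these two numbers between the pre-capture and post-capture runs of the same individual is the comparison the bullet describes: a clearly lower het/hom ratio or overlap in pre-capture would hint at missed het variants.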
Allelic Bias?
[Figure: het/hom ratio for known and novel variants] NG: more hets in post-capture; Ill: more hets in pre-capture
[Figure: fraction of overlap with HapMap and 1000G] NG: slightly lower overlap; Ill: no difference
... they would have lower coverage
Allelic Bias ?
Asan, Xu Y et al. Genome Biol. 2011 PMID: 21955857
[Figure: coverage of SNPs and INDELS for Illumina and Nimblegen; groups per panel: Com, Post, Pre]
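Besides overall coverage, imbalance at a single het site can be checked from the read counts supporting each allele. The sketch below is one common approach and not necessarily what was done for these slides: it assumes that, without bias, reads at a het site split between the two alleles like a Binomial(n, 0.5) draw, and it ignores reference mapping bias. All function names are hypothetical.

```python
from math import comb


def alt_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a site that support the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def binomial_two_sided_p(alt_reads, total_reads, p=0.5):
    """Exact two-sided binomial test p-value.

    Sums the probability of every outcome that is at most as likely as the
    observed one under Binomial(total_reads, p).
    """
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total_reads > 0")
    probs = [
        comb(total_reads, k) * p**k * (1 - p) ** (total_reads - k)
        for k in range(total_reads + 1)
    ]
    observed = probs[alt_reads]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    p_value = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at one het site, NOT real data.
balanced = binomial_two_sided_p(5, 10)
skewed = binomial_two_sided_p(0, 10)
print(alt_allele_fraction(3, 12), balanced, skewed)
```

A small p-value at many het sites in the pre-capture pool, but not in the post-capture run of the same individual, would be the signature of the bias discussed here.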
1. We (and others) did not detect any obvious allelic imbalance. However, no one has tested samples with really rare alleles (e.g. low cellularity in cancer)
2. To be on the safe side (BACK TO THE FUTURE): we go for post-capture whole-exome sequencing
Conclusion
Thank you
CMIS
Denis C. Bauer
Exon Capture Comparison
t +61 2 9325 3174
e [email protected]
www.csiro.au/cmis
CMIS
Institute for Molecular Bioscience, UQ
Timothy Bailey (MEME)
School of Chemistry and Molecular Biosciences, UQ
Mikael Bodén (Machine Learning)
Queensland Brain Institute, UQ
Vikki Marshall
Joon-Yong An
Sam Lukowski
Chikako Ragan (NorahDesk)
Garvan Institute, UNSW
John Mattick
Fabian Buske (Triplexator)