Assessment of allelic bias in pre-capture platforms for exome sequencing. CMIS. Back to the future? Denis Bauer | Research Scientist | 28 March 2012

Allelic Imbalance for Pre-capture Whole Exome Sequencing


DESCRIPTION

Exome sequencing has emerged as an economical way of focusing DNA sequencing efforts on the most functionally understood regions of the genome. Pre-capture pooling, where one bait library is used to pull down the exonic regions of several pooled samples simultaneously, is a further cost saving. However, rare alleles in the pool might not attract baits at the same rate as reference-conforming sequences, and may hence be underrepresented. We investigated this potential issue by sequencing a HapMap family (4 individuals) using the pre-capture protocols from Illumina and NimbleGen. We did not observe clear evidence that heterozygous variants are missed, but noted a trend for indels to be imbalanced. Our findings do not rule out allelic imbalance or bias having an impact on research findings; this may be especially critical for low-cellularity cancer tissue, where rare alleles are more common.


Page 1: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Assessment of allelic bias in pre-capture platforms for exome sequencing

CMIS

Back to the future?

Denis Bauer | Research Scientist | 28 March 2012

Page 2: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Part 1:

My background and a selection of bioinformatics tools developed in Brisbane in the Bailey Group and Bodén Group

Exon Capture Comparison | [email protected] | Page 2

Page 3: Allelic Imbalance for Pre-capture Whole Exome Sequencing

My Background

Berlin

Neustadt

Brisbane

IMB Institute for Molecular Bioscience

QBI Queensland Brain Institute

NorahDesk

Sumoylation Predictor

Timothy Bailey

Mikael Bodén

Fabian Buske

Chikako Ragan


http://meme.sdsc.edu/meme/intro.html

Page 4: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Bauer, D.C., Buske, F.A., Bailey, T.L., “Dual-functioning transcription factors in the developmental gene network of Drosophila melanogaster”; BMC Bioinformatics 11 (1), 366; PMID: 20594356. Cited: 4

Bauer, D.C., Bailey, T.L., “Optimizing static thermodynamic models of transcriptional regulation.”, Bioinformatics, 2009, 25, 1640-1646. PMID:19398449. Cited: 5

Bauer, D.C., Bailey, T.L., “STREAM: Static Thermodynamic REgulAtory Model of transcription.”, Bioinformatics 2008 24: 2544-2545. PMID:18776194. Cited: 1

Bauer, D.C., Bailey T.L., “Studying the functional conservation of cis-regulatory modules and their transcriptional output.”, BMC Bioinformatics, Apr 29;9(1):220. PMID: 18442418. Cited: 10

STREAM: Quantitative model of transcriptional regulation


http://www.bioinformatics.org.au/stream/

Page 5: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Fabian A. Buske et al., "Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data", Genome Research 2012, accepted

Fabian A. Buske et al., "Potential in vivo roles of nucleic acid triple-helices", RNA biology, 2011, PMID: 21525785

Triplexator: Search/design tool for nucleic acid triple helices

Sneak preview ... Coming soon to

http://www.bioinformatics.org.au/triplexator/


Page 6: Allelic Imbalance for Pre-capture Whole Exome Sequencing

NORAHDESK: Detecting ncRNA in sequencing data

Ragan, C., Mowry, B.J. and Bauer, D.C. “Hybridization based reconstruction of small non-coding RNA transcripts from deep sequencing data”, NAR, 2012, review received.

http://www.bioinformatics.org.au/norahdesk/

Specifically useful for miRNA- and piRNA-clusters that are transcribed together


Page 7: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Part 2: Back to the future

Exon capture is the economical way to run an unbiased genome-wide analysis.

However, extensive sample manipulation can introduce biases that we might not be aware of.

Is less sophistication safer?

Page 8: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Economical way of focusing 2GS (second-generation sequencing) efforts on the most functionally understood regions.

• Whole DNA sample
• Sonicate
• Pull out fragments corresponding to the sequence of known “exons”

However, with sequencing costs going down, the capture reaction becomes the bottleneck.

Solution: “Pre-capture pooling”. Apply the bait library to more than one sample.

Pre-capture pooling for exome capture: The business side

Clark MJ, et al., Nat Biotechnol. 2011 PMID: 21947028.


Page 9: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Bait library design

• NG (NimbleGen): empirically optimized
• AG (Agilent): overlapping RNA baits
• IL (Illumina): gapped tiles

What is an “exon”? Everything that is known to be transcribed/has function … trust the company

Now AG: 72Mb

Pre-capture pooling for exome capture: The technical side

Clark MJ, et al., Nat Biotechnol. 2011 PMID: 21947028.


Page 10: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Oddities: targeted exons do not follow the same length distribution as RefSeq exons
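One way to check this oddity is to compare the lengths of the vendor's target intervals with the lengths of RefSeq exons. The sketch below assumes both sets are available as BED-style lines (0-based start, half-open end); the function names and the example intervals are hypothetical and are not taken from the talk.

```python
from statistics import median


def interval_lengths(bed_lines):
    """Return the length of every interval in an iterable of BED-style lines."""
    lengths = []
    for line in bed_lines:
        # Skip blank lines and the usual BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        _chrom, start, end = line.split()[:3]
        # BED coordinates are 0-based and half-open, so length is end - start.
        lengths.append(int(end) - int(start))
    return lengths


def length_summary(lengths):
    """Summarise a non-empty list of interval lengths."""
    return {
        "n": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    # Hypothetical intervals, for illustration only.
    targets = ["chr1\t100\t250", "chr1\t300\t420"]
    refseq = ["chr1\t90\t260", "chr1\t300\t1500"]
    print("targets:", length_summary(interval_lengths(targets)))
    print("refseq: ", length_summary(interval_lengths(refseq)))
```

Comparing the two summaries (or full histograms) side by side would show whether long exons are split or trimmed in the target design.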


Page 11: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Oddities (cont'd): theoretical vs. actual capture efficiency of the longest exon


Page 12: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Potential issue: “allelic bias” / “allelic imbalance”?

Sequence HapMap family (4 individuals) with

• AG: Post-capture

• Ill: Pre-capture

• NG: Pre-capture

Pre-capture pooling for exome capture: The potential problem

[Figure: a bait binding fragments from the pooled samples. Labels: bait; het sample 4 (potentially underrepresented allele); reference-conforming + hom bar-coded samples 1-3.]
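The concern can be phrased as a per-site test: at a heterozygous site without capture bias, reads carrying the alternate allele should make up roughly half of the total. The following is a minimal sketch under that assumption, using an exact two-sided binomial test. The function names, the 0.5 expectation, and the example read counts are illustrative assumptions, not the analysis used in the talk.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_comb + k * log(p) + (n - k) * log(1 - p))


def allelic_balance_p_value(alt_reads, total_reads, expected_fraction=0.5):
    """Exact two-sided binomial p-value for the observed alternate-read count.

    Sums the probability of every outcome that is no more likely than the
    observed one. expected_fraction must lie strictly between 0 and 1.
    """
    if total_reads == 0:
        return 1.0
    probs = [binom_pmf(k, total_reads, expected_fraction)
             for k in range(total_reads + 1)]
    observed = probs[alt_reads]
    # Small relative tolerance so that symmetric outcomes count as ties.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, tail)


if __name__ == "__main__":
    # Hypothetical read counts at one heterozygous site, for illustration only.
    alt, total = 9, 30
    print("alt allele fraction:", alt / total)
    print("two-sided p-value:  ", allelic_balance_p_value(alt, total))
```

Applied across all heterozygous calls of a sample, a shift of the alternate-allele fraction below 0.5 in the pre-capture data relative to the post-capture data would point to the bias described on this slide.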


Page 13: Allelic Imbalance for Pre-capture Whole Exome Sequencing

• If het variants are not captured reliably in pre-capture, the het/hom ratio would be lower and the calls would not overlap with DBs

Allelic Bias?

[Figure: het/hom ratio; categories: known, novel]

NG: More hets in post. Ill: More hets in pre.

[Figure: fraction of overlap; databases: HapMap, 1000G]

NG: slightly lower overlap. Ill: no difference.
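Both summary statistics on this slide are simple to compute from a list of variant calls. The sketch below assumes each call carries a VCF-style genotype string (for example "0/1" or "1/1") and that known sites are available as a set of (chrom, pos, ref, alt) keys. The `Call` type, the function names, and the example data are hypothetical.

```python
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    genotype: str  # VCF-style GT string, e.g. "0/1" (het) or "1/1" (hom alt)


def _alleles(genotype):
    # Treat phased ("|") and unphased ("/") genotypes the same way.
    return genotype.replace("|", "/").split("/")


def is_het(genotype):
    a = _alleles(genotype)
    return len(a) == 2 and "." not in a and a[0] != a[1]


def is_hom_alt(genotype):
    a = _alleles(genotype)
    return len(a) == 2 and a[0] == a[1] and a[0] not in ("0", ".")


def het_hom_ratio(calls):
    """Number of heterozygous calls divided by number of homozygous-alt calls."""
    het = sum(1 for c in calls if is_het(c.genotype))
    hom = sum(1 for c in calls if is_hom_alt(c.genotype))
    return het / hom if hom else float("inf")


def overlap_fraction(calls, known_sites):
    """Fraction of calls whose (chrom, pos, ref, alt) key is in known_sites."""
    if not calls:
        return 0.0
    hits = sum(1 for c in calls if (c.chrom, c.pos, c.ref, c.alt) in known_sites)
    return hits / len(calls)


if __name__ == "__main__":
    # Hypothetical calls and known sites, for illustration only.
    calls = [
        Call("1", 100, "A", "G", "0/1"),
        Call("1", 200, "C", "T", "1/1"),
        Call("2", 50, "G", "A", "0/1"),
    ]
    known = {("1", 100, "A", "G")}
    print("het/hom ratio:   ", het_hom_ratio(calls))
    print("overlap fraction:", overlap_fraction(calls, known))
```

Running the same two functions on the pre-capture and post-capture call sets of one sample gives the pairwise comparison summarised in the two figure panels.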


Page 14: Allelic Imbalance for Pre-capture Whole Exome Sequencing

... they would have lower coverage

Allelic Bias?

Asan, Xu Y et al. Genome Biol. 2011 PMID: 21955857

[Figure: coverage for Illumina and NimbleGen; variant types: INDELs, SNPs; groups: Com, Post, Pre]
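The coverage comparison in the figure can be approximated by grouping read depth at variant sites by call set and variant type, then comparing a robust summary such as the median. This is a sketch under the assumption that depth per site has already been extracted; the record layout, the function name, and the depths are hypothetical, and the group labels are reused from the figure as plain strings.

```python
from collections import defaultdict
from statistics import median


def median_depth_by_group(records):
    """Median read depth per (group, variant_type).

    records is an iterable of (group, variant_type, depth) tuples, where
    group is a call-set label such as "Pre" or "Post".
    """
    depths = defaultdict(list)
    for group, variant_type, depth in records:
        depths[(group, variant_type)].append(depth)
    return {key: median(values) for key, values in depths.items()}


if __name__ == "__main__":
    # Hypothetical depths, for illustration only.
    records = [
        ("Pre", "SNP", 40), ("Pre", "SNP", 60),
        ("Post", "SNP", 55), ("Post", "SNP", 45),
        ("Pre", "INDEL", 20), ("Post", "INDEL", 35),
    ]
    for key, value in sorted(median_depth_by_group(records).items()):
        print(key, value)
```

A consistently lower median for one variant type in the pre-capture groups would match the trend toward imbalanced indels noted in the description.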


Page 15: Allelic Imbalance for Pre-capture Whole Exome Sequencing

1. We (and others) did not detect any obvious allelic imbalance; however, no one has tested samples with really rare alleles (e.g. low cellularity in cancer).

2. To be on the safe side (BACK TO THE FUTURE): we go for post-capture whole-exome sequencing.

Conclusion


Page 16: Allelic Imbalance for Pre-capture Whole Exome Sequencing

Thank you

CMIS
Denis C. Bauer
Exon Capture Comparison
t +61 2 9325 3174
e [email protected]
www.csiro.au/cmis

Institute for Molecular Bioscience, UQ
Timothy Bailey (MEME)

School of Chemistry and Molecular Biosciences, UQ
Mikael Bodén (Machine Learning)

Queensland Brain Institute, UQ
Vikki Marshall
Joon-Yong An
Sam Lukowski
Chikako Ragan (NorahDesk)

Garvan Institute, UNSW
John Mattick
Fabian Buske (Triplexator)