1
30 Sept. 2010
Genome Sciences Centre
BC Cancer Agency, Vancouver, BC, Canada
Malachi Griffith
ALEXA-Seq analysis reveals breast cell type specific mRNA isoforms
www.AlexaPlatform.org
2
In most genes, transcript diversity is generated by alternative expression
Types of alternative expression
Gene expression
3
Transcript variation is important to the study of human disease
• Alternative expression generates multiple distinct transcript variants from most human loci
• Specific transcript variants may represent useful therapeutic targets or diagnostic markers
(Venables, 2006)
4
Massively parallel RNA sequencing
• Tissues/Cell Lines: Luminal, Myoepithelial, hESCs, vHMECs
• Isolate RNAs
• Generate cDNA, fragment, size select, add linkers
• Sequence ends
– 263 million paired reads
– 21 billion bases of sequence
• Map to genome, transcriptome, and predicted exon junctions
• Discover isoforms and measure abundance
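The mapping step relies on a set of predicted exon junctions. The slides do not say how that set is built; the sketch below assumes, purely for illustration, that every exon is paired with every downstream exon of the same gene. The function name and the 10-exon gene are hypothetical.

```python
from itertools import combinations

def predicted_junctions(n_exons):
    """Pair every exon with every downstream exon: (donor, acceptor) numbers.

    Assumption: all 5'->3' pairings within one gene are candidates.
    """
    return list(combinations(range(1, n_exons + 1), 2))

# Hypothetical gene with 10 exons
junctions = predicted_junctions(10)

# Junctions joining non-adjacent exons imply skipping of the exons in between
skipping = [(d, a) for d, a in junctions if a - d > 1]
labels = [f"E{d}-E{a}" for d, a in skipping]

print(len(junctions), "candidate junctions,", len(skipping), "imply exon skipping")
```

Under this assumption the junction count grows as n(n-1)/2 per gene, which would be one way to end up with far more junctions than exons.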
5
Pipeline overview
6
What is an ALEXA-Seq sequence ‘feature’?
Summary of features for human: ~4 million total (14% ‘known’)
• 37k genes
• 62k transcripts
• 278k exons
• 2,210k exon junctions
• 407k alternative exon boundaries
• 560k intron regions
• 227k intergenic regions
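As a quick check on the "~4 million" total, the per-type counts above can be summed. This sketch assumes genes and transcripts are counted as features alongside the other types; the dictionary name is illustrative.

```python
# Feature counts from the slide, in thousands
feature_counts_k = {
    "genes": 37,
    "transcripts": 62,
    "exons": 278,
    "exon junctions": 2210,
    "alternative exon boundaries": 407,
    "intron regions": 560,
    "intergenic regions": 227,
}

total_k = sum(feature_counts_k.values())
junction_share = feature_counts_k["exon junctions"] / total_k

print(f"Total features: {total_k / 1000:.2f} million")
print(f"Exon junctions: {junction_share:.0%} of all features")
```

The sum comes to about 3.8 million, with exon junctions making up more than half of it.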
7
Data analyzed to date
• ALEXA-Seq processing: 19 projects – REMC + 18 others
• 105 libraries (200+ lanes)
• 3.9 billion paired-end reads
• 36-mers to 75-mers
8
Output
• Expression, differential expression and alternative expression values for 3.8 million features for each library processed
• Library quality analysis
• Number of features expressed (above background)
– Genes, transcripts, exon regions, junctions, etc.
• Differential gene expression
– Ranked lists
• Alternative expression
– Ranked lists
– Alternative isoforms involving exon skipping, alternative transcript initiation sites, etc.
– Known or predicted novel isoforms
• Candidate peptides
– Ranked lists
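The slides do not define how alternative expression values are calculated. One commonly used idea is a splicing-index style score: the log2 change of a feature between two conditions, minus the log2 change of its gene. The sketch below illustrates that idea only; the function names, the pseudocount and the example values are assumptions, not the ALEXA-Seq formula.

```python
import math

def log2_ratio(a, b, pseudocount=1.0):
    """log2 of a/b, with a pseudocount (assumed) to avoid division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))

def splicing_index(feat_a, gene_a, feat_b, gene_b, pseudocount=1.0):
    """Feature-level log2 change minus gene-level log2 change (A vs. B)."""
    return (log2_ratio(feat_a, feat_b, pseudocount)
            - log2_ratio(gene_a, gene_b, pseudocount))

# Hypothetical values: the exon drops 8-fold while the gene is unchanged
exon_lost = splicing_index(feat_a=3, gene_a=63, feat_b=31, gene_b=63)

# Hypothetical values: exon and gene change together (differential, not alternative)
tracks_gene = splicing_index(feat_a=7, gene_a=31, feat_b=15, gene_b=63)

print(exon_lost, tracks_gene)
```

A score near zero means the feature simply follows its gene (DE); a large positive or negative score flags a candidate AE event.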
9
ALEXA-Seq data browser (using REMC analysis as an example)
• Goals
– Visualization, interpretation, design of validation experiments, distribute results to internal/external collaborators
• What kinds of questions does ALEXA-Seq allow us to ask/answer?
• http://www.alexaplatform.org/alexa_seq/Breast/Summary.htm
10
Is the RNA-Seq library suitable for alternative expression analysis?
• Library summary
• Read quality
• Tag redundancy
• End bias
• Mapping rates
• Signal-to-noise
• hnRNA & gDNA contamination
• Features detected
11
Is my favorite gene expressed? Alternatively expressed?
12
What are the most highly expressed genes, exons, etc. in each library?
• Expression
• Differential expression
• Alternative expression
• Provided for each feature type (gene, exon, junction, etc.)
• Ranked lists of events
13
e.g. most highly expressed genes
14
What are the top DE and AE genes for each tissue comparison?
• Candidate genes
• Each comparison
• DE or AE events
• Gains or Losses
15
Summary page for vHMECs vs. Luminal
16
Candidate features gained in vHMECs
CD10
vHMECs vs. Luminal
17
Which exons/junctions and corresponding peptides might be suitable for antibody design?
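To make the peptide question concrete: a candidate peptide for a junction can be obtained by joining the end of the donor exon to the start of the acceptor exon and translating across the boundary. This is a minimal sketch using the standard genetic code; the function names and the example sequences are hypothetical, and the slides do not describe how ALEXA-Seq derives its candidate peptides.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate complete codons of a DNA sequence; '*' marks a stop codon.

    Raises KeyError on characters other than A, C, G, T.
    """
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def junction_peptides(donor_end, acceptor_start):
    """Translate the junction-spanning sequence in all three forward frames."""
    spanning = donor_end + acceptor_start
    return {frame: translate(spanning[frame:]) for frame in range(3)}

# Hypothetical sequences flanking an exon-exon junction
peptides = junction_peptides("ATGGCC", "AAATGA")
for frame, peptide in peptides.items():
    print(frame, peptide)
```

In practice the reading frame would come from the known transcript annotation; all three frames are shown here because a novel junction may not have one.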
18
Candidate peptides gained in vHMECs
vHMECs vs. Luminal
19
Example housekeeping gene (Actin; no change)
20
CD10 (used to sort myoepithelial cells)
Myoepithelial & vHMECs
Luminal
422-fold higher in Myoepithelial than Luminal
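Fold changes of this size are often easier to compare on a log2 scale. A small sketch of the conversion; the function name is hypothetical, and the slide gives only the 422-fold figure, not the underlying expression values.

```python
import math

def log2_fold_change(fold):
    """Convert a linear fold change (A / B) to log2 scale."""
    return math.log2(fold)

cd10_fold = 422  # Myoepithelial vs. Luminal, from the slide
cd10_log2 = log2_fold_change(cd10_fold)

# The reverse comparison (Luminal vs. Myoepithelial) only flips the sign
reverse_log2 = log2_fold_change(1 / cd10_fold)

print(f"log2 fold change: {cd10_log2:.2f}")
```

On this scale a 422-fold difference is between 8 and 9 doublings.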
21
CD227 (used to sort luminal epithelial cells)
Myoepithelial
Luminal
22
Differential gene expression of CASP14 (Caspase 14 gained in vHMECs)
23
Novel skipping of PTEN exon 6
24
Exon 12 skipping of DDX5 (p68)
25
Tissue specific isoforms of CA12
Luminal
Myoepithelial vHMECs
26
Alternative first exons of INPP4B
27
Alternative first exons of SERPINB7
28
FERM domain containing proteins are alternatively expressed *
* (FRMD6, FRMD4A, FRMD4B are AE) (FRMD3, FRMD8 are DE)
29
Novel isoforms observed only in vHMECs
E6-E10 E7-E10
30
How reliable are predictions from ALEXA-Seq?
• Are novel junctions real?
– What proportion validate by RT-PCR and Sanger sequencing?
• Are differential/alternative expression changes observed between tissues accurate?
– How well do DE values correlate with qPCR?
• To answer these questions we performed ~400 validations of ALEXA-Seq predictions from a comparison of two cell lines…
31
Validation (qualitative)
33 of 189 assays shown. Overall validation rate = 85%
32
Validation (quantitative)
qPCR of 192 exons identified as alternatively expressed by ALEXA-Seq
Validation rate = 88%
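The qualitative and quantitative rates can be pooled by weighting each by its number of assays. The exact counts of validated assays are not given on the slides, so this sketch works from the rounded percentages and is only approximate.

```python
# (number of assays, reported validation rate) from the two validation slides
assays = {
    "qualitative (RT-PCR / Sanger)": (189, 0.85),
    "quantitative (qPCR)": (192, 0.88),
}

total_assays = sum(n for n, _ in assays.values())
pooled_rate = sum(n * rate for n, rate in assays.values()) / total_assays

print(f"{total_assays} assays, pooled validation rate of about {pooled_rate:.1%}")
```

The 381 assays match the "~400 validations" mentioned earlier, and the pooled rate lands within a point of the overall figure given in the conclusions.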
33
Conclusions
• ALEXA-Seq approach provides comprehensive global transcriptome profile
– Input: paired-end RNA sequence data
– Output: expression, differential expression, alternative expression, candidate peptides, etc.
• Detection of both known and novel isoforms
– Subset that differ between conditions
• Predictions are highly accurate
– 86% validation rate by RT-PCR, qPCR and Sanger sequencing
• www.AlexaPlatform.org
34
Acknowledgements
Supervisor
Marco Marra
Committee
Joseph Connors
Stephane Flibotte
Steve Jones
Gregg Morin
Bioinformatics
Obi Griffith
Ryan Morin
Rodrigo Goya
Allen Delaney
Gordon Robertson
Richard Corbett
Sequencing
Martin Hirst
Thomas Zeng
Yongjun Zhao
Helen McDonald
Laboratory
Trevor Pugh
Tesa Severson
5-FU resistance
Michelle Tang
Isabella Tai
Marco Marra
Multiple Myeloma
Rodrigo Goya
Marco Marra
Neuroblastoma
Olena Morozova
Marco Marra
Morgen
Pamela Hoodless
Jacquie Schein
Inanc Birol
Gordon Robertson
Shaun Jackman
Iressa and Sutent
Obi Griffith
Steven Jones
Lymphoma
Ryan Morin
Marco Marra
Griffith M, Griffith OL, Morin RD, Tang MJ, Pugh TJ, Ally A, Asano JK, Chan SY, Li I, McDonald H, Teague K, Zhao Y, Zeng T, Delaney AD, Hirst M, Morin GB, Jones SJM, Tai IT, Marra MA. Alternative expression analysis by RNA sequencing. In review (Nature Methods).