DESCRIPTION
Gene function and regulation inside mammalian cells occur spatially and temporally within the context of the local microenvironment. Each individual cell is at a particular stage of gene expression, which defines specific cellular functions and phenotypes such as cell growth, proliferation, and interactions with other cells. A comprehensive molecular characterization of individual cells will help uncover the structure and dynamics of the cell lineage tree within a tissue or organ, in health and in disease, thus leading to a leapfrog advance in biology and medicine. This talk will focus on recent developments in single-cell transcriptome methodologies and their applications in cancer and stem cell research. The criteria for effective single-cell transcriptome analysis are (1) measuring gene expression reliably and (2) profiling a large number of individual cells cost-effectively. This talk will also discuss efforts toward novel in-situ sequencing platforms that could carry out targeted expression analysis of hundreds to thousands of genes in millions of individual cells simultaneously, either in tissue at single-cell spatial resolution or in a heterogeneous cell population in tissue culture.
© 2013 Illumina, Inc. All rights reserved.
High-Resolution Transcriptome Analysis: One Cell at a Time
AMATA 2013
Queensland, Australia
October 16, 2013
Jian-Bing Fan
Senior Director, Scientific Research
2
All junctions are covered uniformly in RNA-Seq
The Intuitive Beauty of RNA-Seq Data
3
RNA-Seq has evolved in 5 years
New methods: Stranded vs. Non-stranded
– New Stranded RNA Prep kits
New methods: Poly-A vs. Total RNA
– RiboZero kits method of choice for rRNA reduction
– Total RNA methods reveal ncRNAs and allow “RIN independent” preps
Lower Input Levels
– The standard input level for all TruSeq RNA kits today is only 100 ng total RNA
Methods for studying highly degraded RNA
– Can sequence RNA from FFPE samples
Single Cell RNA Sequencing Methods
4
Cellular heterogeneity
– What is a cell type?
– How many cell types are there?
Non-symptomatic somatic mutations
– Cells at terminal differentiation contain “substantial” variations
Development and cellular differentiation
– Cell lineage
– Reprogramming
Metagenomes
Circulating cells (liquid biopsy)
– CTC
– Stem cells
– Fetal cells
Why single cells
5
Single cell transcriptional landscapes
6
Unbiased cell-type discovery
Sten Linnarsson, MBB, Mol Neuro
7
STRT (single-cell tagged reverse transcription)
Based on template-switching at 5’ of mRNA
Barcoding already at RT step, pooling before amplification
Sequence ~50 bp from 5’ end of mRNA (= TSS)
Highly multiplexed: 96 cells at a time
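Because cells are barcoded at the RT step and pooled before amplification, reads have to be assigned back to cells after sequencing. The following is a minimal sketch of that demultiplexing step, not the STRT pipeline itself. It assumes the barcode is the first 5 bases of each read; the barcode sequences, cell names, and reads are invented for illustration.

```python
from collections import defaultdict

# Hypothetical barcode-to-cell table; the real STRT barcodes are not given in the slides.
BARCODES = {"ACGTA": "cell_01", "TGCAT": "cell_02"}
BARCODE_LEN = 5  # assumed barcode length and position (start of the read)


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by their leading barcode and strip the barcode from each read."""
    per_cell = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        cell = barcodes.get(tag)
        if cell is None:
            # Keep reads with unknown barcodes intact so they can be inspected later.
            per_cell["unassigned"].append(read)
        else:
            per_cell[cell].append(insert)
    return dict(per_cell)


reads = ["ACGTAGGGTTTACC", "TGCATGGGAACCTT", "ACGTAGGGCCCAAA", "NNNNNGGGTTTACC"]
by_cell = demultiplex(reads)
```

A production version would typically also tolerate one mismatch in the barcode; exact matching keeps the sketch short.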
Sten Linnarsson, MBB, Mol Neuro
8
Sten Linnarsson, MBB, Mol Neuro
Reverse transcription, with TdT activity adding Cs
Template switching, PCR
Fragmentation, retaining 5’ end
P2 adapter
P1 adapter (library PCR)
Finished library
STRT (single-cell tagged reverse transcription)
9
Sten Linnarsson, MBB, Mol Neuro
Reproducibility
[Two scatter plots, log-scaled axes from 1 to 10,000. Synthetic mRNA: number of molecules (single well) vs. number of molecules (single well), R² = 0.98. ES cells: mRNA molecules (ES cell #1) vs. mRNA molecules (ES cell #2), R² = 0.97.]
10
Distinguish cell types by clustering
1. 96 individual cells, representing 3
different cell types were profiled.
2. Transcripts from each cell were
tagged by a short 5-base code
(during RT) and pooled from 96 cells
for amplification and made into
sequencing library for mRNA-Seq.
3. Cell neighborhood was calculated
based on individual cell expression
profiles.
4. The result is a set of clusters of
mutually similar cells, which
reflected the true identity of cells
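The neighborhood-and-cluster idea in steps 3 and 4 can be sketched as follows. This is a simplified stand-in, not the published method: it links two cells when the Pearson correlation of their expression profiles exceeds a threshold and reports the connected components as clusters. The threshold, cell names, and expression values are all illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_cells(profiles, min_corr=0.8):
    """Link cells with correlation >= min_corr; return connected components."""
    names = list(profiles)
    neighbors = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_corr:
                neighbors[a].add(b)
                neighbors[b].add(a)
    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        # Depth-first walk over the neighborhood graph from this cell.
        stack, members = [start], set()
        while stack:
            cell = stack.pop()
            if cell in members:
                continue
            members.add(cell)
            stack.extend(neighbors[cell] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


# Invented profiles over four genes for two ES cells and two fibroblasts.
profiles = {
    "ES_1": [9.0, 8.5, 0.5, 0.2],
    "ES_2": [8.7, 9.1, 0.3, 0.6],
    "MEF_1": [0.4, 0.6, 9.2, 8.8],
    "MEF_2": [0.7, 0.2, 8.9, 9.3],
}
clusters = cluster_cells(profiles)
```

With these toy profiles the two ES cells end up in one cluster and the two fibroblasts in another, which mirrors the claim that clusters of mutually similar cells recover cell identity.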
Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Islam S,
Kjällquist U, Moliner A, Zajac P, Fan JB, Lönnerberg P, Linnarsson S. Genome Research. 2011.
Sten Linnarsson, Karolinska Inst
Embryonic stem cells
Embryonic fibroblasts (MEF)
Neuroblastoma
(Neuro2A)
11
Gene expression mapped on the
cellular landscape. The number of hits
to each gene, normalized to
transcripts per million (t.p.m.)
sequencing reads is shown on a
logarithmic color scale (inset, upper
left). The left column shows
housekeeping genes selected from a
range of average t.p.m. levels. The
middle column shows genes known as
ES cell markers. The right column
shows genes that were determined in
this study to be preferentially
expressed in Neuro2A.
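The normalization described in this caption can be illustrated with a short sketch: per-gene hit counts in one cell are scaled so that they sum to one million, then log-transformed for display. The hit counts below are invented, and the pseudocount of 1 used before taking the logarithm is an assumption made here to handle genes with zero hits.

```python
import math


def to_tpm(hits):
    """Scale per-gene hit counts so that they sum to one million."""
    total = sum(hits.values())
    if total == 0:
        raise ValueError("no mapped hits for this cell")
    return {gene: count * 1_000_000 / total for gene, count in hits.items()}


def log_scale(tpm, pseudocount=1.0):
    """log10(t.p.m. + pseudocount); the pseudocount is an assumption for zeros."""
    return {gene: math.log10(value + pseudocount) for gene, value in tpm.items()}


# Invented hit counts for a single cell.
hits = {"Actb": 500, "Pou5f1": 300, "Sox2": 200, "Nefl": 0}
tpm = to_tpm(hits)
log_tpm = log_scale(tpm)
```

No gene-length correction appears here because the reads are 5' end tags, so each transcript contributes hits independent of its length.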
Cell type specific expression pattern
12
Single-cell transcriptional profiling
13
Clontech SMARTer ultra low RNA kit
for Illumina sequencing
14
Sequencing the transcriptome of a single cell
Sort Cells -> Smart-Seq Amplification -> Illumina Library Prep -> NGS Sequencing
[Figure panels: cDNA, labeled Good and Bad]
Cells    RNA
1        0.01 ng
10       0.1 ng
100      1 ng
1000     10 ng
10000    100 ng
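The table implies roughly 10 pg of total RNA per cell (1 cell, 0.01 ng). A small sketch of the conversion in both directions, assuming that figure holds; actual RNA content varies by cell type, so the constant is only a rule of thumb taken from this slide.

```python
PG_PER_CELL = 10.0  # implied by the table: 1 cell ~ 0.01 ng = 10 pg total RNA


def expected_rna_ng(n_cells, pg_per_cell=PG_PER_CELL):
    """Approximate total RNA (ng) from a given number of cells."""
    return n_cells * pg_per_cell / 1000.0


def cells_for_input(input_ng, pg_per_cell=PG_PER_CELL):
    """Approximate number of cells needed to reach a given RNA input (ng)."""
    return input_ng * 1000.0 / pg_per_cell


table = {n: expected_rna_ng(n) for n in (1, 10, 100, 1000, 10000)}
```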
15
SMARTer™ technology overview
Key aspects of SMARTer™ protocol:
switching mechanism at 5’ end of
RNA template
Single tube, single enzyme cDNA
synthesis
SMARTer oligo provides increased
template switching efficiency of RT
Minimal handling of starting material
lowers the probability of RNA
degradation
Enrichment for full-length cDNA
transcripts
16
DAY TWO
Workflow overview
Total RNA
SMART cDNA Synthesis
Full-length ds cDNA Amplification
Covaris
End Repair
A tailing
Adapter ligation
PCR Amplification
1 day
Total RNA
SMART cDNA Synthesis
Full-length ds cDNA Amplification
Nextera Tagmentation
PCR Amplification
• ~5 hours
• Automatable SPRI purification
• < 2 hours
• Automatable SPRI purification
17
Primary sequencing metrics
[Bar chart: % unique reads, % mapped reads, % rRNA, and gene count for replicates 1 and 2 at 10 ng, 1 ng, 0.1 ng, 0.05 ng, and 0.01 ng input; axis scales 0 to 120 and 0 to 18,000]
18
Reproducibility with various amounts of input RNA
Scatter plots comparing gene counts (i.e., log2 RPKM values) for replicate
samples prepared using 10 ng, 1 ng, and 0.1 ng of mouse brain total RNA
Input levels represent the amount of RNA obtained from ~500, 50, and 5
cells, respectively
Reproducibility typically decreases as the input amount decreases
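For readers unfamiliar with the metric, the sketch below shows how log2 RPKM values are computed and how replicate agreement can be summarized with a Pearson correlation. It is an illustration under stated assumptions: the read counts, gene lengths, library size, and the pseudocount of 1 are all invented, and the slides do not say which correlation statistic was used for these plots.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """log2(RPKM + pseudocount); the pseudocount is an assumption for zero counts."""
    return math.log2(rpkm(read_count, gene_length_bp, total_mapped_reads) + pseudocount)


def pearson_r(x, y):
    """Pearson correlation between two replicate vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented gene lengths and read counts for two replicate libraries.
gene_lengths_bp = [2000, 1500, 3000, 1000]
rep1_counts = [200, 30, 900, 5]
rep2_counts = [180, 45, 950, 2]
library_size = 1_000_000

rep1 = [log2_rpkm(c, l, library_size) for c, l in zip(rep1_counts, gene_lengths_bp)]
rep2 = [log2_rpkm(c, l, library_size) for c, l in zip(rep2_counts, gene_lengths_bp)]
r = pearson_r(rep1, rep2)
```

Lower inputs sample fewer molecules per gene, so counts for weakly expressed genes fluctuate more between replicates and the correlation drops.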
1 ng 0.1 ng 10 ng
19
Sequencing coverage of SMARTer ultra low library
724 genes analyzed for average coverage across the entire length of
the transcripts
The graphs show consistent results between the 1 ng, 0.1 ng, 0.5 ng
and 0.01 ng input amount of mouse brain total RNA
[Plot: base coverage vs. % distance from 5' end]
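A coverage-versus-position curve of this kind can be built by binning each transcript's per-base coverage by relative distance from the 5' end and averaging across transcripts. The sketch below shows one way to do that; the bin count and the toy coverage vectors are assumptions, and the slides do not describe how the actual curves were computed.

```python
def binned_coverage(per_base, n_bins=10):
    """Mean coverage in n_bins equal-width bins from the 5' end to the 3' end."""
    length = len(per_base)
    if length < n_bins:
        raise ValueError("transcript shorter than the number of bins")
    profile = []
    for i in range(n_bins):
        start = i * length // n_bins
        end = (i + 1) * length // n_bins
        chunk = per_base[start:end]
        profile.append(sum(chunk) / len(chunk))
    return profile


def average_profile(transcripts, n_bins=10):
    """Average the binned profiles of several transcripts, bin by bin."""
    profiles = [binned_coverage(t, n_bins) for t in transcripts]
    return [sum(column) / len(profiles) for column in zip(*profiles)]


# Two invented transcripts: one uniformly covered, one covered only in its 3' half.
uniform = [10] * 100
three_prime_biased = [0] * 50 + [20] * 50
profile = average_profile([uniform, three_prime_biased])
```

A flat averaged profile indicates uniform coverage along transcripts; a rising or falling profile indicates 3' or 5' bias.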
20
Number of genes retained: 705
Correlation (R): 0.942
Slope: 0.913
Number of genes retained: 581
Correlation (R): 0.856
Slope: 0.754
[Two scatter plots: log2 qPCR ratio (brain vs UHR) against log2 sequencing count ratio (brain vs UHR); both axes span -10 to 10]
MAQC UHR/Brain
1ng Total RNA
Accuracy of SMARTer ultra low compared to Taqman
MAQC UHR/Brain
0.1ng Total RNA
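The correlation and slope reported for each panel are the standard summary of a straight-line fit between the two sets of log2 ratios. A minimal sketch of that calculation, assuming an ordinary least-squares fit of the qPCR ratio on the sequencing ratio; the data points are invented and the fitting method is an assumption, since the slides do not state it.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Invented log2 ratios (brain vs UHR) for five genes.
seq_log2_ratio = [-4.0, -2.0, 0.0, 2.0, 4.0]
qpcr_log2_ratio = [0.9 * v for v in seq_log2_ratio]
slope, intercept, r = fit_line(seq_log2_ratio, qpcr_log2_ratio)
```

A slope near 1 means fold changes are neither compressed nor expanded relative to qPCR, while R measures how tightly the genes follow the line.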
21
Performance summary
Sensitive cDNA synthesis technology combined with Illumina next-
generation sequencing
Single-tube protocol, robust library generation starting from picogram
quantities of total RNA
High mapping rate, wide dynamic range, accurate gene
quantification, and uniform transcript coverage
The SMARTer kit has been used and validated by more than 100
labs around the world
Fluidigm C1 Single-Cell Autoprep system has been customized for
SMARTer assay
22
Example 1:
Gene-expression “landscape” of
hematopoietic stem cells (HSCs)
23
Transcriptional ‘architecture’ of the first steps of the
human hematopoietic hierarchy
The transcriptional architecture of early human hematopoiesis identifies multilevel control of
lymphoid commitment. Elisa Laurenti, Sergei Doulatov, Sasan Zandi, Ian Plumb, Jing Chen, Craig April,
Jian-Bing Fan & John E Dick. Nature Immunology. 2013.
John Dick, University of Toronto
‘Distances’ between
hematopoietic populations,
as measured by difference in
expression in the
downstream population
relative to that in its
progenitor (over twofold
difference; FDR, <0.05),
overlaid on the present
hierarchical model of human
hematopoietic differentiation.
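One plausible reading of this distance is a count of genes that differ more than twofold between a population and its progenitor at FDR < 0.05. The sketch below implements that reading only; it assumes the fold change may go in either direction and that FDR values have already been computed elsewhere. Gene names, expression values, and FDR values are invented.

```python
def population_distance(progenitor, downstream, fdr, min_fold=2.0, max_fdr=0.05):
    """Count genes with more than min_fold change (up or down) and FDR below max_fdr."""
    count = 0
    for gene, q_value in fdr.items():
        fold = downstream[gene] / progenitor[gene]
        changed = fold > min_fold or fold < 1.0 / min_fold
        if changed and q_value < max_fdr:
            count += 1
    return count


# Invented expression levels (positive, linear scale) and FDR values.
progenitor = {"gene_a": 10.0, "gene_b": 10.0, "gene_c": 20.0, "gene_d": 10.0}
downstream = {"gene_a": 40.0, "gene_b": 15.0, "gene_c": 5.0, "gene_d": 50.0}
fdr = {"gene_a": 0.01, "gene_b": 0.001, "gene_c": 0.04, "gene_d": 0.2}

distance = population_distance(progenitor, downstream, fdr)
```

Here gene_a and gene_c pass both thresholds, gene_b changes too little, and gene_d fails the FDR cutoff, giving a distance of 2.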
24
Example 2:
Single-cell transcriptome analysis of
mammalian cell cycle
25
Single-cell transcriptomes of different cell cycle stages
Expression of Cdt1 and Geminin
[Plot: fluorescence (dR), -5,000 to 40,000, vs. cycle number, 1 to 49; curves labeled G1 and G2]
Li et al, Biotechnol. Adv. 2013
John Zhong, USC
26
Molecular map of cell cycle
John Zhong, USC
Single-cell transcriptomes can be organized by similarity into a molecular
map that reconstructs stepwise cell cycle events at the molecular level
27
Example 3:
NIH single cell analysis program (SCAP)
28
29
The UCSD (PI)/Harvard/Scripps/Illumina team
George
Church Jerold Chun
Kun Zhang
Wei Wang Mostafa Ronaghi
Jian-Bing Fan
TSRI
UCSD
Illumina
Harvard
Samples
Data
Methods
30
NIH Single Cell Analysis Program
Three centers funded from the National Institutes of
Health's Common Fund, through its Single Cell
Analysis Program (SCAP).
– UCSD, USC and UPenn
Single-cell sequencing and in-situ mapping of
mRNA transcripts in human brains:
– Generating total-RNAseq data on 10,000
microdissected single cells or flow-sorted single
nuclei from Human Cortex and to create a 3D
transcriptional map of the human brain.
– Development and optimization of an in-situ RNA
sequencing technology.
– In-situ mapping of ~500 transcripts in 36 cortex
sections, and integration with 10,000 sets of total-
RNAseq data.
– Includes UCSD (Kun Zhang (PI), Wei Wang), Scripps
(Jerold Chun), Harvard (George Church), Illumina
(Jian-Bing Fan, Mostafa Ronaghi)
31
Approach
Sample preparation (TSRI).
– Microdissection of neurons and glia.
– Flow sorting of neuronal and non-neuronal nuclei.
Single-cell total-RNAseq (Illumina & UCSD).
– RNA transcripts +/- A-tails.
– Long and short transcripts.
– Strand-specificity.
– Batch processing in 96-well plates.
RNA in situ sequencing (UCSD, Harvard & Illumina).
– In-situ conversion of single RNA molecules into DNA nanoballs (rolonies).
– In-situ decoding and counting by hybridization or sequencing on automated confocal microscope with customized fluidic devices.
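Decoding by hybridization amounts to reading one color per imaging round at each rolony and looking the resulting color sequence up in a codebook. The sketch below illustrates the counting step under that assumption. The codebook, color names, gene labels, and observations are hypothetical and do not describe the team's actual encoding scheme.

```python
from collections import Counter

# Hypothetical codebook: one color per hybridization round -> transcript identity.
CODEBOOK = {
    ("red", "green", "blue"): "GENE_A",
    ("green", "green", "red"): "GENE_B",
    ("blue", "red", "red"): "GENE_C",
}


def count_transcripts(observations, codebook=CODEBOOK):
    """Decode each rolony's color sequence and count rolonies per transcript."""
    counts = Counter()
    for colors in observations:
        gene = codebook.get(tuple(colors))
        counts[gene if gene is not None else "undecoded"] += 1
    return counts


# Invented color calls for five rolonies over three rounds.
observations = [
    ["red", "green", "blue"],
    ["green", "green", "red"],
    ["red", "green", "blue"],
    ["blue", "blue", "blue"],  # not in the codebook
    ["blue", "red", "red"],
]
counts = count_transcripts(observations)
```

With N rounds and C colors the codebook can distinguish up to C^N targets, which is why a few rounds are enough for a panel of several hundred transcripts.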
32
Single-cell transcriptome sequencing methods
Surani/LifeTech: Full length mRNA (Tang et al. 2009)
STRT: mRNA 5’-end sequencing (Islam et al. 2011)
CEL-seq: mRNA 3’-end sequencing (Hashimshony et al. 2012)
Smart-seq: Full-length mRNA (Ramskold et al. 2012)
Smart-seq2: Full-length mRNA (Picelli et al. 2013)
Total-RNAseq (UCSD/Illumina, being developed)
– Full length
– Strand specific
– mRNAs and ncRNAs
– High throughput
33
Context is important
Murray et al. Nat. Methods, 2008
34
RNA FISH
RNA FISH +
epifluorescent imaging
Barcoded RNA FISH +
STORM
Raj et al. Nat. Methods, 2008
Lubeck et al. Nat. Methods, 2012
35
In situ sequencing for RNA analysis in preserved tissue
and cells
Ke and Nilsson et al. Nat. Methods, 2013
36
Fluorescent in situ sequencing (FISSEQ)
Jay Lee and George Church, Harvard
37
Two sequencing chemistries
Jay Lee and George Church, Harvard
38
Characterization of the 3D RNA-Seq library
The system was able to
sequence the whole
transcriptome in situ in 3D,
mapping over 100,000
reads and 6000 clusters,
detecting mRNA, ncRNA,
and antisenseRNA which
can then strongly indicate
the cell type.
Jay Lee and George Church, Harvard
39
Cancer
– Early diagnosis of cancer
Circulating tumor cells may be present before …
Limited clinical samples and early stage cancers
Heterogeneity in tumors
– Change in clonal population post-treatment
Brain transcriptome
– 3-D transcriptome map of a brain at high resolution
Human cell lineage tree in health and disease (European Commission)
Embryo to Adult
– Accumulation of somatic mutations with cell division
– Stem cell differentiation
– Cellular origin mapping
Fetal cells
Single cell microbes (metagenomes)
Single cell sequencing applications
40
Summary
Single cell transcriptomes provided comprehensive molecular characterization of
individual cells and revealed unique cell types/stages; discovered cell types
correspond to marker-based cell types
Systematic whole-organism cell mapping is feasible
– Millions of single-cell transcriptomes needed
Future technology development and integration
– Isolation, identification & characterization of cells from all organs and systems in
health, disease, & post-mortem
– Molecular characterization of individual cells (e.g. single cell RNA-Seq)
– Platforms: Next-gen sequencing, microfluidics, DNA arrays, & other analyses of
individual cells
– Three-dimensional subcellular transcriptome sequencing in situ
– Real-time measurement
– Computer Science & Systems: Extremely large-scale data capture, analysis,
coalescence & management tools, methods & algorithms, cell lineage analysis &
reconstruction algorithms, interactive data analyses & presentation.
– Mathematics & Statistics
41
Acknowledgements
STRT technology development
Sten Linnarsson (Karolinska Inst)
Saiful Islam (Karolinska Inst)
SMART kit development
Shujun Luo (Illumina)
Gary Schroth (Illumina)
Richard Sandberg (Ludwig Institute for Cancer Research)
Daniel Ramskold (Ludwig Institute for Cancer Research)
Andrew Farmer (Clontech)
HSC and cell cycle projects
John Dick (Ontario Cancer Institute, University of Toronto)
Elisa Laurenti (Ontario Cancer Institute)
John Zhong (University of Southern California)
NIH SCAP
Kun Zhang (PI; UCSD)
Wei Wang (UCSD)
Jerold Chun (Scripps)
Jian-Bing Fan (Illumina)
Mostafa Ronaghi (Illumina)
Jay Lee (Harvard)
George Church (Harvard)
42
Thank You