caleb-packwood
Processing of miRNA samples and primary data analysis
Preparing the starting material
Initial evaluation of small RNA sample on Bioanalyzer
Bioanalyzer small RNA chip
Mature miRNAs are 16-29 bases (usually 22-23 bases)
Library construction
Size selection for miRNA inserts
(PAGE gel, cut & purify)
PCR
[Figure: gel images with size labels 80, 60, 135 and 120]
Sequence on SOLiD
The size-selected, bar-coded libraries are sequenced on the SOLiD 5500.
Reads are single-end, 50 bp.
Target Read Counts for miRNA
The vast majority of miRNA-seq reads (~90%) map successfully to miRNA.
Target read counts depend on how well low-abundance miRNAs need to be resolved.
Large shifts, or shifts in abundant miRNAs, do not necessitate many reads.
We aimed for about 10 million reads per condition, which was achievable for 9 samples on one multiplexed lane.
Only a few miRNAs tend to dominate the population
[Figure: cumulative % of reads vs. number of miRNAs]
80% of reads came from 30 miRNAs; 90% from 54.
About 340 miRNAs were represented by 1000+ reads across conditions in our experiment.
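The dominance figures above can be reproduced from any reads-per-miRNA table by ranking miRNAs by abundance and accumulating their share of the total. The following is a minimal sketch; the function name `mirnas_needed` and the example counts are hypothetical and are not taken from the experiment.

```python
def mirnas_needed(counts, fraction):
    """Return the smallest number of top-ranked miRNAs whose reads
    together reach `fraction` of all reads in `counts`."""
    total = sum(counts.values())
    running = 0
    # Rank miRNAs from most to least abundant and accumulate reads.
    for rank, n in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += n
        if running >= fraction * total:
            return rank
    return len(counts)


# Made-up counts, for illustration only.
example = {"miR-a": 700, "miR-b": 200, "miR-c": 60, "miR-d": 40}
for frac in (0.5, 0.8, 0.95):
    print(frac, mirnas_needed(example, frac))
```

Applied to a real count table, `mirnas_needed(counts, 0.8)` and `mirnas_needed(counts, 0.9)` give the kind of numbers quoted on this slide.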
Treating Raw miRNA Data
Due to the short length of inserts, trimming of adapter sequence is required.
Due to a high level of redundancy, it is often advisable to collapse identical reads to speed alignment.
Unique sequences align only once, rather than aligning the same sequence thousands of times.
Retain count information for quantitation following alignment.
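The trimming and collapsing steps above can be sketched as follows. This is an illustration only, under several assumptions: reads are already plain nucleotide strings (SOLiD color-space data would need color-space-aware tools first), the adapter sequence `ADAPTER` is a hypothetical placeholder, and the 16-29 base length filter simply reuses the mature miRNA size range quoted earlier. Dedicated trimming tools handle mismatches and quality values, which this sketch does not.

```python
from collections import Counter

# Hypothetical 3' adapter, for illustration; use the one from your library kit.
ADAPTER = "CGCCTTGGCCGT"


def trim_adapter(read, adapter=ADAPTER, min_match=6):
    """Cut the read at the first exact adapter match, or at a partial
    adapter prefix (at least min_match bases) at the 3' end."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Look for an adapter prefix hanging off the end, longest first.
    for k in range(min(len(adapter) - 1, len(read)), min_match - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def collapse(reads, min_len=16, max_len=29):
    """Trim every read, keep inserts in the expected size range, and
    count how often each unique insert sequence occurs."""
    counts = Counter()
    for read in reads:
        insert = trim_adapter(read)
        if min_len <= len(insert) <= max_len:
            counts[insert] += 1
    return counts
```

Only the keys of the returned counter need to be aligned; the stored counts are carried along for quantitation afterwards.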
Aligning miRNA reads
Alignment is often performed in two stages:
1st, against a prepared reference containing ONLY known miRNA sequences for the appropriate organism (miRBase or elsewhere).
2nd, against the genome for identification of novel small RNA.
Any typical aligner works well for this purpose: Novocraft, Bowtie(1), BWA, etc.
Other packages, such as miRanalyzer, ease this process and the identification of novel miRNA.
miRanalyzer is available via command line or as a webapp (common organisms).
http://bioinfo5.ugr.es/miRanalyzer/miRanalyzer.php
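The flow of the two-stage scheme above can be sketched with a toy stand-in for the first stage. Note the swap: this uses exact string lookup instead of a real aligner, so it ignores mismatches and length variants that Novocraft, Bowtie or BWA would tolerate. The function name `assign_reads` and its inputs are hypothetical. It only shows how collapsed reads are split into known-miRNA counts and leftovers that would go on to genome alignment.

```python
from collections import Counter


def assign_reads(collapsed, known_mirnas):
    """Stage 1 stand-in: exact-match collapsed reads to known miRNAs.

    collapsed:    {insert sequence: read count}
    known_mirnas: {miRNA name: mature sequence}
    Returns (reads per known miRNA, leftover {sequence: count}).
    """
    by_sequence = {seq: name for name, seq in known_mirnas.items()}
    per_mirna = Counter()
    leftover = {}
    for seq, count in collapsed.items():
        name = by_sequence.get(seq)
        if name is not None:
            per_mirna[name] += count  # restore the collapsed read count
        else:
            leftover[seq] = count     # candidate for stage 2 (genome)
    return per_mirna, leftover
```

The leftover dictionary corresponds to the reads that would be aligned to the genome in the second stage to look for novel small RNA.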
Novel miRNA and Quantitation
Novel identified sequences need to be evaluated for the possibility of forming hairpin structures.
miRanalyzer does this already, scoring novel alignment regions for the possibility of forming miRNAs.
Read count tables (reads per miRNA) are produced for further analysis and comparison.
Novel miRNAs are only really comparable between experiments in which the same species are observed, and are typically kept separate.
Comparison Between Conditions
Normal RNA-seq tools for identifying differential expression from quantitated data tables are the preferred method: DESeq, edgeR, baySeq, limma, etc.
DESeq was used on count tables produced by miRanalyzer (it is also part of the webapp package).
Triplicates from three experimental conditions were compared pairwise for differential expression of miRNA.
p-values are generated from an exact test of change between conditions.
padj values result from Benjamini-Hochberg multiple-testing correction, which controls the FDR (a cutoff of 0.1 is typically applied here).
Output varies depending on the tool used.
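To make the padj column concrete, here is a minimal sketch of the standard Benjamini-Hochberg adjustment. It is not DESeq's own code, and the function name `benjamini_hochberg` is chosen here for illustration. Each p-value is scaled by the number of tests divided by its rank, then the values are made monotone from the largest p-value downward.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, for illustration only.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [p <= 0.1 for p in padj]  # FDR cutoff of 0.1, as on the slide
```

With a cutoff of 0.1, miRNAs whose padj falls at or below the cutoff are reported as differentially expressed.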
Additional tasks
Target database/prediction mining of differentially expressed miRNAs: miRBase, miRanda, TarBase (experimental observations), etc.
Validation of DE of miRNA and targets
Enrichment analysis