Analysis of ribo-seq data for prediction translation efficiency and protein quantity from transcriptomics data. Fedor Kolpakov, Biosoft.Ru, Ltd., Institute of Systems Biology, Ltd., Novosibirsk, Russia. July 6th - 11th 2013, St. Petersburg, Russia


DESCRIPTION

Analysis of ribo-seq data for prediction translation efficiency and protein quantity from transcriptomics data. Biosoft.Ru. Fedor Kolpakov, Biosoft.Ru, Ltd., Institute of Systems Biology, Ltd., Novosibirsk, Russia. July 6th - 11th 2013, St. Petersburg, Russia. Bukharov Aleksandr, Kiselev Ilya.


Page 1: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Analysis of ribo-seq data for prediction translation efficiency and protein quantity from transcriptomics data

Fedor Kolpakov

Biosoft.Ru, Ltd.
Institute of Systems Biology, Ltd.

Novosibirsk, Russia

Biosoft.Ru

July 6th - 11th 2013, St. Petersburg, RUSSIA

Page 2: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Genome-scale model for prediction of synthesis rates of mRNAs and proteins

Initial data: Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M. Global quantification of mammalian gene expression control. Nature, 2011, 473(7347):337-342.

- mouse fibroblasts, parallel metabolic pulse labelling
- simultaneously measured absolute mRNA and protein abundance and turnover for 5000+ genes
- first genome-scale quantitative model for prediction of synthesis rates of mRNAs and proteins
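As a rough illustration of how synthesis rates can be inferred from measured abundance and turnover, the sketch below assumes a two-stage linear kinetic model: mRNA is made at rate Vsr and degraded at rate k_dr, and protein is made at rate k_sp per mRNA and degraded at rate k_dp. Only Vsr is named in these slides; the other symbols, the function names and the example numbers are assumptions chosen for illustration.

```python
import math


def rate_from_half_life(half_life_hours):
    """Convert a half-life (hours) into a first-order rate constant (1/hour)."""
    return math.log(2) / half_life_hours


def steady_state_levels(v_sr, k_dr, k_sp, k_dp):
    """Steady state of dR/dt = v_sr - k_dr*R and dP/dt = k_sp*R - k_dp*P."""
    mrna = v_sr / k_dr
    protein = k_sp * mrna / k_dp
    return mrna, protein


def synthesis_rates(mrna, protein, k_dr, k_dp):
    """Invert the steady state: infer v_sr and k_sp from measured levels."""
    v_sr = mrna * k_dr
    k_sp = protein * k_dp / mrna
    return v_sr, k_sp


if __name__ == "__main__":
    # Illustrative inputs only (not taken from the slides).
    k_dr = rate_from_half_life(9.0)    # mRNA half-life in hours
    k_dp = rate_from_half_life(46.0)   # protein half-life in hours
    v_sr, k_sp = synthesis_rates(mrna=17.0, protein=50000.0, k_dr=k_dr, k_dp=k_dp)
    print(f"v_sr = {v_sr:.2f} mRNA/h, k_sp = {k_sp:.1f} proteins/(mRNA*h)")
```

The steady-state inversion is the simplest case; the later slides discuss why averaging over the cell cycle can give substantially different protein synthesis rates.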

Bukharov Aleksandr, Kiselev Ilya

Page 3: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Experiment design

Page 4: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 5: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 6: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 7: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 8: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Schwanhäusser B., et al., 2011 - Fig. 6: Comparison of synthesis rates of mRNA and proteins assuming the measured levels reflect averages over one cell cycle or steady-state values. For the synthesis rates of mRNA (light gray), the deviation between the two approaches is small, because mRNA half-lives are mostly shorter than the cell cycle time. For protein synthesis (dark gray), the differences are substantial; they can differ by more than one order of magnitude.

Page 9: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Schwanhäusser B., et al., 2011 - Fig. 6: Comparison of synthesis rates of mRNA and proteins assuming the measured levels reflect averages over one cell cycle or steady-state values. For the synthesis rates of mRNA (light gray), the deviation between the two approaches is small, because mRNA half-lives are mostly shorter than the cell cycle time. For protein synthesis (dark gray), the differences are substantial; they can differ by more than one order of magnitude.

P: protein; exp: population mean; ss: steady state.

Page 10: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

“They do not take into account that gene expression in mammalian cells is non-continuous. In addition, the non-uniform age distribution of cells in culture as described in 19, 23 is neglected, since this effect is expected to be small compared to the deviation obtained by neglecting the cell cycle.”

Schwanhäusser B., et al., 2011, supplementary materials

Page 11: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Agent-based model

each cell is an agent

4247 blocks for protein synthesis

Page 12: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Cell cycle phase | Percent of cells in phase | Rate of transcription
G1 | 50.8% | Vsr
S | 22.9% | 0.7*Vsr
G2 | 13% | 2*Vsr
M | 3.3% | 0
G0 | 10% | Vsr

Tab. 1. Parameters of the cell cycle

Numerical experiment.

The initial population is 200 cells, which divide over 108 hours. The average number of protein molecules per cell was calculated. This experiment was repeated for 4247 proteins.
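The numerical experiment can be sketched as a small agent-based simulation for a single gene. This is not the BioUML model itself: it assumes simple Euler integration of a two-stage kinetic model, phase durations proportional to the phase fractions in Tab. 1, a hypothetical cell cycle length of 27.5 hours, equal splitting of molecules at division, and G0 cells that never re-enter the cycle. All function and parameter names are illustrative.

```python
import random

# Transcription-rate multipliers (relative to Vsr) and phase fractions from Tab. 1.
PHASE_RATE = {"G1": 1.0, "S": 0.7, "G2": 2.0, "M": 0.0, "G0": 1.0}
PHASE_FRACTION = {"G1": 0.508, "S": 0.229, "G2": 0.13, "M": 0.033, "G0": 0.10}
CYCLE = ["G1", "S", "G2", "M"]


def phase_durations(cycle_hours):
    """Assumption: time spent in a phase is proportional to its cell fraction."""
    cycling = sum(PHASE_FRACTION[p] for p in CYCLE)
    return {p: cycle_hours * PHASE_FRACTION[p] / cycling for p in CYCLE}


def simulate(v_sr, k_dr, k_sp, k_dp, n_cells=200, hours=108.0,
             dt=0.25, cycle_hours=27.5, seed=0):
    """Return (mean protein copies per cell, final number of cells)."""
    rng = random.Random(seed)
    durations = phase_durations(cycle_hours)
    mrna0 = v_sr / k_dr                # start every cell at the steady state
    protein0 = k_sp * mrna0 / k_dp
    phases = list(PHASE_FRACTION)
    weights = [PHASE_FRACTION[p] for p in phases]
    cells = []
    for _ in range(n_cells):
        phase = rng.choices(phases, weights)[0]
        age = 0.0 if phase == "G0" else rng.uniform(0.0, durations[phase])
        cells.append({"phase": phase, "age": age,
                      "mrna": mrna0, "protein": protein0})

    for _ in range(int(round(hours / dt))):
        newborn = []
        for cell in cells:
            # Euler step of the kinetics with a phase-dependent transcription rate.
            rate = v_sr * PHASE_RATE[cell["phase"]]
            cell["mrna"] += (rate - k_dr * cell["mrna"]) * dt
            cell["protein"] += (k_sp * cell["mrna"] - k_dp * cell["protein"]) * dt
            if cell["phase"] == "G0":
                continue               # quiescent cells do not progress
            cell["age"] += dt
            if cell["age"] < durations[cell["phase"]]:
                continue
            if cell["phase"] == "M":
                # Division: contents are split equally between two G1 daughters.
                cell["mrna"] /= 2.0
                cell["protein"] /= 2.0
                cell["phase"] = "G1"
                newborn.append({"phase": "G1", "age": 0.0,
                                "mrna": cell["mrna"], "protein": cell["protein"]})
            else:
                cell["phase"] = CYCLE[CYCLE.index(cell["phase"]) + 1]
            cell["age"] = 0.0
        cells.extend(newborn)

    mean_protein = sum(c["protein"] for c in cells) / len(cells)
    return mean_protein, len(cells)
```

Calling `simulate(v_sr, k_dr, k_sp, k_dp)` once per gene, with that gene's rate constants, would mimic repeating the experiment for each of the 4247 proteins; the population mean can then be compared with the steady-state value.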

Page 13: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

The correlation between experimental values and numerical modeling is R = 0.99.

Absolute values also agree: for 81.6% of proteins, the absolute values differ by less than 7%.

Main deviations from experimental values are observed for proteins with extremely low copy numbers, where experimental error can be significant.
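The two agreement measures quoted on this slide can be computed as in the sketch below. It assumes a plain Pearson correlation and a relative difference taken with respect to the measured value; the slides do not say whether values were log-transformed first, so treat both choices, and the function names, as assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def fraction_within(measured, simulated, tolerance=0.07):
    """Fraction of proteins whose simulated value is within `tolerance`
    (relative to the measured value) of the measurement."""
    hits = sum(abs(s - m) <= tolerance * m for m, s in zip(measured, simulated))
    return hits / len(measured)
```

For protein copy numbers spanning several orders of magnitude, computing `pearson_r` on log-transformed values is a common alternative.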

Page 14: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 15: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Ingolia N. et al., 2011

Page 16: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

- The rate of translation is remarkably consistent between different classes of messages (Figures 3D and 3E).

- The kinetics of elongation are independent of length and protein abundance and are the same in secreted proteins, whose translation occurs on the ER surface.

- Translation speed is also independent of codon usage, which is consistent with the absence of pauses at rare codons.

- Although this may be the case for specific examples, they find no evidence for a large effect on the overall rate of elongation.

An important practical implication for the universality of the average rate of elongation is that ribosome footprint density provides a reliable measure of protein synthesis independent of the particular gene being translated.

Ingolia N. et al., 2011
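If footprint density is taken as a proxy for protein synthesis, translation efficiency for a gene is commonly estimated as the ratio of ribosome footprint density to mRNA density over the coding sequence. The sketch below assumes RPKM-style normalisation of read counts; the function names and inputs are illustrative and not taken from the slides.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def translation_efficiency(ribo_count, rna_count, cds_length,
                           ribo_total, rna_total):
    """Footprint density divided by mRNA density for one coding sequence.

    Returns None when the gene has no mRNA-seq reads, because the ratio
    is undefined in that case.
    """
    rna_density = rpkm(rna_count, cds_length, rna_total)
    if rna_density == 0:
        return None
    ribo_density = rpkm(ribo_count, cds_length, ribo_total)
    return ribo_density / rna_density
```

Because both densities are divided by the same coding-sequence length, the ratio depends only on the read counts and the two library sizes.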

Page 17: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

“Our data are consistent with recent work that indirectly infers translation levels from absolute mRNA and protein abundance measurements (Schwanhausser et al., 2011).

Notably, they found that translation was the single largest contributor to protein abundance, highlighting the value of direct measurements of protein synthesis.”

Ingolia N. et al., 2011

R = 0.49

Page 18: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

R = -0.17

R = -0.41

Schwanhausser et al., 2011

Ingolia N. et al., 2011

Page 19: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Current works

1. Database on ribo-seq data

2. Analyses of lncRNA

3. Models of biological pathways involved in translation regulation (for example, mTOR)

4. More predictors for translation efficiency
• protein binding sites
• miRNA binding sites
• …

Page 20: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Initial raw data from the literature and the GEO, SRA and ENCODE databases were systematically collected and uniformly processed using a specially developed workflow (pipeline) for the BioUML platform:
- sequenced reads were aligned to the reference genome using Bowtie;
- peaks were identified using the MACS and SISSR algorithms;
- the obtained peaks were further refined;
- position weight matrices (PWM) were constructed by different methods (ChIPMunk, our own methods);
- ROC curves were calculated to estimate and compare the built PWMs;
- site models (PWMs + thresholds) were constructed for recognition of TF binding sites.
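The last steps of the pipeline above (PWM construction and site recognition with a threshold) can be illustrated with a minimal log-odds PWM. This is a generic sketch, not ChIPMunk or the GTRD method: it assumes pre-aligned sites of equal length, a uniform background and a pseudocount of 1, and all names are hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned DNA sites."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2(counts[base] / total / background)
                    for base in "ACGT"})
    return pwm


def scan(pwm, sequence):
    """Score every window of the sequence; return a list of (score, position)."""
    width = len(pwm)
    return [(sum(pwm[i][sequence[pos + i]] for i in range(width)), pos)
            for pos in range(len(sequence) - width + 1)]


def predict_sites(pwm, sequence, threshold):
    """Site model = PWM + threshold: keep positions scoring at or above it."""
    return [pos for score, pos in scan(pwm, sequence) if score >= threshold]
```

Sweeping `threshold` over a labelled set of positive and negative sequences gives the points of a ROC curve, which is one way to compare matrices built by different methods.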

TFClass database is used as a core for information about transcription factors, their classification and cross-linking with Ensembl.

The BioUML platform provides a web interface for access to the GTRD database: information search, browsing, different data views. The built-in genome browser provides powerful visualisation of ChIP-seq data.

GTRD - Gene Transcription Regulation Database

Page 21: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Prediction of gene expression level by ChIP-seq data

ChIP-seq peaks (MACS) for histones and transcription factor binding sites were extracted from the GTRD database for 2 cell lines: GM12878 and K562.

Machine learning - Random Forest algorithm.

R = 0.72 – 0.77

Page 22: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Ribo-seq experiments

Cell type | Treatment | Chemicals | #Samples | References
K562 (human) | | Cycloheximide | 1 | Batut P, Dobin A, Plessy C, Carninci P, Gingeras TR. «High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression», Genome Res. 2013 Jan;23(1):169-80
HEK293 (human) | High, medium and low Mg buffer | Cycloheximide, harringtonine | 3 | Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS. «The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments», Nat Protoc. 2012 Jul 26;7(8):1534-50
PC3 (human) | Rapamycin, PP242 | Cycloheximide | 12 | Hsieh AC et al. «The translational landscape of mTOR signalling steers cancer initiation and metastasis», Nature. 2012 Feb 22;485(7396):55-61
HEK293 (human) | Two ribosomal populations: free (cytosol) and ER-bound | Cycloheximide | 4 | Reid DW, Nicchitta CV. «Primary role for endoplasmic reticulum-bound ribosomes in cellular translation identified by ribosome profiling», J Biol Chem. 2012 Feb 17;287(8):5518-27

Olga Gluschenko, Ivan Yevshin

Page 23: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Cell type | Treatment | Chemicals | Samples number | References
Embryonic stem cells (mouse) | Control: no drug; experiment: cycloheximide, emetine | Cycloheximide, emetine | 18 | Ingolia NT, Lareau LF, Weissman JS. «Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes», Cell. 2011 Nov 11;147(4):789-802
HeLa (human) | Transfection of miR-1 and miR-155 | Cycloheximide | 12 | Guo H, Ingolia NT, Weissman JS, Bartel DP. «Mammalian microRNAs predominantly act to decrease target mRNA levels», Nature. 2010 Aug 12;466(7308):835-40
Neutrophils (mouse) | miR-223 knockout mouse, wild-type mouse | Cycloheximide | 4 | Guo H, Ingolia NT, Weissman JS, Bartel DP. «Mammalian microRNAs predominantly act to decrease target mRNA levels», Nature. 2010 Aug 12;466(7308):835-40
Embryonic stem cells (mouse) | siLuc or siLin28a | Cycloheximide | 18 | Cho J et al. «LIN28A is a suppressor of ER-associated translation in embryonic stem cells», Cell. 2012 Nov 9;151(4):765-77
Fibroblasts (mouse) | Control: DMSO; treatment: rapamycin | No information | 21 | GSE25626
Neurons (mouse) | PBS, NaCl | Cycloheximide | 4 | GSE40969

Page 24: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Cell type | Treatment | Chemicals | Samples number | References
Human fibroblasts infected by cytomegalovirus | | Cycloheximide, harringtonine | 16 | Stern-Ginossar N et al. «Decoding human cytomegalovirus», Science. 2012 Nov 23;338(6110):1088-93
HEK293 (human) | | Lactimidomycin | 2 | SRA056494 (SRS351807, SRS351808)
Embryonic fibroblast (mouse) | | Cycloheximide, lactimidomycin | 2 | SRA056494 (SRS351809, SRS351810)

Page 25: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Workflow for ribo-seq data analyses

Page 26: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Workflow for ribo-seq data analyses

Page 27: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Workflow for ribo-seq data analyses

Page 28: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia
Page 29: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Model of mTOR pathway

based on the model of

Dimelow R.J. and Wilkinson S.J. Control of translation initiation: a model-based analysis from limited experimental data. J. R. Soc. Interface (2009) 6, 51–61. doi:10.1098/rsif.2008.0221

Page 30: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Lequieu J, Chakrabarti A, Nayak S, Varner JD (2011) Computational Modeling and Analysis of Insulin Induced Eukaryotic Translation Initiation. PLoS Comput Biol 7(11): e1002263. doi:10.1371/journal.pcbi.1002263

Page 31: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Wanted experiment

For the same cell line and conditions:
- CAGE -> transcription start sites
- RNA-seq
  - polyA +/-; nucleus, cytoplasm, whole cell
- ribo-seq
  - harringtonine -> translation start site
  - cycloheximide -> translation efficiency
- protein MS
  - pulse labelled with heavy amino acids (SILAC, left) -> protein abundance and turnover.

Page 32: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Current works

1. Database on ribo-seq data

2. Analyses of lncRNA

3. Models of biological pathways involved in translation regulation (for example, mTOR)

4. More predictors for translation efficiency
• protein binding sites
• miRNA binding sites
• …

Page 33: Fedor Kolpakov Biosoft.Ru, Ltd. Institute of Systems Biology, Ltd. Novosibirsk, Russia

Acknowledgements

Ivan Yevshin
Olga Gluschenko
Eseniya Basmanova
Ruslan Sharipov