Ribonucleoprotein particles: advances and challenges in computational methods
Shlomi Dvir1, Amir Argoetti1 and Yael Mandel-Gutfreund1,2
RNA-binding proteins (RBPs) interact with RNA to form
Ribonucleoprotein Particles (RNPs). The interactions between
RBPs and their RNA partners are traditionally thought to be
mediated by highly conserved RNA-binding domains (RBDs).
Recently, high-throughput studies led to the discovery of
hundreds of novel proteins and domains, of which many do not
follow the classical definition of RNA-binding. Despite
technological innovations, experimental screenings are
currently limited to the detection of specific types of RNPs,
underscoring the importance of computational methods for
predicting novel RBPs and RNA interacting residues and
interfaces. Here, we discuss major challenges in computational
prediction of RBPs and RBDs and outline new strategies to
circumvent current limitations of experimental techniques.
Addresses
1 Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel
2 Department of Computer Science, Technion-Israel Institute of Technology, Haifa 32000, Israel
Corresponding author: Mandel-Gutfreund, Yael ([email protected])
Current Opinion in Structural Biology 2018, 53:124–130
This review comes from a themed issue on Protein–nucleic acid
interactions
Edited by Eric Westhof and Dinshaw Patel
https://doi.org/10.1016/j.sbi.2018.08.002
0959-440X/© 2018 Elsevier Ltd. All rights reserved.
Introduction
Within cells, RNA and proteins assemble into dynamic nucleic-acid protein complexes named Ribonucleoprotein Particles (RNPs) [1]. Traditionally, RNPs are thought to be formed by interactions between RNA and RNA-binding proteins (RBPs) that harbor evolutionarily conserved RNA-binding domains (RBDs). RBDs are the basic units responsible for RNA recognition and binding and act in a combinatorial fashion to perform multiple functions [2]. The RNA Recognition Motif (RRM) [3] is the most abundant RBD in mammalian cells, binding a variety of RNA sequences and structures via diverse modes of recognition [4]. In addition to RRM, other well-conserved classical RBDs exist (for details, see [2]). In 2002, Anantharaman et al. provided the first comprehensive list of ~100 evolutionarily conserved domains involved in RNA metabolism [5]. More recently, Gerstberger et al. extended the list to 799 RBDs that were either experimentally confirmed to bind RNA or found exclusively in well-characterized RNPs, assembling the first census of 1,542 human RBPs [6]. A revolution in the field of RBPs came from high-throughput RNA interactome capture (RIC) studies [7••]. The first experiments in human cell lines revealed hundreds of new RBPs that form complexes with polyadenylated [poly(A)] RNAs [8,9]. Variations of the initial methodology have been applied to screen for RBPs in other cell types and organisms, generating an unprecedented atlas of RBPs (for a recent comprehensive review, see Hentze et al. [7••]). To extend our understanding of protein-RNA recognition, other complementary methods were developed that specify the RNA-binding regions within proteins at peptide-level resolution, for example, RNPxl [10], RBDmap [11] and RBR-ID [12].
Whereas screening studies have significantly broadened the repertoire of RBPs, they mainly enabled the discovery of proteins that bind stable poly(A) RNA (e.g. eukaryotic mRNAs and long non-coding RNAs). RBPs that likely escaped discovery by RIC experiments include proteins that bind non-poly(A) RNAs, proteins that regulate small RNAs, bacterial RNAs, and most eukaryotic organelle RNAs. Recently, two studies established a new methodology to capture RBPs, independent of the polyadenylation state of RNA, taking advantage of click chemistry to capture metabolically labeled RNAs [13,14•]. Notably, although the total number of RBPs did not change radically, the latter studies provided a valuable list of novel proteins, adding to the growing compendium of RBPs generated by manual curation efforts, homology-based computational analysis, and proteome-wide discovery. Despite the increasing number of novel RBPs and RNPs identified, limitations in existing technologies suggest that many RBPs are yet to be discovered. In this review, we discuss the challenges faced by researchers studying RBPs and protein-RNA interactions. We further present an overview of the available state-of-the-art computational approaches for discovering novel RBPs, identifying RNA-binding sites in proteins and predicting modes of binding.
Challenges in predicting RNA-binding proteins
RNA-binding proteins may lack known RNA-binding domains
Consistent with early findings [6], a large fraction of the RBPs identified in screening experiments do not harbor a known RBD [8,9]. This observation was recapitulated in other studies from different cell types and organisms [7••]. Figure 1 depicts a summary of the RBPs detected in screening experiments for six major organisms. Based on their domain content, RBPs were classified as RBD (blue) if they contained at least one known RNA-binding domain, RBD-unknown (green) if they did not harbor a known RBD, and no-Pfam-domain (purple) when no domains could be detected (for a detailed list, see Supplementary data). Notably, the fraction of RBPs that lack known RBDs ranges from 42% in human to 64% in roundworm (Figure 1; green and purple fractions). The observed differences in the fraction of proteins that are classified as RBD-unknown can be attributed to experimental coverage, as well as to the total number of characterized RBDs in each organism. Despite these differences, the high proportion of proteins that lack known RBDs reinforces the complexity and challenges associated with the computational prediction of RNA-binding function in organisms in which experimental data on RBPs is limited.
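The three-way classification described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the set `KNOWN_RBDS`, the function names and the toy proteins are hypothetical, and the Pfam accessions are placeholders standing in for a curated RBD list.

```python
from collections import Counter

# Hypothetical set of Pfam accessions treated as RNA-binding domains (RBDs).
# A real analysis would take this list from a curated census; these entries
# are placeholders for illustration only.
KNOWN_RBDS = {"PF00076", "PF00013", "PF00035"}


def classify_rbp(pfam_domains, known_rbds=KNOWN_RBDS):
    """Return 'RBD', 'RBD-unknown' or 'no-Pfam-domain' for one protein."""
    domains = set(pfam_domains)
    if not domains:
        return "no-Pfam-domain"
    if domains & known_rbds:
        return "RBD"
    return "RBD-unknown"


def category_fractions(proteome):
    """Map each category to its percentage among the given proteins."""
    counts = Counter(classify_rbp(d) for d in proteome.values())
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Toy input: protein name -> Pfam accessions found in it (made-up examples).
toy = {
    "protA": ["PF00076", "PF00400"],
    "protB": ["PF00400"],
    "protC": [],
    "protD": ["PF00013"],
}
fractions = category_fractions(toy)
```

In this toy input, half of the proteins fall into the RBD category and the remainder are split between RBD-unknown and no-Pfam-domain; the outcome depends entirely on which domains are included in the assumed RBD list.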
Figure 1. RBP domain distribution across major eukaryotic kingdoms. Fraction of proteins harboring RNA-binding domains (RBDs) in high-throughput screening studies for RBPs, shown as pie charts for Human, Mouse, Yeast, Fruit fly, Thale cress and Roundworm (categories: RBD, RBD-unknown, No Pfam domain). Protein domains were obtained from Pfam release 31.0. Domains were classified as RNA-binding (RBD) or unknown to function in RNA recognition. See Supplementary data for the definition of RBDs (primarily based on Gerstberger et al. [6]). A protein was defined as RBD (light blue) if it had at least one RNA-binding domain or RBD-unknown if no RBD was found (light green). Proteins for which no Pfam domains could be identified were classified as no-Pfam-domain (purple). Analyzed organisms include Human (Homo sapiens, Taxonomy ID: 9606), Mouse (Mus musculus, Taxonomy ID: 10090), Yeast (Saccharomyces cerevisiae, Taxonomy ID: 559292), Fruit fly (Drosophila melanogaster, Taxonomy ID: 7227), Thale cress (Arabidopsis thaliana, Taxonomy ID: 3702) and Roundworm (Caenorhabditis elegans, Taxonomy ID: 6239). The full list of proteins and their categories is provided in Supplementary data.

As previously reported and shown in Figure 1, many proteins detected in screens for RBPs do not possess a structured Pfam domain [7••]. Moreover, experimentally detected RBPs were shown to be highly enriched in Intrinsically Disordered Regions (IDRs), suggesting that, in many cases, protein-RNA interactions within RNP complexes are mediated by non-structured polypeptide chains [15,16•]. In line with these findings, methods that are dedicated to identifying RBDs have shown that approximately half of the protein regions that directly interact with RNA are mapped to sequences with low amino acid complexity and high disorder propensity [11,12]. While experimental evidence strongly indicates that IDRs are characteristic of RBPs, this feature is by no means unique to RBPs [15]. For example, IDRs are commonly found in the tails of DNA-Binding Proteins (DBPs) and were suggested to facilitate the process of transcription factor-mediated DNA search [17]. Thus, although disorder propensity is a characteristic feature of RBPs, it cannot serve on its own as a criterion for either predicting novel RBPs or identifying RNA binding sites within known RBPs.
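To make the notion of low amino acid complexity concrete, the sketch below scores sliding windows of a sequence by the Shannon entropy of their composition. This is a crude compositional proxy, not a disorder predictor and not the method used in the cited studies; the window size, the threshold and the example sequence are arbitrary assumptions.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the amino-acid composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, window=12, threshold=2.2):
    """Return start indices of windows whose entropy is below the threshold.

    `window` and `threshold` are arbitrary illustrative values, not
    parameters taken from any published tool.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) < threshold:
            hits.append(start)
    return hits


# Made-up sequence: a compositionally diverse stretch followed by an
# RGG-like repeat.
seq = "MKTAYIAKQRQISFVKSHFSRQ" + "RGGRGGRGGRGG"
hits = low_complexity_windows(seq)
```

The repeat at the end of the toy sequence is flagged while the diverse N-terminal stretch is not. As the text notes, such a signal is not specific to RBPs, so it would at most be one feature among several.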
Classical RNA-binding domains do not necessarily bind RNA
Extensive structural studies of protein-RNA complexes revealed general principles that guide RNA recognition via classical RBDs [18]. The accumulation of X-ray crystallography- and NMR-based structural data for the most abundant RBDs in mammalian cells (RRM, KH) provides valuable insights into the diverse and complex modes of interaction adopted by these domain families [3,19]. Notably, while RRM, KH and other classical RBDs have evolved to bind RNA, members of these families were shown to participate in other functions. Some RRM domains (also known as atypical RRMs) have been implicated in mediating protein-protein interactions, rather than binding RNA, while others bind RNA as well as proteins [20]. A well-characterized example is the RRM domain of Y14. This domain directly interacts with the Mago protein, as part of the Exon Junction Complex (EJC), and has no direct contact with RNA [21] (Figure 2a). Examples of several other RRM domains that have been shown to mediate protein-protein interactions are illustrated in Figure 2. Overall, the observation that classical RNA-binding domains do not necessarily bind RNA raises a major question: to what extent can we rely on functional annotation of RNA-binding that is based solely on the presence of classical RBDs?
Figure 2. RRM domains predicted by computational approaches as non-RNA-binding. The RRM domains (in green) were experimentally validated to be involved in protein–protein interactions and were predicted by structure-based computational methods as non-RNA-binding. (a) RRM domain of the Fruit fly Y14 protein in complex with the Mago and Pym proteins (colored in bronze) in the Exon Junction Complex (PDB 1rk8). (b) RRM domain of the human NCBP2 interacting with NCBP1 protein (bronze) within the Cap Binding Complex (PDB 6d0y). (c) The third RRM of the human U2AF2 protein that was solved in complex with an N-terminal SF1 peptide (not shown) (PDB 1opi). (d) RRM domain of U2AF1 from the U2AF1/U2AF2 human complex (PDB 1jmt). (e) The human RBM17 (SPF45) RRM domain known to mediate protein-protein interaction in the spliceosome (PDB 2pe8). (f) The N-terminal RRM domain of SET1 from Saccharomyces cerevisiae (PDB 2j8a). Molecular graphics and analyses were performed with the UCSF Chimera package [50].

Many RBPs act as moonlighting proteins
An intriguing conclusion arising from previous RIC experiments is that many of the detected proteins have other non-RNA-related functions. In many cases, these proteins were not previously shown to bind RNA [7••]. Among the significantly enriched non-RNA-binding functions are glycolytic enzymatic activity and intermediary metabolism, specifically enriched in the yeast [22,23] and Arabidopsis mRNA interactome, respectively [24]. A well-known dual-function protein in the intermediary metabolic pathway is the Iron Regulatory Protein 1 (IRP1) [25]. Other interesting groups of non-classical RBPs include proteins involved in chromatin regulation [12] and double-stranded (ds) DNA-binding [26]. While the enrichment for DNA-related functions in published interactomes may result from contamination, growing evidence supports the existence of proteins that bind DNA as well as RNA, in a specific manner, namely DNA and RNA binding proteins (DRBPs) [27]. Due to the complexity of the problem, it is not trivial to predict dual-binding proteins, and specifically moonlighting proteins, which primarily bind RNA but carry out other functional activities [28]. The dual-binding nature of some RBPs calls for the development of new experimental and computational methods that can uniquely identify the specific functional regions and their interacting partners.
Computational advances in studying ribonucleoproteins
Predicting classical and non-classical RNA-binding proteins and domains
Recent efforts to systematically detect RBPs and RBDs have considerably expanded the repertoire of RBPs, originally generated from low-throughput structural and biochemical studies. Despite these advances, it is likely that many more RBPs await discovery. Among the underrepresented RBPs are proteins that bind non-poly(A) RNAs, lowly expressed RBPs and species-specific RBPs from organisms that have not been thoroughly studied. In addition to experimental characterization, significant attempts have been made to improve on existing computational algorithms to reliably predict RBPs. Such RBP prediction methods can be roughly divided into two groups: sequence-based approaches (e.g. RBPPred [29]) that are trained on evolutionary information and physicochemical properties extracted from primary sequence, and structure-based methods that rely on the availability of the protein three-dimensional structure [30–32]. Other algorithms (e.g. SPOT-Seq-RNA [33]) integrate template-based structure prediction (fold recognition) to infer RNA-binding function from sequence. Notably, most existing sequence-based and structure-based methods for RBP prediction strongly depend on evolutionary conservation and thus are not appropriate for predicting novel RBPs that do not share homology with known RBPs. To account for this, non-homology based methods have been developed and shown to be successful in predicting novel RBPs [31] and in classifying classical RBDs as nucleic-acid or non-nucleic-acid binding [34•]. Figure 2 illustrates six RRM domains that were experimentally confirmed to be involved in protein-protein interactions (such as Y14), all correctly classified as non-RNA-binding by structure-based prediction algorithms (such as [34•]). However, such methods that rely on features extracted from solved structures or structural models would likely fail to predict RBPs that are intrinsically disordered. To overcome this, a new computational algorithm combined the prediction of structured and disordered regions, attempting to predict all RBPs in the human proteome [35]. A different computational approach that relies on experimental protein-protein interaction data, named SONAR (Support vector machine Obtained from Neighborhood Associated RBPs), takes advantage of the strong tendency of RBPs to interact with one another [36•]. SONAR has been successfully applied to predict a large number of RBPs in several model organisms [36•].
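The idea that RBPs tend to interact with one another can be sketched with a single neighborhood feature: the fraction of a protein's interaction partners that are already annotated as RBPs. This is a deliberately simplified illustration and not the SONAR implementation, which trains a support vector machine on interactome-derived features; the function names, the toy network and the protein identifiers are all hypothetical.

```python
def rbp_neighbor_fraction(protein, ppi, known_rbps):
    """Fraction of a protein's direct interaction partners that are known RBPs."""
    neighbors = ppi.get(protein, set()) - {protein}
    if not neighbors:
        return 0.0
    return len(neighbors & known_rbps) / len(neighbors)


def rank_candidates(ppi, known_rbps):
    """Rank proteins not yet annotated as RBPs by their RBP-neighbor fraction."""
    candidates = [p for p in ppi if p not in known_rbps]
    scored = [(rbp_neighbor_fraction(p, ppi, known_rbps), p) for p in candidates]
    return sorted(scored, reverse=True)


# Toy undirected interaction network (hypothetical protein names).
ppi = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R1", "X1", "X2", "X3"},
    "R1": {"P1", "P2"},
    "R2": {"P1"},
    "R3": {"P1"},
    "X1": {"P2"},
    "X2": {"P2"},
    "X3": {"P2"},
}
known_rbps = {"R1", "R2", "R3"}
ranking = rank_candidates(ppi, known_rbps)
```

Here `P1`, whose partners are all known RBPs, ranks above `P2`, which has only one RBP partner among four. A feature of this kind requires no sequence or structural information, but it inherits the coverage and biases of the underlying interaction data.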
Although considerable progress has been achieved by the development of powerful machine learning algorithms, current RBP predictors misannotate a significant fraction of the experimentally detected proteins (presumably false negatives, FNs). On the other hand, many proteins predicted to bind RNA are not detected by existing experimental methodologies (presumably false positives, FPs). The relatively high number of misclassified proteins (FNs and FPs) is likely due to the complexity of the computational problem and the fact that we still lack a complete understanding of the physicochemical, molecular and structural properties that endow proteins with RNA-binding capacity. Nonetheless, the discrepancy between the computational predictions and the experimental results can also be explained by shortcomings in existing technologies that are not tailored to detect the various types of RBP populations.
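The comparison between predicted and experimentally detected RBPs can be expressed as simple set arithmetic. The sketch below is illustrative only; the function name and the protein identifiers are hypothetical, and treating the experimental set as ground truth is itself an assumption, since, as noted above, the screens miss some RBP populations.

```python
def compare_prediction_to_screen(predicted, detected):
    """Compare a predicted RBP set with an experimentally detected set."""
    predicted, detected = set(predicted), set(detected)
    true_positives = predicted & detected
    false_positives = predicted - detected   # predicted, not detected (presumed FPs)
    false_negatives = detected - predicted   # detected, not predicted (presumed FNs)
    sensitivity = len(true_positives) / len(detected) if detected else 0.0
    precision = len(true_positives) / len(predicted) if predicted else 0.0
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Toy example with made-up protein identifiers.
result = compare_prediction_to_screen(
    predicted={"A", "B", "C", "D"},
    detected={"B", "C", "E"},
)
```

In this toy case two of the three detected proteins are recovered, and half of the predictions are unsupported by the screen. Whether the unsupported predictions are genuine errors or RBPs that the experiment could not capture cannot be decided from these counts alone.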
Predicting RNA-binding sites in proteins and their interactions with RNAs
In parallel to the development of computational tools for RBP prediction, dedicated sequence-based and structure-based algorithms were designed to map the RNA-interacting surface of proteins and to predict the RNA-binding residues that mediate protein-RNA interactions (reviewed in [37••]). In a comprehensive survey, Miao and Westhof [38••] assessed a large number of computational programs for the prediction of nucleic-acid binding sites (implemented in web servers and standalone programs), providing an extensive view of the existing tools for predicting RNA-binding and DNA-binding residues. The results of the survey reinforce the difficulty in uniquely identifying the RNA-interacting regions in proteins, as the high similarity in the electrostatic properties renders RNA and DNA binding surfaces practically indistinguishable. Although differences in the shape and geometry of DNA and RNA binding interfaces (explicitly between interfaces that bind ssRNA and dsDNA) have been previously described [39,40], RNA and DNA binding residues have highly similar features, leading to many false predictions [38••]. To address this, DRNApred was recently introduced, employing a two-layer architecture to reduce cross-predictions of nucleic-acid binding residues [41].
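Cross-prediction between the two nucleic-acid types can be quantified per residue. The sketch below assumes one plausible definition, the fraction of residues of one true class that are predicted as the other class, and simplifies by giving each residue a single label; the label strings are invented and the function is not taken from DRNApred or any other tool.

```python
def cross_prediction_rate(true_labels, predicted_labels, true_class, predicted_class):
    """Fraction of residues of `true_class` that are predicted as `predicted_class`."""
    relevant = [p for t, p in zip(true_labels, predicted_labels) if t == true_class]
    if not relevant:
        return 0.0
    return sum(1 for p in relevant if p == predicted_class) / len(relevant)


# Toy per-residue annotation: "D" = DNA-binding, "R" = RNA-binding, "-" = neither.
true_labels = list("DDDDRRRR----")
predicted_labels = list("DDRRRRRD-R--")

dna_as_rna = cross_prediction_rate(true_labels, predicted_labels, "D", "R")
rna_as_dna = cross_prediction_rate(true_labels, predicted_labels, "R", "D")
```

In the toy annotation, half of the DNA-binding residues are predicted as RNA-binding and a quarter of the RNA-binding residues are predicted as DNA-binding. Residues that bind both nucleic acids would need a multi-label representation, which this sketch does not cover.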
As discussed above, many RBPs do not possess a classical RBD or well-defined RNA-binding features. Given that most methods that predict RNA-binding interfaces consider evolutionary information, they are less suitable for predicting the location of RNA-binding sites in non-classical RBPs/RBDs. A well-studied example of a non-classical RBP is the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase (GAPDH). Early studies have demonstrated that GAPDH binds tRNAs [42]. GAPDH was further suggested to bind AU-rich elements in the 3′UTR of interferon-gamma (IFNγ) mRNA, possibly regulating its turnover and translation [43]. While most computational methods fail to predict GAPDH as an RBP, algorithms that rely on physicochemical features for predicting RNA-binding interfaces, such as OPRA (Optimal Protein-RNA Area) [44] and BindUP [34•], point to a similar region on the surface of the homodimer as likely to bind RNA (Figure 3a and b, respectively). Moreover, as demonstrated in the figure, methods for predicting RNA-binding residues, such as aaRNA [45], successfully identify specific residues that were suggested by early biochemical studies to mediate RNA recognition [43] (Figure 3c).
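The notion of a largest positively charged surface patch can be approximated as a graph problem: treat positively charged surface residues as nodes, connect residues that are spatially adjacent, and take the largest connected component. The sketch below is a simplification under stated assumptions and not the BindUP or OPRA algorithm: it ignores the actual electrostatic potential, counts only Lys and Arg as positive, and uses a made-up contact map.

```python
from collections import deque

POSITIVE = {"K", "R"}  # assumption: only Lys and Arg are counted as positive


def largest_positive_patch(residues, contacts):
    """Return the largest connected set of positive surface residues.

    residues: dict mapping residue id -> one-letter amino acid code
    contacts: dict mapping residue id -> set of spatially adjacent residue ids
    """
    positive = {r for r, aa in residues.items() if aa in POSITIVE}
    seen, best = set(), set()
    for start in sorted(positive):
        if start in seen:
            continue
        # Breadth-first search restricted to positive residues.
        patch, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            for neighbor in contacts.get(current, set()):
                if neighbor in positive and neighbor not in seen:
                    seen.add(neighbor)
                    patch.add(neighbor)
                    queue.append(neighbor)
        if len(patch) > len(best):
            best = patch
    return best


# Toy surface: residue 3 (Glu) separates two positive clusters.
residues = {1: "K", 2: "R", 3: "E", 4: "K", 5: "R", 6: "K", 7: "A"}
contacts = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}, 7: set()}
patch = largest_positive_patch(residues, contacts)
```

On the toy surface the function returns residues 4, 5 and 6. A patch found this way cannot by itself distinguish an RNA-binding from a DNA-binding surface, in line with the electrostatic similarity discussed above.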
Figure 3. Computational approaches predict the RNA-binding interfaces and RNA-binding sites on the solved structure of the non-classical RBP GAPDH (PDB 5c7i). (a) RNA-binding propensity calculated by OPRA [44], predicting the RNA-binding interface on the GAPDH homodimer. (b) The largest electrostatic patch calculated by BindUP [34•] corresponds well to the predicted RNA-binding interface on the GAPDH homodimer. (c) The predicted RNA-binding residues on GAPDH (predicted by aaRNA [45]) point to the NAD+ binding site, previously suggested to be involved in RNA-binding [43]. (d) The 3′UTR of IFNγ (ENST00000229135, positions 659–700) docked to GAPDH by HDOCK [47]. Intriguingly, the region predicted by the docking algorithm completely overlaps with the unified RNA-binding interface as calculated by OPRA and BindUP (colored in purple). (e) A view of the predicted interactions between the AUUUA region (red) on the IFNγ 3′UTR and the predicted RNA-binding residues on GAPDH. Color legends: RNA binding-propensity, from most favorable to least favorable (high to low); predicted RNA-binding patch. Molecular graphics and analyses were performed with the UCSF Chimera package [50].

Docking RNAs to classical and non-classical RBPs
Predicting RBPs and RNA-binding sites is an immense computational challenge. The hardest and most rewarding task would be to capture the specific interactions between the protein and RNA within the RNP complex. In contrast to the advances achieved in modeling protein–protein interactions, the prediction of protein-RNA interactions is still considered a daunting task [46]. Many methods that were originally designed for docking protein-protein and protein-peptide complexes have also been utilized for protein-RNA docking, for example HDOCK [47]. Recently, the Bujnicki group introduced NPDock, a web server dedicated to protein-nucleic acid docking that combines specific protein-RNA scoring functions with established docking tools [48] (see [49] and [37••] for reviews of methods for protein-RNA docking). Existing tools for docking RNA to proteins are limited by the availability of experimentally determined protein and RNA three-dimensional structures, the ability to correctly model the structure of the individual parts of the complex and the outstanding complexity in modeling the flexibility of the system. Nevertheless, the growing number of RBP structures and protein-RNA complexes in the Protein Data Bank (PDB) contributes to the continuous improvement in the accuracy of the methods. An attempt in this direction is exemplified in Figure 3d and e, presenting a predictive model of the IFNγ 3′UTR docked to the non-classical RBP GAPDH.
Conclusions
Recent experimental technologies have dramatically expanded the compendium of RBPs, contributing to our understanding of the basic principles that govern protein-RNA recognition. Computational work carried out in parallel to the experimental efforts has further advanced the identification and characterization of RBPs, RBDs and their interaction with RNA. To uncover the entire atlas of RBPs, a number of key challenges remain and must be addressed: (a) approximately 50% of RBPs lack known RBDs and thus cannot be discovered by inferring homology from known RBPs, (b) classical RBDs have extensively diverged during evolution to the extent that some do not necessarily bind RNA, leading to many false positive predictions, (c) many RBPs may have dual-binding function, acting as moonlighting proteins, hence adding to the difficulty of RBP prediction. These challenges can be partially tackled by new computational algorithms and machine learning techniques that rely on the continuous flow of diverse types of RBPs identified by existing and new experimental technologies. Future advances are needed to fully capture the census of RBPs, to understand their regulatory functions and modes of interactions and to unravel the mechanistic details and dynamics of RNPs.
Funding
This work was supported by The Israel Science Foundation ISF [Grant number 1182/16].
Appendix A. Supplementary data
Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.sbi.2018.08.002.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
1. Singh G, Pratt G, Yeo GW, Moore MJ: The clothes make the mRNA: past and present trends in mRNP fashion. Annu Rev Biochem 2015, 84:325-354.
2. Lunde BM, Moore C, Varani G: RNA-binding proteins: modular design for efficient function. Nat Rev Mol Cell Biol 2007, 8:479-490.
3. Daubner GM, Clery A, Allain FH: RRM-RNA recognition: NMR or crystallography . . . and new findings. Curr Opin Struct Biol 2013, 23:100-108.
4. Clery A, Blatter M, Allain FH: RNA recognition motifs: boring? Not quite. Curr Opin Struct Biol 2008, 18:290-298.
5. Anantharaman V, Koonin EV, Aravind L: Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res 2002, 30:1427-1464.
6. Gerstberger S, Hafner M, Tuschl T: A census of human RNA-binding proteins. Nat Rev Genet 2014, 15:829-845.
7.•• Hentze MW, Castello A, Schwarzl T, Preiss T: A brave new world of RNA-binding proteins. Nat Rev Mol Cell Biol 2018, 19:327-341.
This is a recent comprehensive review discussing the state-of-the-art high-throughput experimental technologies for discovering RNA-binding proteins and the main expected and unexpected discoveries arising from these experiments.
8. Baltz AG, Munschauer M, Schwanhausser B, Vasile A, Murakawa Y, Schueler M, Youngs N, Penfold-Brown D, Drew K, Milek M et al.: The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol Cell 2012, 46:674-690.
9. Castello A, Fischer B, Eichelbaum K, Horos R, Beckmann BM, Strein C, Davey NE, Humphreys DT, Preiss T, Steinmetz LM et al.: Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 2012, 149:1393-1406.
10. Kramer K, Sachsenberg T: Photo-cross-linking and high-resolution mass spectrometry for assignment of RNA-binding sites in RNA-binding proteins. Nat Methods 2014, 11:1064-1070.
11. Castello A, Frese CK, Fischer B: Identification of RNA-binding domains of RNA-binding proteins in cultured cells on a system-wide scale with RBDmap. Nat Protoc 2017, 12:2447-2464.
12. He C, Sidoli S, Warneford-Thomson R, Tatomer DC, Wilusz JE, Garcia BA, Bonasio R: High-resolution mapping of RNA-binding regions in the nuclear proteome of embryonic stem cells. Mol Cell 2016, 64:416-430.
13. Bao X, Guo X, Yin M, Tariq M, Lai Y, Kanwal S, Zhou J, Li N, Lv Y, Pulido-Quetglas C et al.: Capturing the interactome of newly transcribed RNA. Nat Methods 2018, 15:213-220.
14.• Huang R, Han M, Meng L, Chen X: Transcriptome-wide discovery of coding and noncoding RNA-binding proteins. Proc Natl Acad Sci USA 2018, 115:e3879-e3887.
This paper presents a novel experimental methodology to capture RNA-binding proteins based on click chemistry. The method overcomes the major limitation of current technologies that capture mainly RNPs with polyadenylated RNAs.
15. He B, Wang K, Liu Y, Xue B, Uversky VN, Dunker AK: Predicting intrinsic disorder in proteins: an overview. Cell Res 2009, 19:929-949.
16.• Calabretta S, Richard S: Emerging roles of disordered sequences in RNA-binding proteins. Trends Biochem Sci 2015, 40:662-672.
This review discusses the unique properties of RNA-binding proteins possessing intrinsically disordered features and the role of disordered sequences in the function of RNA-binding proteins and the formation of RNPs.
17. Vuzman D, Levy Y: Intrinsically disordered regions as affinity tuners in protein-DNA interactions. Mol Biosyst 2012, 8:47-57.
18. Chen Y, Varani G: Protein families and RNA recognition. FEBS J 2005, 272:2088-2097.
19. Nicastro G, Taylor IA, Ramos A: KH-RNA interactions: back in the groove. Curr Opin Struct Biol 2015, 30:63-70.
20. Maris C, Dominguez C, Allain FH: The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression. FEBS J 2005, 272:2118-2131.
21. Fribourg S, Gatfield D, Izaurralde E, Conti E: A novel mode of RBD-protein recognition in the Y14-Mago complex. Nat Struct Biol 2003, 10:433-439.
22. Beckmann BM, Horos R, Fischer B, Castello A, Eichelbaum K, Alleaume AM, Schwarzl T, Curk T, Foehr S, Huber W et al.: The RNA-binding proteomes from yeast to man harbour conserved enigmRBPs. Nat Commun 2015, 6:10127.
23. Matia-Gonzalez AM, Laing EE, Gerber AP: Conserved mRNA-binding proteomes in eukaryotic organisms. Nat Struct Mol Biol 2015, 22:1027-1033.
24. Koster T, Marondedze C, Meyer K, Staiger D: RNA-binding proteins revisited – the emerging Arabidopsis mRNA interactome. Trends Plant Sci 2017, 22:512-526.
25. Hentze MW, Argos P: Homology between IRE-BP, a regulatory RNA-binding protein, aconitase, and isopropylmalate isomerase. Nucleic Acids Res 1991, 19:1739-1740.
26. Conrad T, Albrecht AS, de Melo Costa VR, Sauer S, Meierhofer D, Orom UA: Serial interactome capture of the human cell nucleus. Nat Commun 2016, 7:11212.
27. Hudson WH, Ortlund EA: The structure, function and evolution of proteins that bind DNA and RNA. Nat Rev Mol Cell Biol 2014, 15:749-760.
28. Jeffery CJ: Moonlighting proteins – an update. Mol Biosyst 2009, 5:345-350.
29. Zhang X, Liu S: RBPPred: predicting RNA-binding proteins from sequence using SVM. Bioinformatics 2017, 33:854-862.
30. Ahmad S, Sarai A: Analysis of electric moments of RNA-binding proteins: implications for mechanism and prediction. BMC Struct Biol 2011, 11:8.
31. Shazman S, Mandel-Gutfreund Y: Classifying RNA-binding proteins based on electrostatic properties. PLoS Comput Biol 2008, 4:e1000146.
32. Zhao H, Yang Y, Zhou Y: Structure-based prediction of RNA-binding domains and RNA-binding sites and application to structural genomics targets. Nucleic Acids Res 2011, 39:3017-3025.
33. Yang YU, Zhao H, Wang J, Zhou Y: SPOT-Seq-RNA: predicting protein-RNA complex structure and RNA-binding function by fold recognition and binding affinity prediction. Methods Mol Biol 2014, 1137:119-130.
34.• Paz I, Kligun E, Bengad B, Mandel-Gutfreund Y: BindUP: a web server for non-homology-based prediction of DNA and RNA binding proteins. Nucleic Acids Res 2016, 44:W568-574.
This paper presents a web server for classifying domains as nucleic-acid or non-nucleic-acid-binding, based solely on the structural and physicochemical properties of domains from solved structures or structural models. In addition to the binding classification, the program displays the largest electrostatic patch on the domains, representing the RNA or DNA binding interface.
35. Chowdhury S, Zhang J, Kurgan L: In silico prediction and validation of novel RNA binding proteins and residues in the human proteome. Proteomics 2018:e1800064 http://dx.doi.org/10.1002/pmic.201800064.
36.• Brannan KW, Jin W, Huelga SC, Banks CA, Gilmore JM, Florens L, Washburn MP, Van Nostrand EL, Pratt GA, Schwinn MK et al.: SONAR discovers RNA-binding proteins from analysis of large-scale protein–protein interactomes. Mol Cell 2016, 64:282-293.
This paper presents a unique computational approach for predicting novel RNA-binding proteins relying on information from protein-protein interactions learned from large-scale mass spectrometry data. The method was successfully applied to predict RNA-binding proteins identified in RNA interactome experiments, without relying on sequence or structural data.
37.•• Jones S: Protein-RNA interactions: structural biology and computational modeling techniques. Biophys Rev 2016, 8:359-367.
This is the most updated comprehensive review of computational methods for predicting RNA-binding proteins and RNA-interaction sites on proteins and of recent methods for docking RNAs to proteins. The review also provides an overview on structural technologies for solving RNPs and the available structural data for RNA-binding proteins and Protein-RNA complexes in the Protein Data Bank.
38.•• Miao Z, Westhof E: A large-scale assessment of nucleic acids binding site prediction programs. PLoS Comput Biol 2015, 11:e1004639.
In this study, the authors benchmarked over 20 computational programs for predicting nucleic-acid-binding sites in proteins. In their study, they tested the programs on 41 different datasets and analyzed their performance. The paper also discusses the advantages and pitfalls of different computational methods.
39. Shazman S, Elber G, Mandel-Gutfreund Y: From face to interface recognition: a differential geometric approach to distinguish DNA from RNA binding surfaces. Nucleic Acids Res 2011, 39:7390-7399.
40. Sonavane S, Chakrabarti P: Cavities in protein–DNA and protein–RNA interfaces. Nucleic Acids Res 2009, 37:4613-4620.
41. Yan J, Kurgan L: DRNApred, fast sequence-based method that accurately predicts and discriminates DNA- and RNA-binding residues. Nucleic Acids Res 2017, 45:e84.
42. Singh R, Green MR: Sequence-specific binding of transfer RNA by glyceraldehyde-3-phosphate dehydrogenase. Science 1993, 259:365-368.
43. Nagy E, Rigby WF: Glyceraldehyde-3-phosphate dehydrogenase selectively binds AU-rich RNA in the NAD(+)-binding region (Rossmann fold). J Biol Chem 1995, 270:2755-2763.
44. Perez-Cano L, Fernandez-Recio J: Optimal protein-RNA area, OPRA: a propensity-based method to identify RNA-binding sites on proteins. Proteins 2010, 78:25-35.
45. Li S, Yamashita K, Amada KM, Standley DM: Quantifying sequence and structural features of protein–RNA interactions. Nucleic Acids Res 2014, 42:10086-10098.
46. Madan B, Kasprzak JM, Tuszynska I, Magnus M, Szczepaniak K, Dawson WK, Bujnicki JM: Modeling of protein–RNA complex structures using computational docking methods. Methods Mol Biol 2016, 1414:353-372.
47. Yan Y, Zhang D, Zhou P, Li B, Huang SY: HDOCK: a web server for protein–protein and protein–DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res 2017, 45:W365-W373.
48. Tuszynska I, Magnus M, Jonak K, Dawson W, Bujnicki JM: NPDock: a web server for protein-nucleic acid docking. Nucleic Acids Res 2015, 43:W425-430.
49. Tuszynska I, Matelska D, Magnus M, Chojnowski G, Kasprzak JM, Kozlowski LP, Dunin-Horkawicz S, Bujnicki JM: Computational modeling of protein-RNA complex structures. Methods 2014, 65:310-319.
50. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera – a visualization system for exploratory research and analysis. J Comput Chem 2004, 25:1605-1612.