47
Title: High resolution ensemble description of metamorphic and intrinsically disordered proteins using an efficient hybrid parallel tempering scheme Authors: Rajeswari Appadurai 1 , Jayashree Nagesh 2 and Anand Srivastava 1* Affiliations: 1 Molecular Biophysics Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India Email: [email protected] ORCID Id: 0000-0002-2757-1511 Running title: Conformational sampling of proteins with funneled, multi-funneled and weakly multi-funneled free-energy landscapes Keywords: Intrinsically disordered proteins | metamorphic proteins | conformational ensemble | integrative modeling | parallel tempering Author Contributions: AR and AS conceived and designed the research; AR performed research; JN provided computational resources; AR, JN, and AS analyzed data; and AR and AS wrote the paper * Corresponding author: Anand Srivastava, [email protected] . CC-BY-NC-ND 4.0 International license available under a was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint (which this version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628 doi: bioRxiv preprint

High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Title: High resolution ensemble description of metamorphic and intrinsically disordered proteins using an efficient hybrid parallel tempering scheme

Authors: Rajeswari Appadurai1, Jayashree Nagesh2 and Anand Srivastava1* Affiliations: 1Molecular Biophysics Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India Email: [email protected] ORCID Id: 0000-0002-2757-1511 Running title: Conformational sampling of proteins with funneled, multi-funneled and weakly multi-funneled free-energy landscapes Keywords: Intrinsically disordered proteins | metamorphic proteins | conformational ensemble | integrative modeling | parallel tempering

Author Contributions: AR and AS conceived and designed the research; AR performed research; JN provided computational resources; AR, JN, and AS analyzed data; and AR and AS wrote the paper

* Corresponding author: Anand Srivastava, [email protected]

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 2: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Abstract: Determining the conformational ensemble for proteins with multi-funneled complex free-energy landscapes is often not possible with classical structure-biology methods that produce time and ensemble averaged data. With vastly improved force fields and advances in rare-event sampling methods, molecular dynamics (MD) simulations offer a complementary approach towards determining the collection of 3-dimensional structures that proteins can adopt. However, in general, MD simulations need to either impose restraints or reweigh the generated data to match experiments. The limitations extend beyond systems with high free-energy barriers as is the case with metamorphic proteins such as RFA-H. The predicted structures in even weakly-funneled intrinsically disordered proteins (IDPs) such as Histatin-5 (His-5) are too compact relative to experiments. Here, we employ a new computationally-efficient parallel-tempering based advanced-sampling method applicable across proteins with extremely diverse free-energy landscapes. And we show that the calculated ensemble averages match reasonably well with the NMR, SAXS and other biophysical experiments without the need to reweigh. We benchmark our method against standard model systems such as alanine di-peptide, TRP-cage and β-hairpin and demonstrate significant enhancement in the sampling efficiency. The method successfully scales to large metamorphic proteins such as RFA-H and to highly disordered IDPs such as His-5 and produces experimentally-consistent ensemble. By allowing accurate sampling across diverse landscapes, the method enables for ensemble conformational sampling of deep multi-funneled metamorphic proteins as well as highly flexible IDPs with shallow multi-funneled free-energy landscape. Significance/Authors’ Summary: Generating high-resolution ensemble of intrinsically disordered proteins, particularly the highly flexible ones with high-charge and low-hydrophobicity and with shallow multi-funneled free-energy landscape, is a daunting task and often not possible since information from biophysical experiments provide time and ensemble average data at low resolutions. At the other end of the spectrum are the metamorphic proteins with multiple deep funnels and elucidating the structures of the transition intermediates between the fold topologies is a non-trivial exercise. In this work, we propose a new parallel-tempering based advanced-sampling method where the Hamiltonian is designed to allow faster decay of water orientation dynamics, which in turn facilitates accurate and efficient sampling across a wide variety of free-energy landscapes.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 3: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

1. Introduction:

Biomolecules are not static but exhibit time-dependent dynamic motions that are tightly coupled with their functions. Molecular simulations are now increasingly relied upon for visualizing detailed events of such molecular motions at atomic resolutions, which are often not possible using experimental methods. Originally developed in 1950s,1 molecular dynamics (MD) simulation technique has come a long way since its first application on biological molecule in 1977.1,2 With wide-range of applications, MD simulations are now routinely employed in important biomolecular characterization such as stability, flexibility, conformational changes, ligand binding and the effect of perturbations upon mutations, post-translational modifications and environmental changes.3,4 Increased availability of high-performance computers in the last decade further expands its accessibility and allows simulating processes that take longer than microsecond timescales5,6 and involve billions of atoms.7,8 Nonetheless, sampling in classical MD simulations is limited to the local minima of the free energy landscape. Consequently, accessing the complex energy landscapes of proteins is far from trivial for proteins with multi-step folding,9,10 metamorphic proteins,11–13 and intrinsically disordered proteins (IDPs).14–16 Enhanced molecular dynamics simulations such as temperature replica exchange molecular dynamics (TREM),17 are particularly useful in sampling conformations across energy barriers and therefore expediting the observations of rare biomolecular transitions. In replica exchange method, conformations are sampled using multiple replicas simulated at a series of low and high temperatures and are stochastically exchanged at regular intervals to provide an unbiased Boltzmann-weighted ensemble of conformations. However, the number of replicas required for the simulation scales exponentially with the degrees of freedom of the system hampering its application in large biomolecular solutions. Bruce Berne and coworkers developed a powerful alternative called replica exchange with solute tempering (REST)18,19, where they designed the Hamiltonian to partially heat a subset of the system (solutes). The approach reduced the required number of replicas by multiple folds. Such a powerful design of the Hamiltonian has been successfully shown to simulate weak binding of Aβ peptide on a lipid bilayer,20 lateral equilibration of lipids in a bilayer21, and a 90-residue long SH4 unique domain protein.22 While a much wider applications in protein folding is envisioned, however, the method underperforms in proteins where the conformations are separated by large free energy barriers producing poor mixing of replicas between the high temperature regime and low temperature regime. This is likely due to the discrepancy between the energy compensation of hot solute and cold solvent. Inclusion of a subset of random water molecules to the tempering central group improves the mixing but at the expense of poor scaling with the system size.23 More recently, a generalized version of the REST (gREST) that scales the Hamiltonian based on particles as well as energy terms has also been introduced,24 which further reduces the number of replicas and alleviates the scalability issues to a large extent. However, as the extent of tempering is limited to few energy terms, the method requires longer time to converge on the conformational sampling. This suggests that crossing of the free energy barrier is still a bottle neck for this class of methods.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 4: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

In general, parallel tempering methods suffer when the barriers between folded/unfolded states and intermediate states are high or when the transition state is diffusive in nature with the large entropic barrier slowing down conformational exchanges.25–27 The unrealistically reduced residence time in the metastable states prevents switches between conformational basins. In these cases, friction between the polypeptide chain and solvent modulates the speed of protein folding or unfolding.28 In this work, we propose a hybrid replica exchange method (termed hereafter as Replica exchange with hybrid tempering, REHT) that differentially and optimally heats up both the solute and solvent. In other words, along with the Hamiltonian scaling of each replica that effectively heats up the protein solute, the replicas are also coupled to different high temperature baths that heat up the system including solvent particles. The chosen bath temperature range of the replicas is small enough such that the energy difference due to solvent self-interaction is minimal and does not lead to the scalability issue. At the same time, the optimal tempering of the solvent along with the solute ensures an efficient rewiring of the hydration shell that work in cohort with that of protein conformational change, thereby helping in crossing larger barriers and in particular the entropic barriers. We demonstrate this by applying the protocol on a diverse set of proteins that differ widely on their size and complexity of the underlying free energy landscape. The choice of protein molecules ranges from simple model system such as alanine dipeptide to fast folders such as TRP-Cage and β-hairpin and proteins with complex energy landscapes such as IDPs (example: Histatin-5) and metamorphic proteins (example: RFA-H).

2. Results and discussion:

We explore the multidimensional free energy landscapes of diversely complex proteins using the newly designed REHT method and compare its efficiency with that of the state-of-the-art REST219 simulations. Towards this, we exploited the HREX module of PLUMED, originally developed for performing the Hamiltonian replica exchange simulations.29 The module is very flexible and allows for simultaneous use of different bias in the replicas such as the Hamiltonian, collective variable, temperature and pressure. For REHT method, we include the additional temperature bias in the replicas along with the Hamiltonian scaling of the protein solute and for which we derive and apply the corresponding detailed balance exchange criteria. The derivation of the exchange criteria, honoring detailed balance, is provided in the supporting information (SI). In short, the exchange criteria for REHT is given by:

∆𝒏𝒎(𝑅𝐸𝐻𝑇) = −,(𝛽.λ0 − 𝛽1λ2)3𝐻44(𝑋.) − 𝐻44(𝑋1)6

+ 8𝛽.9λ0 − 𝛽19λ2:3𝐻4;(𝑋.) − 𝐻4;(𝑋1)6

+ (𝛽. − 𝛽1)[𝐻;;(𝑋.) − 𝐻;;(𝑋1)]>

where, 𝐻44(𝑋1) and 𝐻44(𝑋.) indicates the intramolecular energy of protein solute in mth and nth replicas, respectively. 𝐻4;(𝑋1) and 𝐻4;(𝑋.) are interaction energies between protein and water

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 5: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

sites in the two replicas. Similarly, 𝐻;;(𝑋1) and 𝐻;;(𝑋.) indicates the water-water interactions in the corresponding replicas. 𝛽1and𝛽. are inverse temperatures and λ2 , λ0 are Hamiltonian scaling of replica m and n, respectively.

Though the methodological advancement of REHT has its origin in REST2, the new method has magnificent impact on the sampling efficiency as will be demonstrated here in this section. The efficiency is not just shown on simple model proteins but also on proteins such as intrinsically disordered Histatin-5 and metamorphic RFA-H proteins. The different systems considered in this paper are shown in Fig. 1. The computational details of all the systems are presented in Table S1.

2.1 Modulated tempering of water self-interaction promotes conformational search

We first test the sampling efficiency of our parallel tempering method on well-known model systems alanine-dipeptide, TRP-cage and β-hairpin. Though TRP-cage and β-hairpin are widely recognized with two-state folding kinetics,30 several experimental and simulation studies support the presence of intermediate states indicating the complex nature of folding energy landscapes of these fast-folders.31,32 For our simulations, the completely extended starting conformations of physiologically relevant zwitterionic state of TRP-cage33 (with charged termini) and β-hairpin were obtained by simulating them at 600 K in explicit water for 10 ns timescale. For comparison against the state-of-the-art in MD-based sampling, the simulations are also performed with REST2.18,19,22 Replica mixing of two simulations in both the systems are shown in Fig. S1 and S2. During the course of the simulations, both the REHT and REST2 methods drive the transition of the unfolded starting conformation to a native folded conformation with almost a perfect match with the experimentally found native structure (0.4 Å RMSD) as shown in Fig. 2A and Fig. S3A. However, we observe faster transition with REHT between the folded and unfolded basins. The time evolution of RMSD indicates REHT method samples the folded structures of the two proteins in less than 100 ns timescales (Fig. 2B) and produces the folded state in 5 out of 12 replicas (Fig. S3B, S4). The REST2 simulations sample the folded structure of TRP-cage (Fig. 2B) and β-hairpin (Ref24) at around 300 ns. Moreover, only 1-2 replicas out of 8 replicas are independently folded in these simulations (Fig. S5 and Ref24). Free energy landscapes, as functions RMSD and radius of gyration, of the conformations sampled by the lowest temperature replica of TRP-cage simulations are presented in for REHT and REST2 in Fig. 2C and Fig. 2D, respectively. The predicted free energy barrier by the REHT method (~2 Kcal/mol) matches closely with the suggested free energy barrier of ~ 2.1 Kcal/mol,30,34 while a larger barrier of about ~ 6 Kcal/mol is observed in REST2 simulations. Further, the time evolutions of the one-dimensional free energy landscape along a single reaction coordinate, RMSD, (Fig S6) reveals that the REHT converges much faster than the REST2. The origin of the difference lies in the way water self-interaction is treated in REHT and REST2. In Fig. 3, we present the water reorientation dynamics and H-bond dynamics in REST2 and REHT

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 6: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

simulations. Fig. 3A clearly shows that the water reorientation is much faster in REHT method than the REST2 method. This difference is observed in almost all hydration shells and particularly evident in the first hydration shell. By optimally heating up the water, our method improves the orientation dynamics of water at the hydration layer. Consequently, this leads to the rewiring of H-bonds between protein and water measured here as an autocorrelation function of H-bond lifetime (Fig 3B). As illustrated, the H-bonds decay much faster in the new method in comparison to the REST2 method. The faster decay in the water orientation and H-bond allows for efficient rewiring of proteins allowing for the faster and more faithful conformational transitions.35 This expedited sampling of the “slow” reaction coordinate (collective variable) is also visible in alanine dipeptide, the quintessential model system for barrier crossing events in biophysical advanced sampling methods development field. We discuss this in detail in SI under the subheading “Dihedral switch in Alanine dipeptide”. In short, we show that relatively slow transitioning ϕ-angle of the Ramachandran map36,37 is frequently sampled in the REHT method (Fig. S7-S8) with a larger number of replicas (4 out of 5) exhibiting this transition as well. 2.2 Ensemble sampling of flexible IDPs with high charge and low hydrophobicity Accurate conformational sampling of IDPs is a major challenge in the field of molecular simulations where conventional forcefields do not seem to accurately reproduce the properties of IDPs. Several solutions towards fixing the forcefield transferability issues have been suggested and applied extensively to IDPs.38–43 What stands out in these improvements is the necessity to have a balanced protein-water interactions besides the required changes in the parameters of proteins.44–46Also, rare-event sampling methods are frequently used in conjunction with these improved force-fields to faithfully capture the conformational landscape of IDPs as shown for P53,47 alpha synuclein,48 islet amyloid polypeptide,49 amyloid beta,50 and NCBD IDPs.51 Recently, REST2 in combination with the IDP-specific forcefield (Amberff03ws) and water model (TIP4P/2005s) was used to successfully yield experimentally consistent ensemble in SH4UD.22 Despite these successes, there are set of IDPs on which the simulations data show striking deviations from the experimental results. For instance, in a 24-residue long antimicrobial peptide Histatin-5 (His-5), the simulation-generated ensemble deviates substantially from the experimental data from CD, NMR and SAXS measurements.52,53 His-5 fails to be accurately sampled even with the state-of-the-art forcefields52,53 suggesting the lack of transferability across IDPs with diverse range of charge and hydropathy characteristics. We illustrate the charge-hydropathy plot of various IDPs in Fig 4A. The plot clearly shows that the successfully characterized proteins (p53, AB42, NCBD, SH4UD) fall at or near the line separating the folded and unfolded region indicating the pre-molten globular nature of these IDPs. His-5 on the other end is located at the far end of charge-hydropathy plot with very minimal hydrophobicity and higher charges.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 7: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

We sampled the conformational landscape of His-5 using REHT and REST2 (see Fig. S9 for mixing of replicas) and verified the accuracy of the generated ensemble against various experimental ensemble average properties. To start with, we evaluated the secondary structure content of this peptide by analyzing the backbone dihedrals of all the residues. Circular dichroism data54 reveals that His-5 prefers to form polyproline II (PPII) structures and the PPII propensity is lost at higher temperatures. Fig. S10 in SI shows the Ramachandran map for His-5 indicating the most probable occurrence of PPII structures, which is generally not recapitulated in the conventional IDP sampling methods.54 Interestingly, enhancing the sampling either by REHT or REST2 recapitulates the propensity of PPII structures of His-5 in a qualitative agreement with the CD measurements. Furthermore, the temperature dependent loss of PPII structure is also captured correctly (Fig S10). We also calculate the nanoscale structural properties of His-5, often measured experimentally by means of small angle X-ray scattering (SAXS) methods that describes the overall size and shape of a protein in solution. The experimental SAXS data55 for His-5 at room temperature and at neutral pH was obtained from Prof. Marie Skepo’s laboratory in Sweden. The results are depicted in Fig. 4A as dimensionless Kratky representation, that qualitatively assess the compactness and flexibility of protein. This can be obtained from the form factor using the following equation: (𝑞𝑅𝑔)E𝐼 (𝑞) 𝐼⁄ (0) and plotted against 𝑞𝑅𝑔 . In case of well folded proteins, the Kratky plot exhibits a bell-shaped peak at low-q regime and converges to the q-axis at high-q regime, whereas in IDPs it shows monotonic increase in intensity at the high-q region.56,57 As shown in the figure, the Kratky plot of His-5 obtained from the experiment exhibits the signature of highly flexible and extended IDP. The SAXS profile from REHT ensemble data (Fig. 4B) matches closely with that of the experiment thereby reinforcing its fidelity in constructing the IDP ensemble accurately. We believe that conformations of IDPs that lie significantly away from the CH border area are far more affected by the hydration-induced stability. In such cases, REST2 is not suitable since it does not treat the surrounding water and hence produces a more compact protein configurational ensemble as evident from both the SAXS profile and Rg plot (Fig 4B:inset). We also compare the chemical shifts of hydrogen atoms linked to the Cα predicted from the REHT and REST2 methods with that of the NMR chemical shifts data (Fig. 4C). The chemical shifts predicted from REHT matches better with the experiments than the REST2 method. After verifying the REHT-generated ensemble with the experimental data from CD, SAXS and NMR measurements, we reconstitute the free energy map of His-5 at the room temperature as a function of various structural parameters (Fig. 4D, Fig S11-S12). The free energy map indicates rather a flat-bottom low-energy landscape across a wide range of Rg (from 1.0 nm to 1.8 nm) and SASA values (28 nm2 to 38 nm2). The results indicate that under solution conditions, the peptide prefers to exist in a completely disordered conformation. Such a large disorderliness endows them with an ability to bind diverse targets as confirmed from the proteomic analysis.58 We also traced for the residual alpha helical propensity as observed when the peptide approaches the biological

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 8: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

membrane during its candidacidal functioning.59 However, we did not observe any trace of helicity in the solution ensemble (Fig. S13), suggesting the helical transition may be adapted in an induced fit manner upon association with the membrane. 2.3 Deciphering the transition intermediates of metamorphic protein (RFA-H)

Unlike the weakly-multi-funneled landscapes generally found in IDPs, metamorphic proteins have deep multi-funnels with each low energy basin representing a distinct well-folded conformation. These conformations switch upon being signaled to perform different functions. Among the many members of metamorphic proteins known till date, RFA-H possesses the highly divergent folds with completely different secondary structures. RFA-H is a bacterial regulatory protein, in which its C-terminal domain alternates between all α-helical and all β-sheet conformations based on presence or absence of inter-domain contacts respectively (Fig 5A).60 While in helical conformation, it activates the transcriptional elements, in beta sheet conformation it recruits ribosomal protein and thereby couples to translation.61,62 Transition between these alternate conformations generally involve unfolding from one conformation and refolding to other conformation.63 Exploring these protein landscapes by the computational MD simulations is highly challenging and rarely attempted utilizing biasing techniques such as targeted MD or assisted GO model.64–66 We set to explore the conformational metamorphosis of RFA-H in explicit water using our unbiased REHT simulation. The same was attempted using REST2 simulation as well, which did not yield proper mixing of replicas even at 250 ns of simulation time per replica (Fig S14) and hence was dropped. REHT started showing mixing between replicas by around 100 ns and the runs were extended to 1000 ns/replica for the 25 replicas. Experimentally obtained α-helical state (PDB ID: 2OUG) of the C-terminal domain (residue number 115-162) was used as starting structure (Fig. 5B). Unlike the previous studies using targeted MD64,66 or assisted GO model with replica exchange tunneling,65 the REHT method doesn’t embed in itself any information of the target state. In spite of that, the simulations spontaneously transform the RFA-HCTD from helical structure to the beta barrel conformation. Secondary structure analysis of the ensemble generated at the lowest temperature (310K) replica reveal that this transformation occurs within 300 ns timescale (Fig. S15). The residues that were shown to form beta sheets agree well with the NMR-generated structure (PDB ID: 2LCL). The structural overlay of a snapshot obtained from the simulation on to the NMR structure is shown in Fig. 5B and their root mean squared deviation is calculated to be 3.4 Å. It should be appropriate to note here that the previous unbiased MD simulation, with implicit solvent, fails to produce the beta barrel spontaneously. Additional refinement with explicit solvent simulation was required to arrive at the barrel structure with ~5 Å RMSD from the experimental structure.67 To further validate our results, we predicted the backbone chemical shifts of the simulation-generated ensemble obtained from the room temperature replica. The last 500 ns timeframes without any reweighting or constricted selection was used for this analysis. The predicted Ca and

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 9: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Cb chemical shifts were compared with those measured from NMR spectroscopy (Fig 5C). The figure clearly demonstrates a striking agreement between the predicted and measured chemical shift values. Following the experimental validation, we traced for the molecular details of the fold switching mechanism from our atomic-resolution trajectory, the information that cannot be obtained directly from the experiments. Towards this, we plotted the free energy map as a function of RMSDs with respect to the experimentally found helical and beta-barrel structures (Fig 5D). The map reveals dual-basins, each corresponding to the different folds of the RFA-H, where the basins are separated by a free energy barrier of about 5 Kcal/mol. This barrier corresponds to the a→b transition while not favoring the reverse transition. This argument is based on our observation with increasing simulation time, the population of b conformation shows an increase suggesting a (non-converged) deeper basin for the b structure. (Fig S16) For the reverse transition b→a, experiments indicate that the N-terminal interaction is important.60 To reinforce this statement we simulated additional simulation initiated from the NMR-obtained beta structure, which show stable beta barrel structures (Fig. S17), and is consistent with the NMR observations.60 While the free energy basin corresponding to the a-helical structure exhibits a funnel like architecture, that of the b-barrel structure looks like a rugged landscape indicating possible heterogeneity in that state. The heterogeneity likely builds up the entropic barrier and adds to the complexity of the free energy map and thereby limits the spontaneous transition from one state to the other upon using conventional advanced sampling methods. REHT shows promising results in crossing both the enthalpic and entropic barriers and therefore allows for studying the complex conformational transitions of IDPs and “transformer” proteins such as RFA-H. The method also opens up avenues to study other transformer proteins such as chemokine lymphotactin protein,68 Mad2 spindle checkpoint protein,69 CLIC1.70

3. Materials and Methods:

3.1 Parallel tempering:

For the sake of completion, we provide the theoretical background about the different versions of replica exchange methods in SI under the subheading “Replica exchange simulation methodology”. Moreover, the detailed balance exchange condition for the newly developed REHT method, with a specially designed Hamiltonian that differentially treats solvent and solute across a temperature range, is also derived and presented in SI. 3.2 Systems and simulation details:

3.2.1 List of systems studied and preparation of atomistic model:

Fig. 1 and Table S1 contain information about the diverse sets of proteins used in this work.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 10: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

The initial unfolded structure of Trp-Cage and β-hairpin (C terminal hairpin of B1 domain in protein G) was obtained by simulating the NMR structures (PDB Id: 1l2Y and 1lE3, respectively) at 600K temperature for 10 ns. For Hist-5, the unfolded structure was built using VMD protein builder by feeding in the sequence information. Simulation of RFA-H C-terminal domain was initiated from the α-helical conformation as obtained from the X-ray crystallography (PDB ID:2OUG) used as initial structure for the simulation. All the proteins (system 1-5 in Fig.1) were solvated in a cubic box with a minimum distance of 1.2 nm from the surface of the protein (in case of RFA-H 1.5 nm is used). 3-site rigid TIP3P water model was used commonly for all the systems. The systems were also neutralized with physiological concentration of NaCl (0.15M). For the topological parameters of fast folding proteins, we used Amberff14SB in order to faithfully compare the efficiency with earlier REST2 and gREST simulations,24 whereas for all the rest of the systems (alanine dipeptide, His-5 and RFA-H), Charmm36m39 force field was used. 3.2.2 Details of MD simulation:

The modelled proteins solutions were initially energy minimized using steepest descent algorithm for 50,000 steps to avoid any poor contacts. The minimized structure was then thermalized and equilibrated sequentially in NVT and NPT ensembles each for 2 ns. The protein and the solvent were coupled separately to the target temperatures using modified Berendsen thermostat. The pressure was coupled at 1 bar using Parrinello-Rahman barostat. The final production simulation was performed in NVT ensemble using Nose-Hoover thermostat. A cut-off of 1 nm was used for calculating the electrostatics and VdW interactions. Particle Mesh Ewald was used for long-range electrostatics. Leap-frog integrator with the time step of 2 fs was used to integrate the equations of motions. All the hydrogen atoms were constrained using LINCS algorithm. The simulations were performed using Gromacs-2018.2 patched with Plumed-2.4.1. All the parameter files were uploaded in the GitHub repository (https://github.com/codesrivastavalab/ReplicaExchangeWithHybridTempering) 3.2.3 Replica exchange parameters

For all simulations, we ran both our newly designed REHT scheme as well as the state-of-the-art REST2 method. The two methods differ in the way the protein and water molecules are treated in the Hamiltonian. REST2 tempers only the protein but keeps the water at room temperature by scaling the potential energy function of solute. The bath temperature used in REST2 is constant across all the replicas. Whereas in REHT method the bath temperature is raised mildly up to 340K as we go up on the replica ladder. This sufficiently heats up the water molecules for its efficient dynamics in REHT method. For enhancing the protein dynamics, the potentials of the protein solute are scaled up to a maximum factor (λ) of 0.5 similar to the REST2 method. To ensure the faithful comparison, the overall effective temperature realized on the protein is identical in both the simulations (Table. 1). However, due to the extra degrees of freedom due to treatment of the water, the number of replicas used is moderately higher for the REHT method than for the REST

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 11: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

simulations. More details and the scripts for all the input preparation (including the scaling of Amber14SB and charmm36m forcefields) are given in the github link. (https://github.com/codesrivastavalab/ReplicaExchangeWithHybridTempering) 3.3 Conformational ensemble analysis details:

3.3.1 Theoretical SAXS analysis:

Theoretical SAXS profile for the HIS-5 ensemble was predicted using CRYSOL71 that calculates orientationally averaged scattering pattern using multipole expansion from atomic coordinates while considering the solvation shell using spherical harmonics. The predicted value is compared against the experimental form factor of His-5 obtained at neutral pH from Skepo et.al.55 3.3.2 Theoretical analysis of chemical shifts:

The backbone chemical shifts of RFA-H atomic coordinates were calculated using SPARTA+,72 that works based on artificial neural network. To validate our simulation, the predicted shifts were compared against the experimental NMR chemical shifts as obtained from BMRB database entry 17615.60 3.3.3 Structural analysis:

Other structural parameters such as root mean squared deviation (RMSD), radius of gyration (Rg) and backbone dihedral angles were analyzed using the respective Gromacs utilities. Analysis of protein secondary structures was performed using DSSP tool. VMD was used for visualizing the trajectory and rendering of protein cartoon images. 3.3.4 Analysis of hydration dynamics:

The dynamical properties of the hydration shell were calculated by means of water orientational relaxation and hydrogen bond lifetime decay as implemented in MDAnalysis package.72 Continuous trajectory with respect to simulation time was used for this analysis. The orientational relaxation essentially indicates the rotational freedom of water molecules in a solvation shell. This is measured by computing the second order rotational autocorrelation function of two vectors: one along the OH bond and the other along the dipole moment of water, as given as:𝐶EK̂(𝜏) =N𝑃E8𝑢QR. 𝑢QRTU:V , where, 𝑃E(𝑥) is the second Legendre polynomial and 𝑢 is the unit vector. Similarly, we measure the persistence of interaction between the protein and the water molecules by computing the autocorrelation function of Hydrogen bond lifetime, as follows: 𝐶XY(𝜏) =∑ [\](QR)[\](QRTU)\]

∑ [\](QR)\], where, the H-bond between pair i and j at time t, (ℎ_`(𝑡) is considered to be 1 if

the geometric distance (< 3.5 Å) and angle (>120º) cutoff are met. ℎ_`(𝑡) = 0 otherwise. We considered two types of protein-water H-bonds: ‘continuous H-bond’ when a water molecule is

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 12: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

continuously involved in H-bonding and ‘intermittent H-bond’ when the H-bond is retained with intermittent change of water molecules. 3.3.5 Calculations of Free energy change:

The relative Gibbs free energy of an equilibrium ensemble is computed as a function of two reaction coordinates as follows:

∆𝐺(cd,cE) = −𝐾Y𝑇 𝑙𝑛

i(jk,jl)imno

,

Where 𝐾Y represents the Boltzmann constant, T is the temperature. 𝑃(cd,cE) denotes the probability along the two reaction coordinates, which is calculated using a k-nearest neighbor scheme and 𝑃pqr denotes the maximum probability. The 3-dimensional representation of the free energy surface was plotted using Matlab. 4. Conclusions In this work, we design a new scaled Hamiltonian that differentially treats the protein and solvent interactions as a function of temperature in the replica-exchange framework. Essentially, the design of the Hamiltonian in our method (REHT) allows faster decay of water orientation dynamics, which in turn enhances the conformational sampling for proteins. We find that the accelerated thermodynamic sampling in REHT compensates for the additional computational cost incurred by the moderately higher number of replicas due to inclusion of water interactions in the redesigned Hamiltonian for replica exchanges. The high-resolution structural ensemble for a “variety of proteins” produced through our REHT simulations agree excellently with the ensemble average observables obtained from NMR, SAXS, CD and other biophysical experiments without the need of any reweighting. The method is particularly suited for highly flexible IDPs such as His-5 where solvation dynamics stabilize and drive the coexistence of multiple degenerate extended states of the peptide on a free-energy surface that have multiple shallow basins. Large free energy barriers in metamorphic proteins, with multiple well-defined basins having diversely folded conformations, are usually not easily surmountable through conventional methods. We show that REHT is capable of sampling across the barriers of metamorphic proteins, deduce possible transition intermediates and solve their conformational ensemble. As a future perspective, we believe that when appropriately coupled with the known experimental data, the high-fidelity structural ensemble information from the unconstrained REHT-simulations can be effectively used to sample and if needed further minimize the residual with experimental observables using integrative modeling framework.73–75 5. Acknowledgments: The authors are grateful to the SciNet HPC Consortium, ComputeCanada76,77 for the computational resources and Prof. Marie Skepo for the Histatin-5

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 13: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

SAXS data. A.S. and A.R. thanks Dr. Jagannath Mondal, Dr. Ashok Sekhar and Dr. Debostuti Ghoshdastidar for their valuable comments and suggestions on the manuscript. AR thanks Wellcome Trust – DBT India Alliance for Early Career fellowship. J.N. thanks DST-SERB for funding under Ramanujan Faculty Fellowship. A.S. also thanks IISc-Bangalore and the Ministry of Human Resource Development of India for the startup grant and the Department of Science and Technology of India for the early career grant. This research was also supported by the Department of Biotechnology, Government of India in the form of IISc-DBT partnership program. Support from FIST program sponsored by the Department of Science and Technology and UGC, Centre for Advanced Studies and Ministry of Human Resource Development, India is gratefully acknowledged by the authors.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 14: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

6. References: 1. Alder BJ, Wainwright TE. Phase transition for a hard sphere system. J Chem Phys.

1957;27(5):1208-1209. doi:10.1063/1.1743957 2. McCammon JA, Gelin BR, Karplus M. Dynamics of folded proteins. Nature.

1977;267(5612):585-590. doi:10.1038/267585a0 3. Hollingsworth SA, Dror RO. Molecular Dynamics Simulation for All. Neuron.

2018;99(6):1129-1143. doi:10.1016/j.neuron.2018.08.011 4. Bottaro S, Lindorff-Larsen K. Biophysical experiments and biomolecular simulations: A

perfect match? Science. 2018;361(6400):355-360. doi:10.1126/science.aat4010 5. Voelz VA, Bowman GR, Beauchamp K, Pande VS. Molecular Simulation of ab Initio

Protein Folding for a Millisecond Folder NTL9(1−39). J Am Chem Soc. 2010;132(5):1526-1528. doi:10.1021/ja9090353

6. Lane TJ, Shukla D, Beauchamp KA, Pande VS. To milliseconds and beyond: challenges in the simulation of protein folding. Curr Opin Struct Biol. 2013;23(1):58-65. doi:https://doi.org/10.1016/j.sbi.2012.11.002

7. Durrant JD, Kochanek SE, Casalino L, Ieong PU, Dommer AC, Amaro RE. Mesoscale All-Atom Influenza Virus Simulations Suggest New Substrate Binding Mechanism. ACS Cent Sci. 2020;6(2):189-196. doi:10.1021/acscentsci.9b01071

8. Singharoy A, Maffeo C, Delgado-Magnero KH, et al. Atoms to Phenotypes: Molecular Design Principles of Cellular Energy Metabolism. Cell. 2019;179(5):1098-1111.e23. doi:https://doi.org/10.1016/j.cell.2019.10.021

9. Peter L. Freddolino, Christopher B. Harrison, Yanxin Liu and KS. Challenges in protein folding simulations: Timescale, representation, and analysis. Nat Phys. 2010;6(10):751-758. doi:10.1038/jid.2014.371

10. Veitshans T, Klimov D, Thirumalai D. Protein folding kinetics: Timescales, pathways and energy landscapes in terms of sequence-dependent properties. Fold Des. 1997;2(1):1-22. doi:10.1016/S1359-0278(97)00002-3

11. Onuchic JN, Wolynes PG. Theory of protein folding. Curr Opin Struct Biol. 2004;14(1):70-75. doi:https://doi.org/10.1016/j.sbi.2004.01.009

12. Porter LL, Looger LL. Extant fold-switching proteins are widespread. Proc Natl Acad Sci U S A. 2018;115(23):5968-5973. doi:10.1073/pnas.1800168115

13. Röder K, Joseph JA, Husic BE, Wales DJ. Energy Landscapes for Proteins: From Single Funnels to Multifunctional Systems. Adv Theory Simulations. 2019;2(4):1800175. doi:10.1002/adts.201800175

14. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005;6(3):197-208. doi:10.1038/nrm1589

15. Tompa P, Schad E, Tantos A, Kalmar L. Intrinsically disordered proteins: Emerging interaction specialists. Curr Opin Struct Biol. 2015;35:49-59. doi:10.1016/j.sbi.2015.08.009

16. Uversky VN. Dancing protein clouds: The strange biology and chaotic physics of intrinsically disordered proteins. J Biol Chem. 2016;291(13):6681-6688. doi:10.1074/jbc.R115.685859

17. Sugita Y, Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem phys Lett. 1999;314:296-297. doi:10.1016/S0009-2614(99)01123-9

18. Liu P, Kim B, Friesner R a, Berne BJ. Replica exchange with solute tempering: a method

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 15: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

for sampling biological systems in explicit water. Proc Natl Acad Sci U S A. 2005;102(39):13749-13754. doi:10.1073/pnas.0506346102

19. Wang L, Friesner RA, Berne BJ. Replica exchange with solute scaling: A more efficient version of replica exchange with solute tempering (REST2). J Phys Chem B. 2011;115(30):9431-9438. doi:10.1021/jp204407d

20. Smith AK, Lockhart C, Klimov DK. Does Replica Exchange with Solute Tempering Efficiently Sample Aβ Peptide Conformational Ensembles? J Chem Theory Comput. 2016;12(10):5201-5214. doi:10.1021/acs.jctc.6b00660

21. Huang K, García AE. Acceleration of lateral equilibration in mixed lipid bilayers using replica exchange with solute tempering. J Chem Theory Comput. 2014;10(10):4264-4272. doi:10.1021/ct500305u

22. Shrestha UR, Juneja P, Zhang Q, et al. Generation of the configurational ensemble of an intrinsically disordered protein from unbiased molecular dynamics simulation. Proc Natl Acad Sci. 2019;116(41):20446-20452. doi:10.1073/pnas.1907251116

23. Huang X, Hagen M, Kim B, Friesner RA, Zhou R, Berne BJ. Replica exchange with solute tempering: Efficiency in large scale systems. J Phys Chem B. 2007;111(19):5405-5410. doi:10.1021/jp068826w

24. Kamiya M, Sugita Y. Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J Chem Phys. 2018;149(7). doi:10.1063/1.5016222

25. Nymeyer H. How efficient is replica exchange molecular dynamics? An analytic approach. J Chem Theory Comput. 2008;4(4):626-636. doi:10.1021/ct7003337

26. Lwin TZ, Luo R. Overcoming entropic barrier with coupled sampling at dual resolutions. J Chem Phys. 2005;123(19):194904. doi:10.1063/1.2102871

27. Yu T-Q, Lu J, Abrams CF, Vanden-Eijnden E. Multiscale implementation of infinite-swap replica exchange molecular dynamics. Proc Natl Acad Sci. 2016;113(42):11744-11749. doi:10.1073/pnas.1605089113

28. Pradeep L, Udgaonkar JB. Diffusional Barrier in the Unfolding of a Small Protein. J Mol Biol. 2007;366(3):1016-1028. doi:https://doi.org/10.1016/j.jmb.2006.11.064

29. Bussi G. Hamiltonian replica exchange in GROMACS: A flexible implementation. Mol Phys. 2014;112(3-4):379-384. doi:10.1080/00268976.2013.824126

30. Neidigh JW, Fesinmeyer RM, Andersen NH. Designing a 20-residue protein. Nat Struct Biol. 2002;9(6):425-430. doi:10.1038/nsb798

31. Meuzelaar H, Marino KA, Huerta-Viga A, et al. Folding Dynamics of the Trp-Cage Miniprotein: Evidence for a Native-Like Intermediate from Combined Time-Resolved Vibrational Spectroscopy and Molecular Dynamics Simulations. J Phys Chem B. 2013;117(39):11490-11501. doi:10.1021/jp404714c

32. Zhou R. Trp-cage: Folding free energy landscape in explicit water. Proc Natl Acad Sci. 2003;100(23):13280-13285. doi:10.1073/pnas.2233312100

33. English CA, García AE. Charged Termini on the Trp-Cage Roughen the Folding Energy Landscape. J Phys Chem B. 2015;119(25):7874-7881. doi:10.1021/acs.jpcb.5b02040

34. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science (80- ). 2011;334(6055):517-520. doi:10.1126/science.1208351

35. Schirò G, Fichou Y, Gallat FX, et al. Translational diffusion of hydration water correlates with functional motions in folded and intrinsically disordered proteins. Nat Commun. 2015;6. doi:10.1038/ncomms7490

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 16: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

36. Ravikumar A, Ramakrishnan C, Srinivasan N. Stereochemical Assessment of (φ,ψ) Outliers in Protein Structures Using Bond Geometry-Specific Ramachandran Steric-Maps. Structure. 2019;27(12):1875-1884.e2. doi:https://doi.org/10.1016/j.str.2019.09.009

37. Ramachandran GN, Ramakrishnan C, Sasisekharan V. Stereochemistry of polypeptide chain configurations. J Mol Biol. 1963;7(1):95-99. doi:https://doi.org/10.1016/S0022-2836(63)80023-6

38. Liu H, Song D, Zhang Y, Yang S, Luo R, Chen H-F. Extensive tests and evaluation of the CHARMM36IDPSFF force field for intrinsically disordered proteins and folded proteins. Phys Chem Chem Phys. 2019;21(39):21918-21931. doi:10.1039/C9CP03434J

39. Huang J, Rauscher S, Nawrocki G, et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods. 2017;14(1):71-73. doi:10.1038/nmeth.4067

40. Huang J, MacKerell Jr AD. Force field development and simulations of intrinsically disordered proteins. Curr Opin Struct Biol. 2018;48:40-48. doi:10.1016/j.sbi.2017.10.008

41. Robustelli P, Piana S, Shaw DE. Developing a molecular dynamics force field for both folded and disordered protein states. Proc Natl Acad Sci. 2018;115(21):E4758--E4766. doi:10.1073/pnas.1800690115

42. Latham AP, Zhang B. Maximum Entropy Optimized Force Field for Intrinsically Disordered Proteins. J Chem Theory Comput. 2020;16(1):773-781. doi:10.1021/acs.jctc.9b00932

43. Rauscher S, Gapsys V, Gajda MJ, Zweckstetter M, de Groot BL, Grubmüller H. Structural Ensembles of Intrinsically Disordered Proteins Depend Strongly on Force Field: A Comparison to Experiment. J Chem Theory Comput. 2015;11(11):5513-5524. doi:10.1021/acs.jctc.5b00736

44. Best RB, Zheng W, Mittal J. Balanced Protein − Water Interactions Improve Properties of Disordered Proteins and Non-Speci fi c Protein Association. J Chem Theory Comput. 2014;25(5113-5124). doi:10.1021/ct500569b

45. Best RB, Mittal J. Protein Simulations with an Optimized Water Model: Cooperative Helix Formation and Temperature-Induced Unfolded State Collapse. J Phys Chem B. 2010;114(46):14916-14923. doi:10.1021/jp108618d

46. Song D, Luo R, Chen H-F. The IDP-Specific Force Field ff14IDPSFF Improves the Conformer Sampling of Intrinsically Disordered Proteins. J Chem Inf Model. 2017;57(5):1166-1178. doi:10.1021/acs.jcim.7b00135

47. Mittal J, Yoo TH, Georgiou G, Truskett TM. Structural ensemble of an intrinsically disordered polypeptide. J Phys Chem B. 2013;117(1):118-124. doi:10.1021/jp308984e

48. Wu K-P, Weinstock DS, Narayanan C, Levy RM, Baum J. Structural Reorganization of α-Synuclein at Low pH Observed by NMR and REMD Simulations. J Mol Biol. 2009;391(4):784-796. doi:https://doi.org/10.1016/j.jmb.2009.06.063

49. Zerze GH, Miller CM, Granata D, Mittal J. Free Energy Surface of an Intrinsically Disordered Protein: Comparison between Temperature Replica Exchange Molecular Dynamics and Bias-Exchange Metadynamics. J Chem Theory Comput. 2015;11(6):2776-2782. doi:10.1021/acs.jctc.5b00047

50. Das P, Matysiak S, Mittal J. Looking at the Disordered Proteins through the Computational Microscope. ACS Cent Sci. 2018;4(5):534-542. doi:10.1021/acscentsci.7b00626

51. Knott M, Best RB. A Preformed Binding Interface in the Unbound Ensemble of an Intrinsically Disordered Protein: Evidence from Molecular Simulations. PLOS Comput

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 17: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Biol. 2012;8(7):e1002605. https://doi.org/10.1371/journal.pcbi.1002605. 52. Kjaergaard M, Nørholm A, Hendus-altenburger R, Pedersen SF, Poulsen FM, Kragelund

BB. Temperature-dependent structural changes in intrinsically disordered proteins : Formation of a -helices or loss of polyproline II ? J chem Theor Comput. 2010;19:1555-1564. doi:10.1002/pro.435

53. Baul U, Chakraborty D, Mugnai ML, Straub JE, Thirumalai D. Sequence Effects on Size, Shape, and Structural Heterogeneity in Intrinsically Disordered Proteins. J Phys Chem B. 2019;123(16):3462-3474. doi:10.1021/acs.jpcb.9b02575

54. Jephthah S, Staby L, Kragelund BB, Skepö M. Temperature Dependence of Intrinsically Disordered Proteins in Simulations: What are We Missing? J Chem Theory Comput. 2019;15(4):2672-2683. doi:10.1021/acs.jctc.8b01281

55. Cragnell C, Durand D, Cabane B, Skepö M. Coarse-grained modeling of the intrinsically disordered protein Histatin 5 in solution: Monte Carlo simulations in combination with SAXS. Proteins Struct Funct Bioinforma. 2016;84(6):777-791. doi:doi:10.1002/prot.25025

56. Henriques J, Arleth L, Lindorff-Larsen K, Skepö M. On the Calculation of SAXS Profiles of Folded and Intrinsically Disordered Proteins from Computer Simulations. J Mol Biol. 2018;430(16):2521-2539. doi:https://doi.org/10.1016/j.jmb.2018.03.002

57. Hub JS. Interpreting solution X-ray scattering data using molecular simulations. Curr Opin Struct Biol. 2018;49:18-26. doi:https://doi.org/10.1016/j.sbi.2017.11.002

58. Moffa EB, Machado MAAM, Mussi MCM, et al. In Vitro Identification of Histatin 5 Salivary Complexes. PLoS One. 2015;10(11):e0142517. https://doi.org/10.1371/journal.pone.0142517.

59. Brewer D, Hunter H, Lajoie G. NMR studies of the antimicrobial salivary peptides histatin 3 and histatin 5 in aqueous and nonaqueous solutions. Biochem Cell Biol. 1998;76(2-3):247-256. doi:10.1139/o98-066

60. Burmann M, Knauer SH, Sevostyanova A, et al. An a Helix to b Barrel Domain Switch Transforms the Transcription Factor RfaH into a Translation Factor. Cell. 2012;150:291-303. doi:10.1016/j.cell.2012.05.042

61. Svetlov V, Nudler E. Unfolding the Bridge between Transcription and Translation. Cell. 2012;150(2):243-245. doi:10.1016/j.cell.2012.06.025

62. Zuber PK, Schweimer K, Rösch P, Artsimovitch I, Knauer SH. Reversible fold-switching controls the functional cycle of the antitermination factor RfaH. Nat Commun. 2019;10:702. doi:10.1038/s41467-019-08567-6

63. Tyler RC, Murray NJ, Peterson FC, Volkman BF. Native-State Interconversion of a Metamorphic Protein Requires Global Unfolding. Biochemistry. 2011;50(33):7077-7079. doi:10.1021/bi200750k

64. Li S, Xiong B, Xu Y, et al. Mechanism of the All - α to All - β Conformational Transition of RfaH- CTD : Molecular Dynamics Simulation and Markov State Model. J Chem Theory Comput. 2014;10(6):2255-2264.

65. Bernhardt NA, Hansmann UHE. Multi-Funnel Landscape of the Fold-Switching Protein RfaH-CTD. J Phys Chem B. 2019;122(5):1600-1607. doi:10.1021/acs.jpcb.7b11352.Multi-Funnel

66. Ramírez-Sarmiento CA, Noel JK, Valenzuela SL, Artsimovitch I. Interdomain Contacts Control Native State Switching of RfaH on a Dual-Funneled Landscape. PLOS Comput Biol. 2015;11(7):e1004379. https://doi.org/10.1371/journal.pcbi.1004379.

67. Gc JB, Bhandari YR, Gerstman BS, Chapagain PP. Molecular dynamics investigations of

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 18: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

the α-helix to β-Barrel conformational transformation in the RfaH transcription factor. J Phys Chem B. 2014;118(19):5101-5108. doi:10.1021/jp502193v

68. Kuloğlu ES, McCaslin DR, Markley JL, Volkman BF. Structural rearrangement of human lymphotactin, a C chemokine, under physiological solution conditions. J Biol Chem. 2002;277(20):17863-17870. doi:10.1074/jbc.M200402200

69. Mapelli M, Massimiliano L, Santaguida S, Musacchio A. The Mad2 Conformational Dimer: Structure and Implications for the Spindle Assembly Checkpoint. Cell. 2007;131(4):730-743. doi:10.1016/j.cell.2007.08.049

70. Goodchild SC, Howell MW, Littler DR, et al. Metamorphic Response of the CLIC1 Chloride Intracellular Ion Channel Protein upon Membrane Interaction. Biochemistry. 2010;49(25):5278-5289. doi:10.1021/bi100111c

71. Svergun D, Barberato C, Koch MHJ. CRYSOL - a Program to Evaluate X-ray Solution Scattering of Biological Macromolecules from Atomic Coordinates. J Appl Crystallogr. 1995;28(6):768-773. doi:10.1107/S0021889895007047

72. Shen Y, Bax A. SPARTA+: a modest improvement in empirical NMR chemical shift prediction by means of an artificial neural network. J Biomol NMR. 2010;48(1):13-22. doi:10.1007/s10858-010-9433-9

73. Koukos PI, Bonvin AMJJ. Integrative Modelling of Biomolecular Complexes. J Mol Biol. 2019. doi:https://doi.org/10.1016/j.jmb.2019.11.009

74. Viswanath S, Sali A. Optimizing model representation for integrative structure determination of macromolecular assemblies. Proc Natl Acad Sci. 2019;116(2):540-545. doi:10.1073/pnas.1814649116

75. Schneidman-Duhovny D, Pellarin R, Sali A. Uncertainty in integrative structural modeling. Curr Opin Struct Biol. 2014;28:96-104. doi:https://doi.org/10.1016/j.sbi.2014.08.001

76. Ponce M, van Zon R, Northrup S, et al. Deploying a Top-100 Supercomputer for Large Parallel Workloads: The Niagara Supercomputer. In: Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines (Learning). PEARC ’19. New York, NY, USA: Association for Computing Machinery; 2019. doi:10.1145/3332186.3332195

77. Loken C, Gruner D, Groer L, et al. {SciNet}: Lessons Learned from Building a Power-efficient Top-20 System and Data Centre. J Phys Conf Ser. 2010;256:12026. doi:10.1088/1742-6596/256/1/012026

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 19: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

7. Figures and Tables:

Figure 1: List of systems studied: A) Dihedral switching of alanine dipeptide, B) Folding of Trp-Cage from completely unfolded structure, C) Folding of β-hairpin , D) Intrinsically disordered Histatin-5, and E) Metamorphic switching in bacterial RFA-H are explored using the state-of-the art REST2 and the newly designed REHT approach.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 20: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure 2: Efficiency of REHT in comparison to REST2 in capturing the folding of fast-folding protein, Trp-Cage. (A) Structural overlay between the natively folded NMR structure (blue) and the REHT-generated folded structure of Trp-cage (yellow, obtained at the base replica) (B) Time evolution of protein backbone RMSD from the NMR structure along one of the successfully folded replicas of REHT (top) and REST2 (bottom) simulations. The RMSD evolution of other folded replicas are shown in Fig.S3 and S4. (C) and (D) Free energy landscape of Trp-Cage, shown as a function of Radius of gyration (Rg) and RMSD against the NMR structure. The landscape is shown for the ensembles collected at the base replica of C) REHT simulation and D) REST2 simulation.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 21: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure 3: Extent of hydration dynamics and the resulting changes in the lifetime of protein-water H-bonds in Trp-cage folding: A) Average orientational autocorrelation function of water molecules (for OH bond vector) at different hydration shells surrounding the surface of protein. B) Time correlation functions of protein-water H-bonds formed by both continuous (black) and intermittent (red) water molecules. In both A and B, solid line represents the REHT simulation and dotted line represents the REST2 simulation.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 22: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure 4: Ensemble description of Intrinsically disordered Histatin-5: A) Charge hydropathy plot showing the uniqueness of Histatin-5 that locate at the extreme disordered zone unlike other successfully-studied IDRs which exist at or near the folded zone. B) Comparison between the experimental (black) and theoretical ensemble-averaged SAXS profiles, represented as Kratky plot. The theoretical prediction was made for the last 250ns unweighted trajectories corresponding to the base replica of REHT and REST2 simulations. The distribution of Rg for the ensembles obtained from REST2 (red) and REHT (blue) simulations are shown in the inset. C) Comparison of ensemble averaged chemical shifts of H-alpha atoms predicted from the REHT and REST2 simulations (for the same 250ns trajectory) with reference to the experimental NMR chemical shifts. D) Weakly funneled diffusive energy landscape of Histatin-5 shown as a function of Rg and solvent accessible surface area.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 23: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure 5: Conformational metamorphosis of RFA-H: A) Schematic representation of the two native structures resolved in experiments, the α-helical fold of CTD in presence of NTD and beta fold in isolated CTD. B) The initial and final structures of the REHT simulation. The helical state of CTD while deleting the NTD was chosen as the starting structure (left figure in B). Structural superposition of the final simulation generated beta barrel structure (red) over the experimental beta structure (cyan) is shown on the right side of (B). C) Validation of simulation generated ensemble by comparing the predicted C-alpha chemical shifts with the experimental shifts. The fitted linear regression (blue line) indicates a precise match between the two sets of data. D) Dual basin free energy landscape of RFA-H shown as a function of RMSDs from the experimentally found α-helix (2OUG) and beta-sheet (2LCL). The conformations at each of the basins, all α-helical and all β-sheet (I and VII), early structure with the loss of helicity at both the termini (II), intermediate partially unfolded structure with residual α-helical content (III), and metastable structures with open beta sheets (V and VI) are shown along the landscape. Note the completely unfolded structure (IV) is at the other side of the dual basin and the transition of α → β structure

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 24: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

doesn’t need to step through the completely unfolded structure unlike that of metamorphic lymphotactin.63

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 25: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Supporting information

High resolution ensemble description of metamorphic and intrinsically disordered proteins using an efficient hybrid parallel tempering scheme

Authors: Rajeswari Appadurai1, Jayashree Nagesh2 and Anand Srivastava1* Affiliations: 1Molecular Biophysics Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian Institute of Science, Bangalore C.V. Raman Road, Bangalore Karnataka 560 012, India Email: [email protected] ORCID Id: 0000-0002-2757-1511

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 26: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Table S1: System descriptions and details of the replica exchange simulation parameters.

System studied

Number of atoms

Number of replicas

Overall Effective temperature range

λ scaling and bath temperature

Cumulative time

(in ns)

Average exchange acceptance probability

Transit time# (in ns)

Alanine dipeptide

1864 5 (3) 300–600K (300–600K)

λ=1-0.567; T=300-340K (λ=1-0.5; T@300K)

100 (60) 0.25 (0.30) 6 (4)

Trp-cage

21038 12 (8) 300–478K (300–478K)

λ=1-0.71; T=300-340K (λ=1-0.63; T@300K)

12000 (8000)

0.12 (0.30) 10 (2)

Beta-hairpin

17203 12 (8*) 300–478K (300–478K)

λ=1-0.71; T=300-340K (λ=1-0.63; T@300K)

12000 0.12 5

Histatin-5

23195 15 (10) 300–478K (300–478K)

λ=1-0.71; T=300-340K (λ=1-0.63; T@300K)

11250 (7500)

0.15 (0.45) 3 (1)

RFA-H 40861 25 (25) 310-705K (310-705K)

λ=1-0.48; T=310-340K (λ=1-0.44; T@310K)

25000 (6250)

0.17 (0.45) 110 (Not achieved till 250ns)

Primary values correspond to that of REHT simulations, and the values in bracket is for REST2 simulation. * In case of Beta-hairpin the REHT simulation is compared with the REST2 simulation performed earlier (Ref:1) using the same forcefield. # Transit time indicates the time taken for a replica to jump over the complete state space. The complete replica mixing allows the full utilization of the temperature space and provides sufficient energy for a biomolecule to undergo conformational transition across the barrier. However, in complex systems like RFA-H, the REST2 method is shown to be ineffective where the complete mixing of replicas is not achieved.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 27: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S1: Exchange of base replica (rep 0) across the complete replica state space in TRP-cage system simulated with A) REST2 and B) REHT method. For the sake of clarity only the first 100 ns data are shown.

A

B

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 28: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S2:Exchange of base replica (rep 0) across the entire replica state space in beta hairpin system simulated with REHT method. The corresponding simulation of the peptide with the REST2 scheme using the same force field can be found in Ref.1

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 29: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S3: A) Time evolutions of backbone RMSD of REHT-generated Beta hairpin structures with reference to the corresponding NMR structure (1le3), shown for the five replicas that are successfully folded. B) Structural overlay of the simulation-generated beta hairpin (yellow) on the native NMR structure (blue).

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 30: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S4: Time evolutions of backbone RMSD of REHT-generated TRP-cage structures with reference to the native NMR structure (1l2y). All the five successfully folded replicas are shown.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 31: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S5: Time evolutions of backbone RMSD of REST2-generated TRP-cage structures with reference to the corresponding NMR structure (1l2y), shown for the only two replicas that are successfully folded.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 32: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S6: Convergence of free energy landscape of TRP-cage ensemble obtained from A) REST2 and B) REHT simulations. The free energy surface is shown along a single reaction coordinate - RMSD with respect to the native folded NMR structure and plotted as a function of simulation length. The figure clearly indicates that the results obtained from REHT converges faster and better than the REST2 simulations.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 33: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Dihedral switch in Alanine dipeptide:

Being the simplest biomolecular model that possesses multiple conformational basins, alanine dipeptide serves as one of the first choices in enhanced sampling literature to study the barrier crossing events. The peptide exhibits three basins across the phi and psi dihedral space. We employed both the conventional REST2 and the novel REHT methods to sample the switching between these dihedral basins of the peptide. The sampled basins across the phi and psi space at different time intervals are illustrated for the lowest temperature replica in Figure S4. While REST2 can sample all the basins within 4 ns timescale, REHT required 6 ns to completely sample this dihedral space. However, the relatively slow transitioning phi angle is frequently sampled in the latter method than the former method (Figure S5). Moreover, the transition was sampled by 4 out of 5 replicas in REHT method, but only once by a single replica out of three in conventional REST2 method. This result suggests that optimal heating of both solute and solvent allows effective crossing of the energy barrier and facilitates frequent transition in the conformational space.

Figure S7: Ramachandran map of alanine dipeptide. The average distribution of dihedral angles of Alanine dipeptide obtained at various time points of REST2 (top panel) and REHT (bottom panel) simulations.

A

B𝜙

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 34: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S8: Time evolution of slow transitioning Phi angle of alanine dipeptide simulated using A) three replicas in REST2 simulations and B) five replicas in REHT simulations.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 35: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S9: Exchange of base replica (Rep0) of His-5 system across the entire replica space as a function of time in A) REST2 and B) REHT simulation.

A

B

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 36: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S10: Distribution of dihedral angles projected on the Ramachandran map for all residues of Histatin-5 structures obtaned from REHT simulations. The plot is shown for three replicas (replicas 0, 3 and 9) arranged in increasing order of their respective temperatures (300K, 308K, 325K)

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 37: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S11: Free energy landscape of Histatin-5 ensemble collected at the base replica of REHT simulation plotted as a function of end to end distance and Rg.

End to end distance (nm)

Rg (n

m)

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 38: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S12: Convergence of His-5 free energy surface along the radius of gyration plotted as a function of simulation length from REHT simulations.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 39: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S13: Time series of secondary structure of Histatin-5 ensemble collected at the base replica of REHT simulations.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 40: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S14: Exchange of lowest rank replica across the complete replica state space in RFA-H system simulated with A) REST2 and B) REHT method.

A

B

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 41: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S15: Time evolution of secondary structure for the RFA-H ensemble generated from REHT simulations using all-𝛼-helix (2OUG) as starting structure.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 42: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S16: Convergence of free energy surface of RFA-H shown along a single reaction coordinate - RMSD with respect to the helical form (pdb IDL 2OUG). The results are shown for different lengths of REHT simulations.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 43: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Figure S17: Secondary structure of all residues of RFA-H along the time series. The plot corresponds to the lowest rank ensemble (310K replica) of REHT simulation initiated from the experimental beta barrel structure.

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 44: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Replica exchange simulation methodology Replica exchange simulations allows for studying equilibrium properties with lesser computational time by its rapid relaxation and improved conformational sampling. Originally developed in Monte-Carlo background (Swendsen and Wang, 1986),2 the method was later introduced to Molecular dynamics simulations by Sugita and Okomoto in 19993 (Temperature replica exchange molecular dynamics (TREM)). The method achieves effective sampling by simulating a series of low and high temperature replicas, while allowing the exchange of configurations at regular intervals. Moreover, due to its stochastic nature of exchanges that ensures the detailed balance, it generates Boltzmann weighted ensemble from which it is straight forward to obtain the thermodynamic averages. The probability of accepting the exchange between replica m and n depends on the difference in the Boltzmann weight factor, that is exponentially related to the difference in energy and temperature (Eqn.1).

𝑃qvv1↔.(𝑇𝑅𝐸𝑀) = y 1, ∆.1≤ 0exp(−∆.1), ∆.1> 0

∆02 =(𝛽. − 𝛽1)3𝐸�1 − 𝐸�.6,

Eqn. 1

Where 𝛽. and 𝛽1 are reciprocal temperatures (1/(𝐾Y𝑇.) and 1/(𝐾Y𝑇1) respectively). 𝐸�1 and 𝐸�. are potential energy of replicas m and n respectively. For a larger system the temperature differences between two replicas should be chosen minimal to yield viable exchange acceptance ratio. This demands large number of replicas to cover a sufficient range of temperatures. 3.1.1 Replica exchange solute tempering (REST2):

Instead of changing the temperature across the replica ladder the advanced replica exchange solute tempering (REST2) method scales the energy function in a particle-wise manner such that the solute is effectively heated up while keeping the water cold. For instance, the potential energy function of replica m is broken down into intramolecular protein interactions (𝐻44), protein-water interactions (𝐻4;), and water self-interactions (𝐻;;)whose potentials are scaled as shown in Eqn. 2.

𝐻c���E1 = λ𝐻44 +√λ𝐻4; + 𝐻;; Eqn. 2

Where λ = 𝛽𝑚𝛽0

, in which 𝛽�and 𝛽1 are reciprocal temperatures of 0th and mth replicas. While

running all the replicas at the same temperature, the method cancels out the energy difference in water self-interaction energy that otherwise hugely contribute for the poor scaling as in TREM. Thus, the acceptance probability of REST2 as shown in Eqn. 3 depends only on the energy differences of intramolecular solute energy and intermolecular energy between protein and water.

∆.1(𝑅𝐸𝑆𝑇2) = (𝛽1 − 𝛽.)�𝐻44(𝑋.) − 𝐻44(𝑋1) +9𝛽�

9𝛽1 +9𝛽.�𝐻4;(𝑋.)− 𝐻4;(𝑋1)�� Eqn. 3

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 45: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

Where, 𝐻44(𝑋1) and 𝐻44(𝑋.) indicates the intramolecular energy of protein solute in mth and nth replicas. 𝐻4;(𝑋1) and 𝐻4;(𝑋.) are interaction energies of protein and water in the two replicas. 𝛽1, 𝛽. and 𝛽� are reciprocal temperatures of replicas m, n and 0th replicas respectively. 3.1.2 Replica exchange hybrid tempering (REHT):

Though the REST2 claims to be efficient for studying larger proteins including that of IDPs as shown recently by Shrestha et.al, it suffers from inability to explore the complex energy landscape with larger energy barriers. We speculated that this could be due to the imbalance between hot solute and cold solvent that causes differential dynamics of central protein and surrounding bulk water. In general, the biomolecular folding and conformational transition is tightly coupled to its surrounding water dynamics. Hence in this work, we introduce a hybrid method that optimally treats the protein as well as the surrounding water. We achieve this by associating the replicas to different bath temperatures in addition to scaling down the potential function of protein in contrary to REST2. Treating the replicas in such a combination doesn’t violate the detailed balance condition. More importantly the method yields expedited protein conformational sampling by allowing efficient rewiring of water. For the viable exchanges the temperature gaps between the adjacent replicas are kept minimal. At the same time the protein is allowed to effectively heated-up to a larger extent by additional REST2 scaling factor. Derivation of REHT method: Let us consider the collection of replicas simulated in replica exchange simulation as {X1, X2 … Xn}. In general, the replicas differ in temperature (Ti) while uses an identical Hamiltonian function (Temperature replica exchange) or vice versa (Hamiltonian replica exchange). In our hybrid approach, we change both the temperature as well as the Hamiltonian across the replicas. Hence the replicas can be denoted as {𝑋1, 𝐻1(𝑋1), 𝑇1}, where 𝑋1,𝐻1(𝑋1), 𝑎𝑛𝑑𝑇1 respectively are configuration, potential energy function and temperature of replica m. Since the replicas are non-interacting, the equilibrium probability of this larger ensemble can simply be obtained by the product of Boltzmann factors of each replica.

𝑃c�p(𝑋) =�1𝑍_exp8−𝛽_𝐻_(𝑋_): ,

_�d

Eqn. 4

Where, 𝛽_ denotes the inverse temperature (1 𝐾Y𝑇⁄ ) and 𝑍_ represents the configurational partition function. Consider an exchange of configurations between a pair of replicas m and n.

{𝑋1,𝐻1(𝑋1), 𝑇1} → {𝑋., 𝐻1(𝑋.), 𝑇1} {𝑋.,𝐻.(𝑋.), 𝑇.} → {𝑋1, 𝐻.(𝑋1), 𝑇.}

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 46: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

The corresponding transition probability for this exchange (𝑋1 → 𝑋.) would be given by, 𝛱(𝑋1 → 𝑋.) = 𝑒𝑥𝑝(−𝛽1𝐻1(𝑋.) + 𝛽.𝐻.(𝑋1))

Imposing the detailed balance condition, where the reverse exchange is allowed with equal probability,

𝛱(𝑋1 → 𝑋.)𝛱(𝑋. → 𝑋1)

= 𝑒𝑥𝑝(−𝛽1𝐻1(𝑋.) + 𝛽.𝐻.(𝑋1))𝑒𝑥𝑝(−𝛽1𝐻1(𝑋1) + 𝛽.𝐻.(𝑋.))

Eqn. 5

= 𝑒𝑥𝑝−[𝛽1𝐻1(𝑋.) + 𝛽.𝐻.(𝑋1) − 𝛽1𝐻1(𝑋1) − 𝛽.𝐻.(𝑋.)]

Eqn. 6

= 𝑒𝑥𝑝(−∆.1) Eqn. 7

Where ∆.1

= 𝛽.[𝐻.(𝑋1) − 𝐻1(𝑋1)] +𝛽1[𝐻1(𝑋.) − 𝐻.(𝑋.)] − (𝛽.− 𝛽1)[𝐻.(𝑋.) − 𝐻1(𝑋1)]

Eqn. 8

With the Metropolis criteria, the probability of accepting the exchange X2 → 𝑋. becomes,

𝛱(𝑋1 → 𝑋.) = y1𝑖𝑓∆.1≤ 0𝑒𝑥𝑝(−∆.1) 𝑖𝑓∆.1> 0 Eqn. 9

We used a deformed Hamiltonian identical to the one used in the REST2 approach.

𝐻1 = 𝜆1𝐻44 +9𝜆1𝐻4; + 𝐻;; Eqn. 10 Applying Eqn.10 on Eqn.8, ∆.1becomes

= 𝛽.3(𝜆. − 𝜆1)(𝐻44(𝑋1) + (9𝜆. − 9𝜆1)𝐻4;(𝑋1)6+𝛽13(𝜆1 − 𝜆.)(𝐻44(𝑋.) + (9𝜆1 −9𝜆.)𝐻4;(𝑋.)6 − (𝛽.− 𝛽1)3𝜆.𝐻44(𝑋.) + 9𝜆.𝐻4;(𝑋.) + 𝐻;;(𝑋.) − 𝜆1𝐻44(𝑋1)

− 9𝜆1𝐻4;(𝑋1) − 𝐻;;(𝑋1)6

Eqn. 11

Coefficient of 𝐻44:

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint

Page 47: High resolution ensemble description of metamorphic and ... · 02/06/2020  · C.V. Raman Road, Bangalore Karnataka 560 012, India 2 Solid State & Structural Chemistry Unit, Indian

= 𝛽.[(λ0 − λ2) +(𝛽. − 𝛽1)λ2]𝐻44(𝑋1)+𝛽1[(λ2 − λ0) − (𝛽. − 𝛽1)λ0]𝐻44(𝑋.)

= −(𝛽.λ0 − 𝛽1λ2)3𝐻44(𝑋.) − 𝐻44(𝑋1)6 Eqn. 12 Coefficient of 𝐻4;:

= 3𝛽.89λ0 − 9λ2: +(𝛽. − 𝛽1)9λ26𝐻4;(𝑋1)

+ 3𝛽189λ2 − 9λ0: − (𝛽. − 𝛽1)9λ06𝐻4;(𝑋.)

=−8𝛽.9λ0 −𝛽19λ2:3𝐻4;(𝑋.) − 𝐻4;(𝑋1)6 Eqn. 13 Coefficient of 𝐻;;:

−(𝛽. − 𝛽1)[𝐻;;(𝑋.) − 𝐻;;(𝑋1)] Eqn. 14 Combining Eqn. 12, 13 and 14,

∆𝒏𝒎(𝑹𝑬𝑯𝑻) = −,(𝛽.λ0 − 𝛽1λ2)3𝐻44(𝑋.) − 𝐻44(𝑋1)6

+8𝛽.9λ0 −𝛽19λ2:3𝐻4;(𝑋.) − 𝐻4;(𝑋1)6

+ (𝛽. − 𝛽1)[𝐻;;(𝑋.) − 𝐻;;(𝑋1)]>

Eqn. 15

References: 1. Kamiya M, Sugita Y. Flexible selection of the solute region in replica exchange with solute

tempering: Application to protein-folding simulations. J Chem Phys. 2018;149(7). doi:10.1063/1.5016222

2. Tyler RC, Murray NJ, Peterson FC, Volkman BF. Native-State Interconversion of a Metamorphic Protein Requires Global Unfolding. Biochemistry. 2011;50(33):7077-7079. doi:10.1021/bi200750k

3. Sugita Y, Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem phys Lett. 1999;314:296-297. doi:10.1016/S0009-2614(99)01123-9

.CC-BY-NC-ND 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

The copyright holder for this preprint (whichthis version posted June 3, 2020. ; https://doi.org/10.1101/2020.06.02.124628doi: bioRxiv preprint