
Achieving Rigorous Accelerated Conformational Sampling in Explicit Solvent


Urmi Doshi and Donald Hamelberg*

Department of Chemistry and the Center for Biotechnology and Drug Design, Georgia State University, P.O. Box 3965, Atlanta, Georgia 30302-3965, United States

*S Supporting Information

ABSTRACT: Molecular dynamics simulations can provide valuable atomistic insights into biomolecular function. However, the accuracy of molecular simulations on general-purpose computers depends on the time scale of the events of interest. Advanced simulation methods, such as accelerated molecular dynamics, have shown tremendous promise in sampling the conformational dynamics of biomolecules, where standard molecular dynamics simulations are nonergodic. Here we present a sampling method based on accelerated molecular dynamics in which rotatable dihedral angles and nonbonded interactions are boosted separately. This method (RaMD-db) is a different implementation of the dual-boost accelerated molecular dynamics, introduced earlier. The advantage is that this method speeds up sampling of the conformational space of biomolecules in explicit solvent, as the degrees of freedom most relevant for conformational transitions are accelerated. We tested RaMD-db on one of the most difficult sampling problems, protein folding. Starting from fully extended polypeptide chains, two fast folding α-helical proteins (Trpcage and the double mutant of C-terminal fragment of Villin headpiece) and a designed β-hairpin (Chignolin) were completely folded to their native structures in very short simulation time. Multiple folding/unfolding transitions could be observed in a single trajectory. Our results show that RaMD-db is a promisingly fast and efficient sampling method for conformational transitions in explicit solvent. RaMD-db thus opens new avenues for understanding biomolecular self-assembly and functional dynamics occurring on long time and length scales.

SECTION: Biophysical Chemistry and Biomolecules

Received: January 26, 2014
Accepted: March 12, 2014
Published: March 12, 2014

Letter, pubs.acs.org/JPCL
© 2014 American Chemical Society. dx.doi.org/10.1021/jz500179a | J. Phys. Chem. Lett. 2014, 5, 1217−1224

Molecular dynamics (MD) simulations have now become a routine tool in understanding conformational dynamics in biomolecules and gaining atomic-level insights into mechanisms of biomolecular functions. However, the accuracy of the results from MD depends on two factors: how accurately the force field is able to model the intra- and intermolecular physical forces in solvated biomolecules and how accurately (and desirably efficiently) the relevant conformational sampling space is explored. The force field and sampling problems are interdependent, and work on both is expected to continue for a long time, as there is room for substantial improvements in both areas. Ongoing efforts to extend MD to time scales beyond microseconds have resulted in many notable advanced sampling techniques,1,2 remarkable progress in specialized computational architectures,3−5 and parallel and distributed computing.6 Concurrently, continued validation of existing physical force fields for proteins7 and nucleic acids8,9 has led to notable improvements in key backbone10−12 and side-chain13 torsional parameters. These advances have opened avenues to investigate processes that are of biological importance and occur on long time scales and length scales. Of particular interest has been the folding of proteins, a long-standing problem in biophysical chemistry.14 Starting from an unstructured polypeptide chain, a protein self-assembles into a well-defined 3-D conformation, as a result of simultaneous transitions of several degrees of freedom over large length scales. Protein folding thus presents a daunting challenge of conformational sampling and an ideal model process for testing enhanced sampling methodologies. Many natural and designed small, fast folding proteins (or domains) that typically fold on the microsecond time scale have now been characterized experimentally and investigated computationally.15−17 With improvements in the force fields, it has been possible to fold and unfold representatives from different structural classes in long atomistic MD simulations on the specialized supercomputer, Anton.18,19 Except for these and a handful of other unbiased MD studies,16,20 protein folding simulations on the more commonly available general-purpose computational resources have been implemented with the use of enhanced sampling methods, including replica exchange MD,21−23 transition path sampling,24 bias-exchange metadynamics,25,26 replica exchange with solute tempering,27 and integrated tempering MD.28 While some of the simulations utilized coarse-grained29,30 models or implicit solvation,31−34 others that were carried out in all-atom details have mainly employed replica exchange.24,35 Despite being the most extensively used method for protein folding, replica exchange suffers from the limitation of using a large number of replicas for even small solvated proteins and spending a vast amount of simulation time at undesired temperatures.

We have developed an improved enhanced sampling approach36 based on accelerated MD37 that allows simulating without constraining the phase space of biomolecules. One need not specify the reaction coordinates or collective variables before setting up an accelerated MD run because prior knowledge of the topography of the potential energy landscape is not required. By introducing a bias potential, the escape rates from minima on the potential energy surface are accelerated. Subsequently, thermodynamic properties on the original potential can be accurately recovered in a straightforward post-reweighting procedure. In principle, kinetic properties of the system on the original potential can also be retrieved if certain conditions are met.38,39 When acceleration is applied, typically the slower modes are sped up, thus allowing the investigation of long time-scale events. Accelerated MD has been extensively used for studying a variety of biological problems including native-state dynamics in proteins,40−42 catalytic mechanisms of HIV-1 protease43 and cyclophilin A (a cis−trans isomerase),44 and the role of conformational dynamics in the catalytic function of cyclophilin A.45 It has been utilized for testing and refining AMBER force-field parameters for peptide bond torsions.11 One other major advantage of accelerated MD is that one can tune the level of acceleration applied, which is generally not possible in other biasing methods. This feature has permitted increasing the rate of conformational sampling in proteins and testing the appropriate level of acceleration required to reproduce experimental NMR observables such as chemical shifts, S² order parameters, and residual dipolar and scalar J couplings.40,46 In the previous implementations of accelerated MD, the boost potential has been applied ordinarily to all torsional degrees of freedom.37

However, recently, we have shown that by accelerating only the rotatable torsions (the RaMD method), the degrees of freedom most pertinent to conformational changes, one can reproduce the original free-energy surface with significantly smaller statistical errors introduced by reweighting of the configurations.36 In addition to the internal rotational dihedrals, the slow diffusion of the surrounding solvent can also impede conformational transitions, especially when large length-scale displacements are involved. The diffusive solvent motions are dominantly controlled by the nonbonded interactions, that is, the van der Waals and the pairwise electrostatic interaction energy terms of the force field. Therefore, in the past, an accelerated MD approach has been implemented in which the total torsional degrees of freedom and the diffusive motions were accelerated simultaneously by applying two bias potentials, one only to the total dihedral energy and the other one to the total potential energy (see later).47 This dual boost approach was shown to perform better in sampling the conformational space (i.e., backbone ϕ−ψ space) in model peptides with increased efficiency and faster convergence rates as compared with conventional MD. Moreover, the rate of loop formation in a model peptide consisting of nine residues was also increased as a result of speeding the diffusive motions.

We present the RaMD method in combination with dual boosting that differs from the previous implementation. We further report on the performance evaluation of this RaMD-dual boost (RaMD-db) method for the most notorious conformational sampling problem, that is, protein folding. The main focus of our current study is to address the problem of efficient conformational sampling. Therefore, we use the force field that has been proven to be reasonably robust and to successfully fold proteins.13,48 Specifically, we employ two α-helical proteins, the designed Trp-cage 20-residue mini-protein49 and the engineered double mutant (i.e., Lys65Nle/Lys70Nle with Asn68His) of the 35-residue C-terminal subdomain of the villin headpiece (henceforth referred to as Villin-Nle/Nle),50 and two β-hairpins, the designed hairpin Chignolin variant51 and the β-hairpin-forming peptide derived from Nuclear factor erythroid 2-related factor 2 (Nrf2 hairpin).52 Experimentally, Trpcage exhibits a folding time of 4 μs at 296 K53 and Villin-Nle/Nle is estimated to fold in 0.7 μs at 300 K50 (as compared with the wild-type villin that folds in 4.3 ± 0.5 μs at 300 K). The Chignolin variant has the same central sequence as the originally designed Chignolin, except for the terminal Gly residues that are modified to Tyr. Despite being a 10-residue peptide, Chignolin could be crystallized and has been regarded as a mini-protein.51 The 16-mer peptide derived from the Nrf2 protein has also been shown to independently form a stable β-hairpin structure.52 Because of their small sizes and fast folding nature, they are ideal candidates for atomistic simulations and have been the subject of several computational studies in implicit28,31−33,54,55 and explicit19,24−26,35,52,56−61 solvent. For each protein or peptide, we started with a fully extended structure in our all-atom explicit solvent simulations with the RaMD-db method.
In brief, a continuous and non-negative bias potential, $\Delta V(r)$, is added to the potential energy, $V(r)$, when it falls below a boost energy, $E$, which is set prior to starting the aMD run.37 The potential energy surface is not modified when $V(r) \geq E$. In the aMD method, the bias potential is expressed as

$$\Delta V(r) = \frac{(E - V(r))^2}{\alpha + E - V(r)},$$

where $\alpha$ is a preset tuning parameter that, along with $E$, defines the level of acceleration. This form of the bias potential prevents the derivative of the potential energy (and hence the force) from being discontinuous at points on $r$ when $V(r) = E$. The potential energy functional most commonly used in fixed-charge physical force fields, that is,

$$V(r) = V_B(r) + V_\theta(r) + V_D(r) + V_{NB}(r),$$

includes terms arising from oscillations around equilibrium bonds, $V_B(r)$, oscillations around equilibrium angles, $V_\theta(r)$, rotations around torsional angles, $V_D(r)$, and pairwise nonbonded van der Waals and electrostatic interactions, $V_{NB}(r)$. In the dual-boost method previously introduced,47 all degrees of freedom were explicitly accelerated by adding a boost to the total potential, with an extra secondary boost applied only to the total dihedral angles:

$$V^{*}(r) = \left[V_B(r) + V_\theta(r) + \left[V_D(r) + \Delta V_D(r)\right] + V_{NB}(r)\right] + \Delta V(r).$$

In the RaMD-db method, a bias potential, $\Delta V_{RotD}(r)$, is applied only to the rotatable torsional energy, $V_{RotD}(r)$, and a second, separate boost, $\Delta V_{NB}(r)$, is added to only the potential energy of the nonbonded interactions, that is,

$$V^{*}_{\text{RaMD-db}}(r) = V_T(r) + \left[V_{RotD}(r) + \Delta V_{RotD}(r)\right] + \left[V_{NB}(r) + \Delta V_{NB}(r)\right].$$

Here $V_T(r)$ represents the total potential energy arising from oscillations around equilibrium bonds and angles as well as from nonrotatable and improper torsions, that is,

$$V_T(r) = V_B(r) + V_\theta(r) + V_{nonRotD}(r) + V_{ImprD}(r).$$

Therefore, only the rotatable torsions and nonbonded interactions directly experience the acceleration, while the rest of the degrees of freedom are simulated on their original potential. The force on the modified potential is then given by

$$F^{*} = -\nabla V^{*}_{\text{RaMD-db}}(r) = -\nabla V_T(r) - \left[\nabla V_{RotD}(r)\left(\frac{\alpha_{RotD}}{\alpha_{RotD} + E_{RotD} - V_{RotD}(r)}\right)^{2}\right] - \left[\nabla V_{NB}(r)\left(\frac{\alpha_{NB}}{\alpha_{NB} + E_{NB} - V_{NB}(r)}\right)^{2}\right].$$

Equilibrium properties on the original potential are retrieved by reweighting each configuration by $e^{\beta[\Delta V_{RotD}(r) + \Delta V_{NB}(r)]}$, where $\beta$ is $1/k_BT$. As compared with boosting all degrees of freedom, the advantage of accelerating only the diffusive degrees of freedom between protein−solvent, (intra) protein−protein, solvent−solvent, and rotational torsions of protein is the relatively reduced magnitudes of the statistical weights and thereby increased statistical accuracy in obtaining the true equilibrium properties. Moreover, applying acceleration to the already faster motions such as bond vibrations and angle rotations or nonrotatable torsions that do not alter significantly upon conformational changes will have less impact on speeding up the overall conformational sampling of a biomolecule.

Figure 1. Folded structure of Trp-cage and root-mean-squared deviation with respect to the experimental structure along the trajectory. (Left panel) Simulations starting from an extended configuration sampled a fully folded structure (red) with root-mean-squared deviation (RMSD) of 0.45 Å from the reference experimental structure (green), that is, the first structure from PDB ID: 1L2Y. (Right panel) RMSD to the native structure as a function of time in the 1.7 μs long trajectory. RMSDs reported in this Figure were calculated using Cα atoms of all residues and heavy atoms of Trp6 (shown as sticks in the left panel).

Figure 2. Reweighted 2-D free-energy landscapes of Trp-cage and test for convergence. Free-energy profiles projected onto (A) RMSDCα,Trp, that is, RMSD to native structure 1L2Y, calculated with Cα atoms of all residues and heavy atoms of Trp6 and radius of gyration, Rg, computed using all atoms. (B) RMSD of the polyproline region (green in the right panel) after rms fitting the backbone of all the residues (Supporting Information) and RMSD of the α-helical region (RMSDhelix) shown in red in the Trp-cage structure (top right). (C) RMSDCα,Trp and RMSDhelix. (D) RMSD of residues involved in forming the hydrophobic core (Trp6, Pro12, Pro17−19) shown in yellow in the Trp-cage structure and distance between Cγ and Cζ atoms (cyan spheres in the Trp-cage structure) of Asp9 and Arg16 (cyan sticks), respectively. Asp9 and Arg16 are known to form a stabilizing salt bridge (orange). Two-dimensional Gaussian kernel was used for probability density estimation (Supporting Information). Contour lines are drawn every 5 kcal/mol. (E) Folding free energy, ΔGf, as a function of simulation lengths converged to a value of −0.88 kcal/mol after 500 ns.

From a single RaMD-db trajectory, we could reproducibly observe several folding−unfolding transitions in Trpcage with an average folding time (Supporting Information) of 78 ns at 300 K for the trajectory of ∼1700 ns shown in Figure 1. A previous study on Trpcage using normal MD on Anton obtained an average folding time of 14 μs at 296 K.19

Therefore, under similar conditions (i.e., at 300 K), the RaMD-db method sped up the average folding time in Trpcage by a factor of ∼180. Configurations with RMSD (calculated with Cα and heavy atoms of Trp6) <2 Å were sampled in ∼27% of the total simulation time. In agreement with the experimental folding free energy of approximately −0.65 kcal/mol at 296 K, we estimated ΔGf of −0.88 kcal/mol at 300 K (Figure 2E) using a simple two-state approximation (Supporting Information). We further divided the trajectory into three parts and calculated ΔGf for each part, yielding an average value of −1.2 ± 1.1 kcal/mol. Although the trajectory as a whole had reached convergence, the three individual parts of the trajectory had not, thus yielding overestimated errors in ΔGf. The other source of error arises from reweighting the conformations from biased simulations to retrieve equilibrium properties of the original potential. Because of reweighting, there is a potential loss in the number of sampled points, leading to an increase in statistical error in the retrieved properties, which can be qualitatively assessed from the smoothness of the free-energy contour lines (Figure 2A−D). Such errors can be estimated by a quantitative theory that we developed previously, which relates the loss in the sampling size to the decrease in statistical accuracy.36
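The reweighting step and the two-state estimate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure (which is given in the Supporting Information): the function names are hypothetical, the folded/unfolded assignment is taken as a given boolean per frame, and the Kish effective sample size is used here only as a generic stand-in for the loss of sampled points, not as the quantitative theory of ref 36.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def reweighting_weights(dv_rotd, dv_nb, temperature):
    """Normalized weights proportional to exp(beta * (dV_RotD + dV_NB)) per frame."""
    beta = 1.0 / (KB * temperature)
    exponents = [beta * (a + b) for a, b in zip(dv_rotd, dv_nb)]
    shift = max(exponents)  # subtract the maximum exponent for numerical stability
    raw = [math.exp(x - shift) for x in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def two_state_dg(is_folded, weights, temperature):
    """Two-state folding free energy, -kT ln(p_folded / p_unfolded), in kcal/mol."""
    p_folded = sum(w for f, w in zip(is_folded, weights) if f)
    p_unfolded = sum(w for f, w in zip(is_folded, weights) if not f)
    return -KB * temperature * math.log(p_folded / p_unfolded)


def effective_sample_size(weights):
    """Kish effective sample size, (sum w)^2 / sum(w^2). Assumption: generic
    diagnostic, not the error theory cited in the text."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


if __name__ == "__main__":
    # Toy per-frame boost energies (kcal/mol) and folded flags; purely illustrative.
    dv_rotd = [0.0, 1.0, 2.0, 0.5]
    dv_nb = [0.0, 0.5, 1.5, 0.2]
    folded = [True, True, False, False]
    w = reweighting_weights(dv_rotd, dv_nb, 300.0)
    print("dG_f (kcal/mol):", two_state_dg(folded, w, 300.0))
    print("effective sample size:", effective_sample_size(w))
```

With no boost applied, all weights are equal and the effective sample size equals the number of frames; as the boost energies spread out, a few frames dominate the weights and the effective sample size drops, which is the loss of sampled points the text refers to.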

With multiple folding−unfolding events, relevant areas of the conformational space were sufficiently sampled, as can be seen in the 2-D free-energy landscapes in Figure 2A−D. A convergence test was performed by calculating ΔGf as a function of increasing simulation lengths (Figure 2E). The free-energy plots were projected onto various local and global collective variables that provide information about the folding of the different regions of the protein as well as the overall protein. Despite several experimental53,62−70 and theoretical19,24,25,28,32,34,35,54 investigations on Trp-cage folding, consensus for its folding mechanism is still lacking. Although it is now becoming increasingly clear that no proteins fold strictly via a two-state mechanism, direct evidence of an intermediate has been lacking from equilibrium and kinetic experiments that provide low structural resolution.53,64,70 While other experimental and computational studies do suggest the presence of a partially folded or collapsed intermediate along the folding pathway,24,25,35,62,63,65,66,68,69 there is still no agreement on the nature of this intermediate.

When probed by Trp fluorescence53 and at multiple amide I′ frequencies (i.e., at 1612 and 1580 cm⁻¹) in infrared spectroscopy66 for a Trpcage variant, first-order T-jump relaxation kinetics were observed, consistent with a two-state model. However, when probed at 1664 cm⁻¹ amide I frequency, the T-jump-induced relaxation exhibited two kinetic phases, the slow phase that was attributed to the global folding/unfolding of Trp-cage and the fast phase to the melting of the 3₁₀-helix.66 Recent infrared experiments monitored T-jump-induced relaxation at different final temperatures and at frequencies that selectively probe the folding of the polyproline region in addition to other structural elements (α-helix and 3₁₀-helix) of Trp-cage.71 They observed biexponential kinetic behavior with the slow phase comprising significant contributions from the polyproline and α-helical regions at all temperatures. During re-equilibration, the fluctuations in the polyproline region were found to progressively dominate the fast phase with the increase in the final T-jump temperature. These results were interpreted along with observations from replica-exchange MD, and the presence of a native-like intermediate was suggested. Remarkably, the free-energy profiles obtained from our RaMD-db method were in qualitative agreement with those calculated with replica-exchange MD by Meuzelaar et al.,71 indicating that RaMD-db was able to sample the correct conformational space. In our case, the projection of free-energy profiles on variables very similar to those chosen by Meuzelaar et al. exhibited basin(s) corresponding to compact native-like intermediate(s). However, the exact positions of the free-energy basins and the heights of the free-energy barriers do not match because the energy landscape obtained from RaMD-db is significantly better sampled than that by the replica-exchange MD. The results clearly show that RaMD-db is capable of providing a microscopic picture of protein folding, identifying intermediates that are not detected in some equilibrium unfolding studies or implicit solvation MD simulations. Future detailed analysis of trajectories and characterization of multiple folding routes involving such intermediates will be required to reconcile our observation of intermediates with experimental results interpreted with two- or three-state models.

To further test the performance of the RaMD-db method, we also simulated the folding of Villin-Nle/Nle. In two independent RaMD-db simulations of over 400 ns each and carried out at 360 K, Villin-Nle/Nle successfully folded to its native structure, having sampled Cα-RMSD of 0.46 Å with respect to the PDB structure 2F4K (Figure 3). While the first trajectory took less than 300 ns for the first folding transition and ∼35 ns for the second one, the second trajectory yielded a folded structure only after 380 ns. Although we did not sufficiently sample folding−unfolding transitions in this case, as compared with the average folding time of 2.8 μs obtained from normal MD simulations on Anton,19 RaMD-db yielded at least a sevenfold speed up in folding Villin-Nle/Nle.

Figure 3. Superposition of experimental and folded structures of Villin-Nle/Nle and time series of Cα-RMSD with respect to the experimental structure. (Left) Starting from an extended polypeptide chain, Villin-Nle/Nle folded to a structure (red) with RMSD of 0.46 Å with respect to the PDB structure 2F4K (green). (Right) Cα-RMSD of residues 4−32 along two independent RaMD-db trajectories (top and bottom).

After RaMD-db was successful in folding α-helical proteins, we next tested it for the β-hairpin forming peptides, the Chignolin variant and the Nrf2 peptide. We simulated the Chignolin variant at 340 K, starting from a fully extended chain in two independent trajectories, where each was run for close to 300 ns. We could observe several folding−unfolding transitions in each trajectory, with an average folding time of 4 ns (Figure 4, top) and 3.4 ns (Figure 4, bottom). Unbiased simulations of the Chignolin variant at 340 K on Anton obtained a mean folding time of 0.6 μs, which was approximately 150 and 170 times slower than the average folding times calculated from RaMD-db trajectories (Supporting Information). Similarly, we also observed β-sheet formation over the course of RaMD-db simulations of the Nrf2 peptide, initiated at a fully extended conformation (Supporting Information). The β-hairpin was observed to have formed in very short simulation times (of <100 ns).

Figure 4. Experimental and folded structures of the Chignolin β-hairpin and root-mean-square deviation with respect to the experimental structure along two trajectories. (Left panel) Superposition of the folded Chignolin β-hairpin (red) and its reference structure PDB ID 1UAO (with residues 1 and 10 mutated to Tyr, Supporting Information) (green) having Cα-RMSD of (top) 0.43 Å and (bottom) 0.36 Å. (Right panel) Cα-RMSD time series for two independent trajectories of lengths 308 and 280 ns, respectively.

In conclusion, we presented an accelerated MD method that applies individual acceleration on rotatable dihedrals and nonbonded terms of the potential energy functional. We tested this method to fold two fast-folding α-helical proteins and two β-hairpins in all-atom detail and explicit solvent. Boosting rotatable torsions and nonbonded interactions helps in speeding up conformational transitions and diffusive degrees of freedom of the solvent. While accelerating rotatable dihedrals may affect the speed of forming α-helices with local interactions and β-sheets alike, boosting nonbonded interactions that govern diffusive motions has an added advantage in forming nonlocal interactions as in β-sheet-containing peptides and 3-D arrangement of secondary structures in proteins. As a result, multiple folding−unfolding transitions were observed in a single trajectory. RaMD-db was able to extensively sample the folding landscape of Trpcage, the features of which are consistent with previous experimental and computational studies. In all of the systems considered here, RaMD-db was found to speed up folding significantly over brute-force MD. However, the extent of speed up achieved depends on the acceleration parameters used and sampling of folding−unfolding events. Our results indicate that RaMD-db could serve as a valuable tool for sampling biomolecules in explicit solvent as well as potentially for faster protein structure prediction and force-field assessment. Unlike other constrained sampling methods that require selecting a collective variable, RaMD-db can be performed in an unrestrained manner over the entire phase space. This provides the advantage of projecting the energy landscape on any order parameters of choice after post-processing and computing the properties of interest. The speed with which RaMD-db samples events is dependent on the extent of the boost applied, that is, perturbation of the system from the desired potential.
The more the modified potential differs from the original one, the more error is introduced in recovering the properties of the original landscape. Although RaMD-db significantly improves the statistical accuracy over previous versions of aMD,36 the reweighting problem is not completely alleviated. Therefore, depending on the question of interest, there is a trade-off between sampling speed and accuracy. RaMD-db, however, can potentially be combined with replica-exchange methodology to avoid reweighting altogether.72 Accelerated MD-based methods have been shown to successfully retrieve dynamics on the unmodified potential and actual kinetic rates for Markovian processes that have barriers much higher than kBT, provided the boost applied when transitioning from one well to the other is close to zero.38 However, satisfying this condition for fast folding proteins that have marginal barriers is possible in principle but nontrivial to implement currently. Nonetheless, RaMD-db is an efficient and fast sampling method that is fairly simple to set up and straightforward to retrieve accurate equilibrium properties.
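To make the dependence on the boost parameters concrete, the following Python sketch evaluates the aMD bias potential and the corresponding force-scaling factor for a single boosted energy term. It is only an illustration under assumptions: the function names are hypothetical, the numerical values are arbitrary, and in RaMD-db the same form would be applied separately to the rotatable torsional and nonbonded energies, each with its own boost energy and α.

```python
def bias_potential(v, e, alpha):
    """aMD boost dV = (E - V)^2 / (alpha + E - V) when V < E, and 0 otherwise."""
    if v >= e:
        return 0.0
    return (e - v) ** 2 / (alpha + e - v)


def force_scale(v, e, alpha):
    """Factor multiplying the original force on the boosted term:
    (alpha / (alpha + E - V))^2 when V < E, and 1 otherwise."""
    if v >= e:
        return 1.0
    return (alpha / (alpha + e - v)) ** 2


def modified_potential(v, e, alpha):
    """Boosted energy V* = V + dV for one energy term."""
    return v + bias_potential(v, e, alpha)


if __name__ == "__main__":
    # Arbitrary illustrative values (kcal/mol): energy 10 below the boost energy.
    v, e = -10.0, 0.0
    for alpha in (0.5, 5.0, 50.0):
        print(alpha, bias_potential(v, e, alpha), force_scale(v, e, alpha))
```

A small α flattens the surface below E (large boost, forces scaled toward zero), whereas a large α leaves the potential nearly unchanged. This is one way to see the trade-off noted above: stronger boosts speed up sampling but produce larger reweighting factors.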

■ ASSOCIATED CONTENT

*S Supporting Information
Computational methods and the results for the Nrf2 peptide. This material is available free of charge via the Internet at http://pubs.acs.org.

■ AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected]. Tel: 404-413-5564. Fax: 404-413-5505.
Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
This work is partially supported by the National Science Foundation (MCB-0953061) and the Georgia Research Alliance. This work was also supported by Georgia State’s IBM System p7 supercomputer, acquired through a partnership of the Southeastern Universities Research Association and IBM supporting the SURA grid initiative.

■ REFERENCES
(1) Elber, R. Long-Timescale Simulation Methods. Curr. Opin. Struct. Biol. 2005, 15, 151.
(2) Lei, H.; Duan, Y. Improved Sampling Methods for Molecular Simulation. Curr. Opin. Struct. Biol. 2007, 17, 187.
(3) Stone, J. E.; Hardy, D. J.; Ufimtsev, I. S.; Schulten, K. GPU-Accelerated Molecular Modeling Coming of Age. J. Mol. Graphics Modell. 2010, 29, 116.
(4) Shaw, D. E.; Dror, R. O.; Salmon, J. K.; Grossman, J. P.; Mackenzie, K. M.; Bank, J. A.; Young, C.; Deneroff, M. M.; Batson, B.; Bowers, K. J.; Chow, E.; Eastwood, M. P.; Ierardi, D. J.; Klepeis, J. L.; Kuskin, J. S.; Larson, R. H.; Lindorff-Larsen, K.; Maragakis, P.; Moraes, M. A.; Piana, S.; Shan, Y.; Towles, B. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis; ACM: Portland, OR, 2009; p 1.
(5) Friedrichs, M. S.; Eastman, P.; Vaidyanathan, V.; Houston, M.; Legrand, S.; Beberg, A. L.; Ensign, D. L.; Bruns, C. M.; Pande, V. S. Accelerating Molecular Dynamic Simulation on Graphics Processing Units. J. Comput. Chem. 2009, 30, 864.
(6) Shirts, M.; Pande, V. S. COMPUTING: Screen Savers of the World Unite! Science 2000, 290, 1903.
(7) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins 2006, 65, 712.
(8) Perez, A.; Marchan, I.; Svozil, D.; Sponer, J.; Cheatham, T. E.; Laughton, C. A.; Orozco, M. Refinement of the Amber Force Field for Nucleic Acids: Improving the Description of Alpha/Gamma Conformers. Biophys. J. 2007, 92, 3817.
(9) Zhu, X.; Lopes, P. E. M.; MacKerell, A. D. Recent Developments and Applications of the CHARMM Force Fields. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012, 2, 167.
(10) Best, R. B.; Hummer, G. Optimized Molecular Dynamics Force Fields Applied to the Helix-Coil Transition of Polypeptides. J. Phys. Chem. B 2009, 113, 9004.
(11) Doshi, U.; Hamelberg, D. Re-Optimization of the AMBER Force Field Parameters for Peptide Bond (Omega) Torsions Using Accelerated Molecular Dynamics. J. Phys. Chem. B 2009, 113, 16590.
(12) MacKerell, A. D., Jr.; Feig, M.; Brooks, C. L., 3rd. Improved Treatment of the Protein Backbone in Empirical Force Fields. J. Am. Chem. Soc. 2004, 126, 698.
(13) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins 2010, 78, 1950.
(14) Freddolino, P. L.; Harrison, C. B.; Liu, Y.; Schulten, K. Challenges in Protein-Folding Simulations. Nat. Phys. 2010, 6, 751.
(15) Prigozhin, M. B.; Gruebele, M. Microsecond Folding Experiments and Simulations: A Match Is Made. Phys. Chem. Chem. Phys. 2013, 15, 3372.
(16) Lane, T. J.; Shukla, D.; Beauchamp, K. A.; Pande, V. S. To Milliseconds and Beyond: Challenges in the Simulation of Protein Folding. Curr. Opin. Struct. Biol. 2013, 23, 58.


(17) Kubelka, J.; Hofrichter, J.; Eaton, W. A. The Protein Folding ‘Speed Limit’. Curr. Opin. Struct. Biol. 2004, 14, 76.
(18) Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y.; Wriggers, W. Atomic-Level Characterization of the Structural Dynamics of Proteins. Science 2010, 330, 341.
(19) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How Fast-Folding Proteins Fold. Science 2011, 334, 517.
(20) Best, R. B.; Mittal, J. Microscopic Events in β-Hairpin Folding from Alternative Unfolded Ensembles. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 11087.
(21) Best, R. B.; Mittal, J. Balance between α and β Structures in Ab Initio Protein Folding. J. Phys. Chem. B 2010, 114, 8790.
(22) Best, R. B.; Mittal, J. Protein Simulations with an Optimized Water Model: Cooperative Helix Formation and Temperature-Induced Unfolded State Collapse. J. Phys. Chem. B 2010, 114, 14916.
(23) Mittal, J.; Best, R. B. Tackling Force-Field Bias in Protein Folding Simulations: Folding of Villin HP35 and Pin WW Domains in Explicit Water. Biophys. J. 2010, 99, L26.
(24) Juraszek, J.; Bolhuis, P. G. Sampling the Multiple Folding Mechanisms of Trp-cage in Explicit Solvent. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 15859.
(25) Marinelli, F.; Pietrucci, F.; Laio, A.; Piana, S. A Kinetic Model of Trp-cage Folding from Multiple Biased Molecular Dynamics Simulations. PLoS Comput. Biol. 2009, 5, e1000452.
(26) Piana, S.; Laio, A. A Bias-Exchange Approach to Protein Folding. J. Phys. Chem. B 2007, 111, 4553.
(27) Wang, L.; Friesner, R. A.; Berne, B. J. Replica Exchange with Solute Scaling: A More Efficient Version of Replica Exchange with Solute Tempering (REST2). J. Phys. Chem. B 2011, 115, 9431.
(28) Shao, Q.; Shi, J.; Zhu, W. Enhanced Sampling Molecular Dynamics Simulation Captures Experimentally Suggested Intermediate and Unfolded States in the Folding Pathway of Trp-cage Miniprotein. J. Chem. Phys. 2012, 137.
(29) Han, W.; Schulten, K. Characterization of Folding Mechanisms of Trp-cage and WW-domain by Network Analysis of Simulations with a Hybrid-Resolution Model. J. Phys. Chem. B 2013, 117, 13367.
(30) Ding, F.; Buldyrev, S. V.; Dokholyan, N. V. Folding Trp-Cage to NMR Resolution Native Structure Using a Coarse-Grained Protein Model. Biophys. J. 2005, 88, 147.
(31) Duan, L.; Mei, Y.; Li, Y.; Zhang, Q.; Zhang, D.; Zhang, J. Simulation of the Thermodynamics of Folding and Unfolding of the Trp-cage Mini-Protein TC5b Using Different Combinations of Force Fields and Solvation Models. Sci. China Chem. 2010, 53, 196.
(32) Chowdhury, S.; Lee, M. C.; Duan, Y. Characterizing the Rate-Limiting Step of Trp-cage Folding by All-Atom Molecular Dynamics Simulations. J. Phys. Chem. B 2004, 108, 13855.
(33) Pitera, J. W.; Swope, W. Understanding Folding and Design: Replica-Exchange Simulations of “Trp-cage” Miniproteins. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 7587.
(34) Snow, C. D.; Zagrovic, B.; Pande, V. S. The Trp Cage: Folding Kinetics and Unfolded State Topology via Molecular Dynamics Simulations. J. Am. Chem. Soc. 2002, 124, 14548.
(35) Zhou, R. Trp-cage: Folding Free Energy Landscape in Explicit Water. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 13280.
(36) Doshi, U.; Hamelberg, D. Improved Statistical Sampling and Accuracy with Accelerated Molecular Dynamics on Rotatable Torsions. J. Chem. Theor. Comput. 2012, 8, 4004.
(37) Hamelberg, D.; Mongan, J.; McCammon, J. A. Accelerated Molecular Dynamics: A Promising and Efficient Simulation Method for Biomolecules. J. Chem. Phys. 2004, 120, 11919.
(38) Xin, Y.; Doshi, U.; Hamelberg, D. Examining the Limits of Time Reweighting and Kramers’ Rate Theory to Obtain Correct Kinetics from Accelerated Molecular Dynamics. J. Chem. Phys. 2010, 132, 224101.
(39) Doshi, U.; Hamelberg, D. Extracting Realistic Kinetics of Rare Activated Processes from Accelerated Molecular Dynamics Using Kramers’ Theory. J. Chem. Theor. Comput. 2011, 7, 575.

(40) Markwick, P. R.; Bouvignies, G.; Blackledge, M. Exploring Multiple Timescale Motions in Protein GB3 Using Accelerated Molecular Dynamics and NMR Spectroscopy. J. Am. Chem. Soc. 2007, 129, 4724.
(41) Markwick, P. R.; McCammon, J. A. Studying Functional Dynamics in Bio-Molecules Using Accelerated Molecular Dynamics. Phys. Chem. Chem. Phys. 2011, 13, 20053.
(42) McGowan, L. C.; Hamelberg, D. Conformational Plasticity of an Enzyme during Catalysis: Intricate Coupling between Cyclophilin A Dynamics and Substrate Turnover. Biophys. J. 2013, 104, 216.
(43) Hamelberg, D.; McCammon, J. A. Fast Peptidyl Cis-Trans Isomerization within the Flexible Gly-Rich Flaps of HIV-1 Protease. J. Am. Chem. Soc. 2005, 127, 13778.
(44) Hamelberg, D.; McCammon, J. A. Mechanistic Insight into the Role of Transition-State Stabilization in Cyclophilin A. J. Am. Chem. Soc. 2009, 131, 147.
(45) Doshi, U.; McGowan, L. C.; Ladani, S. T.; Hamelberg, D. Resolving the Complex Role of Enzyme Conformational Dynamics in Catalytic Function. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 5699.
(46) Markwick, P. R.; Cervantes, C. F.; Abel, B. L.; Komives, E. A.; Blackledge, M.; McCammon, J. A. Enhanced Conformational Space Sampling Improves the Prediction of Chemical Shifts in Proteins. J. Am. Chem. Soc. 2010, 132, 1220.
(47) Hamelberg, D.; de Oliveira, C. A.; McCammon, J. A. Sampling of Slow Diffusive Conformational Transitions with Accelerated Molecular Dynamics. J. Chem. Phys. 2007, 127, 155102.
(48) Lindorff-Larsen, K.; Maragakis, P.; Piana, S.; Eastwood, M. P.; Dror, R. O.; Shaw, D. E. Systematic Validation of Protein Force Fields against Experimental Data. PLoS One 2012, 7, e32131.
(49) Neidigh, J. W.; Fesinmeyer, R. M.; Andersen, N. H. Designing a 20-Residue Protein. Nat. Struct. Biol. 2002, 9, 425.
(50) Kubelka, J.; Chiu, T. K.; Davies, D. R.; Eaton, W. A.; Hofrichter, J. Sub-Microsecond Protein Folding. J. Mol. Biol. 2006, 359, 546.
(51) Honda, S.; Akiba, T.; Kato, Y. S.; Sawada, Y.; Sekijima, M.; Ishimura, M.; Ooishi, A.; Watanabe, H.; Odahara, T.; Harata, K. Crystal Structure of a Ten-Amino Acid Protein. J. Am. Chem. Soc. 2008, 130, 15327.
(52) Cino, E. A.; Choy, W. Y.; Karttunen, M. Comparison of Secondary Structure Formation Using 10 Different Force Fields in Microsecond Molecular Dynamics Simulations. J. Chem. Theor. Comput. 2012, 8, 2725.
(53) Qiu, L.; Pabit, S. A.; Roitberg, A. E.; Hagen, S. J. Smaller and Faster: The 20-Residue Trp-cage Protein Folds in 4 μs. J. Am. Chem. Soc. 2002, 124, 12952.
(54) Simmerling, C.; Strockbine, B.; Roitberg, A. E. All-Atom Structure Prediction and Folding Simulations of a Stable Protein. J. Am. Chem. Soc. 2002, 124, 11258.
(55) Son, W. J.; Jang, S.; Pak, Y.; Shin, S. Folding Simulations with Novel Conformational Search Method. J. Chem. Phys. 2007, 126, 104906.
(56) Day, R.; Paschek, D.; Garcia, A. E. Microsecond Simulations of the Folding/Unfolding Thermodynamics of the Trp-Cage Miniprotein. Proteins 2010, 78, 1889.
(57) Kannan, S.; Zacharias, M. Folding Simulations of Trp-Cage Mini Protein in Explicit Solvent Using Biasing Potential Replica-Exchange Molecular Dynamics Simulations. Proteins: Struct., Funct., Bioinf. 2009, 76, 448.
(58) Freddolino, P. L.; Schulten, K. Common Structural Transitions in Explicit-Solvent Simulations of Villin Headpiece Folding. Biophys. J. 2009, 97, 2338.
(59) Duan, Y.; Kollman, P. A. Pathways to a Protein Folding Intermediate Observed in a 1-Microsecond Simulation in Aqueous Solution. Science 1998, 282, 740.
(60) Ensign, D. L.; Kasson, P. M.; Pande, V. S. Heterogeneity Even at the Speed Limit of Folding: Large-Scale Molecular Dynamics Study of a Fast-Folding Variant of the Villin Headpiece. J. Mol. Biol. 2007, 374, 806.


(61) Kuhrova, P.; De Simone, A.; Otyepka, M.; Best, R. B. Force-Field Dependence of Chignolin Folding and Misfolding: Comparison with Experiment and Redesign. Biophys. J. 2012, 102, 1897.
(62) Ahmed, Z.; Beta, I. A.; Mikhonin, A. V.; Asher, S. A. UV-Resonance Raman Thermal Unfolding Study of Trp-Cage Shows That It Is Not a Simple Two-State Miniprotein. J. Am. Chem. Soc. 2005, 127, 10943.
(63) Neuweiler, H.; Doose, S.; Sauer, M. A Microscopic View of Miniprotein Folding: Enhanced Folding Efficiency through Formation of an Intermediate. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 16650.
(64) Streicher, W. W.; Makhatadze, G. I. Unfolding Thermodynamics of Trp-Cage, A 20 Residue Miniprotein, Studied by Differential Scanning Calorimetry and Circular Dichroism Spectroscopy. Biochemistry 2007, 46, 2876.
(65) Mok, K. H.; Kuhn, L. T.; Goez, M.; Day, I. J.; Lin, J. C.; Andersen, N. H.; Hore, P. J. A Pre-Existing Hydrophobic Collapse in the Unfolded State of an Ultrafast Folding Protein. Nature 2007, 447, 106.
(66) Culik, R. M.; Serrano, A. L.; Bunagan, M. R.; Gai, F. Achieving Secondary Structural Resolution in Kinetic Measurements of Protein Folding: A Case Study of the Folding Mechanism of Trp-Cage. Angew. Chem. 2011, 50, 10884.
(67) Barua, B.; Lin, J. C.; Williams, V. D.; Kummler, P.; Neidigh, J. W.; Andersen, N. H. The Trp-Cage: Optimizing the Stability of a Globular Miniprotein. Protein Eng., Des. Sel. 2008, 21, 171.
(68) Rovo, P.; Straner, P.; Lang, A.; Bartha, I.; Huszar, K.; Nyitray, L.; Perczel, A. Structural Insights into the Trp-Cage Folding Intermediate Formation. Chemistry 2013, 19, 2628.
(69) Lai, Z.; Preketes, N. K.; Mukamel, S.; Wang, J. Monitoring the Folding of Trp-Cage Peptide by Two-Dimensional Infrared (2DIR) Spectroscopy. J. Phys. Chem. B 2013, 117, 4661.
(70) Heyda, J.; Kozisek, M.; Bednarova, L.; Thompson, G.; Konvalinka, J.; Vondrasek, J.; Jungwirth, P. Urea and Guanidinium Induced Denaturation of a Trp-Cage Miniprotein. J. Phys. Chem. B 2011, 115, 8910.
(71) Meuzelaar, H.; Marino, K. A.; Huerta-Viga, A.; Panman, M. R.; Smeenk, L. E. J.; Kettelarij, A. J.; van Maarseveen, J. H.; Timmerman, P.; Bolhuis, P. G.; Woutersen, S. Folding Dynamics of the Trp-Cage Miniprotein: Evidence for a Native-Like Intermediate from Combined Time-Resolved Vibrational Spectroscopy and Molecular Dynamics Simulations. J. Phys. Chem. B 2013, 117, 11490.
(72) Fajer, M.; Hamelberg, D.; McCammon, J. A. Replica-Exchange Accelerated Molecular Dynamics (REXAMD) Applied to Thermodynamic Integration. J. Chem. Theor. Comput. 2008, 4, 1565.

The Journal of Physical Chemistry Letters Letter

dx.doi.org/10.1021/jz500179a | J. Phys. Chem. Lett. 2014, 5, 1217−1224