10
Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding Ryan Day and Valerie DaggettBiomolecular Structure and Design Program, Department of Medicinal Chemistry, University of Washington, Seattle, WA 98195-7610, USA Both folded and unfolded conformations should be observed for a protein at its melting temperature (T m ), where ΔG between these states is zero. In an all-atom molecular dynamics simulation of chymotrypsin inhibitor 2 (CI2) at its experimental T m , the protein rapidly loses its low-temperature native structure; it then unfolds before refolding to a stable, native-like conforma- tion. The initial unfolding follows the unfolding pathway described previously for higher-temperature simulations: the hydrophobic core is disrupted, the β-sheet pulls apart and the α-helix unravels. The unfolded state reached under these conditions maintains a kernel of structure in the form of a non-native hydrophobic cluster. Refolding simply reverses this path, the side-chain interactions shift, the helix refolds, and the native packing and hydrogen bonds are recovered. The end result of this refolding is not the initial crystal structure; it contains the proper topology and the majority of the native contacts, but the structure is expanded and the contacts are long. We believe this to be the native state at elevated temperature, and the change in volume and contact lengths is consistent with experimental studies of other native proteins at elevated temperature and the chemical denaturant equivalent of T m . © 2006 Elsevier Ltd. All rights reserved. *Corresponding author Keywords: protein folding/unfolding; microscopic reversibility; nonnative interactions; protein dynamics; protein refolding Introduction The unfolding pathway of chymotrypsin inhibitor 2 (CI2) has been characterized extensively by molecular dynamics (MD) simulations at a range of tempera- tures. The unfolding was first characterized by simulation at 498 K, 1,2 and these simulations were compared extensively with experimental data. 36 The good agreement with experiment achieved by these simulations indicated that the unfolding process in simulations at high temperature is similar to the expe- rimentally observed folding and unfolding processes at lower temperatures. More recent simulations as a function of temperature for CI2 and several other proteins demonstrate that the overall unfolding path- way is essentially independent of temperature, 715 and that the unfolding pathway observed in a small number of simulations is similar to the average unfolding pathway calculated from 100 independent simulations. 16 But, lowering the temperature does have a minor effect on the unfolding pathway: partially unfolded conformations take a longer time to completely expose their hydrophobic cores to solvent. During this time, elements of secondary structure can slide relative to one another in a move- ment that is rare at extremely high temperatures. 7,16 Our force field and simulation protocols yield temperature-dependent, or temperature-activated, conformational behavior consistent with the experi- ment melting temperature (T m ) for a variety of proteins, including CI2, 7 the WW domain, 14 α 3 D, 15 the engrailed homeodomain, 9,10 and, most recently and most systematically, the Trp cage. 17 In contrast, replica-exchange molecular dynamics (MD) simula- tions and the use of other force fields lead to overestimates of T m by 65125 K. 1720 So, given that we do obtain unfolding within the proper temperature regimes, we now explore the detailed interactions within CI2 in simulations at its T m . Present address: R. Day, Physics, Applied Physics and Astronomy, SC 1st Fl, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA. Abbreviations used: CI2, chymotrypsin inhibitor 2; MD, molecular dynamics; TS, transition state; RMSD, root-mean-square deviation; NOE, nuclear Overhauser effect. E-mail address of the corresponding author: [email protected] doi:10.1016/j.jmb.2006.11.043 J. Mol. Biol. (2007) 366, 677686 0022-2836/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.

Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Embed Size (px)

Citation preview

Page 1: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

doi:10.1016/j.jmb.2006.11.043 J. Mol. Biol. (2007) 366, 677–686

Direct Observation of Microscopic Reversibilityin Single-molecule Protein Folding

Ryan Day and Valerie Daggett⁎

Biomolecular Structure andDesign Program, Departmentof Medicinal Chemistry,University of Washington,Seattle, WA 98195-7610, USA

Present address: R. Day, Physics,Astronomy, SC 1st Fl, Rensselaer Pol8th Street, Troy, NY 12180, USA.Abbreviations used: CI2, chymotry

molecular dynamics; TS, transition sroot-mean-square deviation; NOE, neffect.E-mail address of the correspondi

[email protected]

0022-2836/$ - see front matter © 2006 E

Both folded and unfolded conformations should be observed for a protein atits melting temperature (Tm), where ΔG between these states is zero. In anall-atom molecular dynamics simulation of chymotrypsin inhibitor 2 (CI2)at its experimental Tm, the protein rapidly loses its low-temperature nativestructure; it then unfolds before refolding to a stable, native-like conforma-tion. The initial unfolding follows the unfolding pathway describedpreviously for higher-temperature simulations: the hydrophobic core isdisrupted, the β-sheet pulls apart and the α-helix unravels. The unfoldedstate reached under these conditions maintains a kernel of structure in theform of a non-native hydrophobic cluster. Refolding simply reverses thispath, the side-chain interactions shift, the helix refolds, and the nativepacking and hydrogen bonds are recovered. The end result of this refoldingis not the initial crystal structure; it contains the proper topology and themajority of the native contacts, but the structure is expanded and thecontacts are long. We believe this to be the native state at elevatedtemperature, and the change in volume and contact lengths is consistentwith experimental studies of other native proteins at elevated temperatureand the chemical denaturant equivalent of Tm.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: protein folding/unfolding; microscopic reversibility; nonnativeinteractions; protein dynamics; protein refolding

*Corresponding author

Introduction

The unfolding pathway of chymotrypsin inhibitor 2(CI2) has been characterized extensively bymoleculardynamics (MD) simulations at a range of tempera-tures. The unfolding was first characterized bysimulation at 498 K,1,2 and these simulations werecompared extensively with experimental data.3–6 Thegood agreement with experiment achieved by thesesimulations indicated that the unfolding process insimulations at high temperature is similar to the expe-rimentally observed folding and unfolding processesat lower temperatures. More recent simulations as afunction of temperature for CI2 and several other

Applied Physics andytechnic Institute, 110

psin inhibitor 2; MD,tate; RMSD,uclear Overhauser

ng author:

lsevier Ltd. All rights reserve

proteins demonstrate that the overall unfolding path-way is essentially independent of temperature,7–15

and that the unfolding pathway observed in a smallnumber of simulations is similar to the averageunfolding pathway calculated from 100 independentsimulations.16 But, lowering the temperature doeshave a minor effect on the unfolding pathway:partially unfolded conformations take a longer timeto completely expose their hydrophobic cores tosolvent. During this time, elements of secondarystructure can slide relative to one another in a move-ment that is rare at extremely high temperatures.7,16

Our force field and simulation protocols yieldtemperature-dependent, or temperature-activated,conformational behavior consistent with the experi-ment melting temperature (Tm) for a variety ofproteins, including CI2,7 the WW domain,14 α3D,15

the engrailed homeodomain,9,10 and, most recentlyand most systematically, the Trp cage.17 In contrast,replica-exchange molecular dynamics (MD) simula-tions and the use of other force fields lead tooverestimates of Tm by ∼65–125 K.17–20 So, giventhat we do obtain unfolding within the propertemperature regimes, we now explore the detailedinteractions within CI2 in simulations at its Tm.

d.

Page 2: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

678 Direct Observation of Microscopic Reversibility

It seems reasonable to assume that microscopicreversibility holds for protein folding and unfolding,such that the conformational changes observed inunfolding are reversed in the refolding process. Weuse the extended IUPAC definition of microscopicreversibility,21 which is a structural definition:

“The principle was widely applied to the analysesof reaction mechanisms, in particular of substitu-tion reactions. In the case of SN2 reactions attetrahedral centers implying a formation of thetrigonal bipyramid transition state (or intermedi-ate structure), the original formulation of theprinciple was extended in the following way: if amolecule or reactant enters a trigonal bipyramidat an apical position, this (or another) molecule orreactant must likewise leave the trigonal bipyr-amid from an apical position.”

This definition has been in use for decades in thefields of physical organic chemistry22,23 and enzy-mology.24 By considering the principle of micro-scopic reversibility structurally, we may apply it tothe protein folding-unfolding reaction. Conforma-tions, or ensembles of conformations such as thetransition state and any intermediate states, thathave relatively low free energies will be populated inboth the folding and unfolding directions. Thus, weconsider folding to be microscopically reversible, inthe sense that the conformations that are observed inthe unfolding process will be observed also in therefolding process. This structural view is the naturalway to evaluate an MD trajectory. With atomisticMD simulations of a solvated system, the statisticsare too poor to use the definition of microscopicreversibility based on detailed balance, whichrequires that the transitions between any two statestake place with equal frequency in both directions atequilibrium. Furthermore, we desire an in-depthview of the process, of the transitions per se andtherefore favor a structural approach.The qualitative and quantitative agreement bet-

ween our unfolding simulations and experimentsprobing refolding intermediates and the foldingtransition state supports our contention that unfold-ing and refolding are linked.25–27 But, the principle ofmicroscopic reversibility requires that the conditionsare the same for the forward and reverse processes,which is clearly not the case when comparing high-temperature unfoldingwith low-temperature refold-ing. We aim to test this issue more directly bysimulation of a protein at its Tm, where the ΔGbetween the native and denatured states is zero. Inthis way, we can track the unfolding and refolding ofa single-protein molecule under identical conditionsin a single continuous trajectory.The approach is distinct from annealing: the

temperature is not raised and lowered. Limitedrefolding has been observed with such annealingapproacheswith all-atommodels and explicit solventby simulating higher-temperature unfolded struc-tures under folding conditions. The first suchsimulations were performed on ubiquitin.28,29 Struc-

tures were taken from various points along a 498 Ktrajectory and simulated at 335 K. Structures takenfrom early in the unfolding trajectory collapsed andbecamemore native-like. Structures taken later alongthe unfolding pathway tended to undergo multiplecycles of unfolding and hydrophobic collapse, withsuccessive collapse events gaining more native-likestructure. Further, 335 K quench simulations of CI2have been performed.30 These simulations wereintended to probe the location of the transition state(TS) of folding/unfolding (a Pfold-like analysis).Simulations that were started from structures beforethe TS became more native-like, while those begin-ningwith structures after the TSdid not. In all of thesecases, however, refolding was limited to a shortportion of the reaction coordinate. A very longcontinuous folding simulation was performed onthe villin headpiece, in which the protein refoldedfrom a disrupted high-temperature (1000K) structureto a stable folding intermediate at 298 K.31 Whilesignificant, but incomplete, folding was observed inthis simulation, folding events were not compared tounfolding events, unfolding and refolding weremonitored under different conditions, and the struc-tural properties of the intermediate were not vali-dated experimentally. More recently, a replica-exchange method has been used to study proteinfolding and unfolding simultaneously.18 Thismethodallows sampling of the entire folding/unfoldingreaction coordinate at a range of temperatures.While this method may be useful for determiningthermodynamic properties of folding/unfolding, it isa fundamentally discontinuous method, in that nosingle replica folds and unfolds at a single tempera-ture and should not be used to compare microscopicevents in unfolding and refolding. So, while thesevarious studies have implicitly assumed that theprinciple of microscopic reversibility is valid, no oneof them has been able to monitor the forward andreverse processes directly under the same conditions.Here, we present a single-molecule simulation of

CI2 at 348 K, in which the protein unfolds to a highlydistorted structure and then refolds to a long-livednative-like conformation. This temperature is justbelow the experimental Tm of the protein (approxi-mately 353 K),32 so that both the native anddenatured states should be populated at equili-brium. This simulation allows for direct comparisonof the unfolding and refolding processes underidentical conditions. As the unfolding and refoldingprocesses take place under the same conditions, truedifferences between the unfolding and refoldingpathways can be separated from temperature orother environmental effects.

Results

The crystalline conformation of CI233 (N) is notstable in simulation at 348 K. Instead, it deformsrapidly, reaching an expanded, native-like confor-mation (N′) before unfolding to a structure that isnearly 9 Å Cα root-mean-square deviation (RMSD)

Page 3: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Figure 1. Cα RMSD of the MD structures during the348 K simulation relative to different reference states. Thereference structures are: crystal structure33 (blue), the200 ns 348 K N′ structure (red), and a partially unfolded398 K MD structure (green). The reference structures aregiven below the plot.

Figure 2. All versus all Cα RMSD matrices betweenstructures along the 348 K trajectory. (a) Cα RMSD matrixfor the full 200 ns. Structures are compared every 100 ps.(b) Cα RMSD matrix for the first 60 ns, blocked out inyellow in (a). Structures are compared every 20 ps. Thedotted lines indicate the times of representative structuresgiven in Figure 4. (c) As (b), but residues in the N and Ctermini (residues 1–12 and 53–64) are not included in theCα RMSD calculation. The color of a point gives the Cα

RMSD between the two time-points given on the X and Yaxes according to the scale at the right. Regions of black toblue along the diagonal indicate clusters of highly similarstructures. Actual conformational states were determinedusing three-dimensional projections with higher granu-larity of time-points.

679Direct Observation of Microscopic Reversibility

from the starting crystal structure. The protein thenrefolds to the N′ conformation (blue curve, Figure 1).This final N′ ensemble is relatively stable for theremainder of the simulation, although there aresome large excursions over time (for example, seethe peak at 170 ns, red curve, Figure 1). The averageCα RMSD between structures over the last 150 ns ofthe simulation is 2.6(±0.5) Å (see the large black andblue cluster in Figure 2(a)). This conformationalensemble satisfies 89% of the short-range (five orfewer residue sequence separation) nuclear Over-hauser effect (NOE) restraints from NMR experi-ments at 310 K.34 Only 47% of the long-range NOEsare satisfied. Nevertheless, the conformations overthe last 150 ns of simulation are topologically thesame as the crystal structure, and they have native,though loosened, packing (Figures 3). The averageVoronoi volume of these structures is 8670 Å3, whereasthe average volume in a 298 K simulation is8360 Å3, corresponding to a volume increase of4%. If the NOE-related cutoff is increased from5.5 Å to 8.5 Å, 96% of the short-range and 71% of thelong-range NOEs are satisfied, illustrating theexpansion but retention of the topology. Unfortu-nately, experimental NOEs for CI2 at, or near, its Tmhave not been determined.Many native or near-native interactions are pre-

sent in the final, refolded 200 ns structure, but thedistances tend to be longer (Figure 3). Interactionsthat define the native α-helix and those betweenstrands 1 and 2 of the sheet are present in therefolded structure, as are contacts between the Nand C termini. The active site loop has formednative-like contacts with strand 2 and the C terminusof the protein. Long-range interactions across the

Page 4: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Figure 3. Contact maps of the starting, maximally unfolded, and final structures. (a) Contacts in the starting (crystal)structure are plotted above the diagonal, and contacts in the maximally unfolded (25.6 ns) structure are plotted below thediagonal. (b) Contacts in the crystal structure and in the final structure (below diagonal). Ribbon diagrams of therespective structures are provided next to the appropriate contact map. A contact between two residues is defined by theshortest distance between heavy atoms: blue≤3.5 Å separation; green≤5.5 Å; and red≤7.5 Å.

680 Direct Observation of Microscopic Reversibility

protein core are present, although the distancesbetween these residues are greater. Additionally,Trp5 is buried, which is notable as it is thefluorescence probe used in experimental studies todistinguish the native and denatured states. In otherwords, when the fluorescence of this residue changesupon exposure to solvent, the protein is, by defini-tion, denatured (or unfolded). In any case, in thesimulations the protein passes through conforma-tions in which Trp5 is buried, then becomes exposedto solvent, and then is re-buried in the hydrophobiccore. In the maximally distorted structure, seen at25.6 ns, the termini are not in contact, and the activesite loop is no longer packed against the sheet.Additionally, the α-helix is unfolded and theremaining hydrophobic interactions are non-native.For these reasons, and the other structural issuesraised above, we denote the early and late structuresduring the simulation the native state at 348 K, or N′.

Unfolding/refolding pathway

The unfolding from N/N′ and refolding to N′proceeds through a series of minimally stableconformations before settling into the final, stablestate. Representative structures from the process of:

NYNVYTSYDYTSYNV

are show in Figure 4. The unfolding is similar to theearly stages of the unfolding pathway at hightemperature described previously.7 There is initiallysome movement in the termini and loops, as well asa loss of well-defined β-sheet structure. The corebecomes partially solvated as the N and C terminiseparate. The N-terminal end of the α-helix sepa-rates from the β-sheet and the helix unfolds. The C-terminal turn of the α-helix is generally preserved(i.e. it fluctuates but it is persistent) in the denaturedstate in the 348 K simulation, as is the case in thedenatured state as probed by both experiment andhigh-temperature MD.5 In addition, many hydro-

phobic contacts between the C terminus of the helixand residues within strands 1 and 2 from the β-sheetare present, although they are non-native (Figures 5and 6). By 25.6 ns, this kernel of non-native structureis essentially all that remains of the hydrophobiccore, and the Cα RMSD to the crystal structure is8.9 Å. The N and C termini are separated from oneanother, and the core of the protein and the activesite loop have become highly deformed. Further-more, Trp5 is exposed to solvent (Figure 5).The unfolding pathway (approximately 8–36 ns) is

very similar to the previously described unfoldingtrajectory at 398 K (Figure 4).7 The structures duringthis period of the 348 K simulation have Cα RMSDsof 3.6–6 Å to the unfolded structures in the 398 Ksimulation (see the green curve in Figure 1 andstructures in Figure 4). The unfolded structures inthe 348 K simulation are more similar to theunfolded structures at 398 K than they are to eitherN or N′ (compare the three curves in Figure 1).Unlike the behavior at 398 K, rather than continu-

ing to unfold to achieve complete solvation of thecore, the protein begins to refold at 348 K from thenon-native nucleus of structure formed between theC terminus of the α-helix and the N termini ofstrands 1 and 2 (Figures 4, 5 and 6). In the crystalstructure, Leu21 is on the surface of the α-helix andIle20 is packed against Ile29, Val31, Val47, and Leu49on strands 1 and 2 in the core of the protein. As theprotein unfolds and Trp5 is exposed to solvent, thehelix twists and unravels, bringing Leu21 into thecore. Leu21 makes contact with Ile29, Ile30, andLeu49, forming a nucleus for refolding (Figure 6).Refolding begins when contacts between strands 1and 2 recover their native register, bringing Leu21into contact with Val47 (Figure 5). The active siteloop then assumes a more native-like conformation.The α-helix reforms and twists back to its nativeconformation, bringing Ile20 back into contact withIle29, Val31, Val47, and Leu49 on strands 1 and 2.Some contacts between the N and C termini arereformed, but they do not recover their tight packing

Page 5: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Figure 4. Representative structures from the simulation. Structures are colored from blue to red according to theirrelative degree of native structure. The lowercase letters correspond to those used in Figure 2. Structures from a 398 Ksimulation are given as an unfolding reference.7 Structures that are in the same column are judged similar by Cα RMSDand position along the reaction coordinate.

681Direct Observation of Microscopic Reversibility

with the helix and strands 1 and 2. Nonetheless,Trp5 is buried in this process. The active site loopremains fairly distorted and does not adopt theprecise conformation of the crystal structure.Experimentally, the active site loop and the N-terminal tail comprise the most dynamic portions ofthe native protein.35,36 As would be expected, therefolding to N′ yields large deviations from the398 K denatured state (Figure 1).The Cα RMSD between conformational clusters

partially reflects these changes, though the changesare masked by the mobility in the termini and activesite loop (Figure 2). The time-point (structure) e at25.6 ns represents the maximally distorted proteinconformation observed at 348 K (Figures 2 and 4).Conformations in this cluster, represented bystructure f at 28 ns, are within 6 Å Cα RMSD ofstructures in the clusters preceding e, such as c and d(Figure 2). If the N and C termini are excluded fromthe calculation, the Cα RMSD drops below 4 Å formany pairs of structures. In the cluster from about35–45 ns, represented by structure g, the protein isrefolding, leading to an increase in the Cα RMSDbetween these structures and those in the unfoldingportion of the simulation (c and d, Figures 2 and 4).Furthermore, the TS ensemble (represented bystructure b) traversed during unfolding is verysimilar to that of refolding (represented by structureg) (Figures 2(c) and 4). Between 45 ns and 50 ns (h),the N′ state is recovered, as evidenced by the low

RMSD values between the structures before theunfolding transitions state and after the foldingtransition state (i.e. see the off-diagonal blue regionlinking the 3–5 ns and the 45–60 ns periods in Figure2(c) and structures in Figure 4).The drastic overall changes that the protein

undergoes can be seen clearly in Figure 7. Thestructures are all aligned at the α-helix, with residues16 and 20 pointing towards the right in every case.From the crystal structure to the 5 ns snapshot, theprotein expands, but the four highlighted residuesremain in contact. In contrast, the C-terminal part ofthe protein unravels and moves around to the otherside of the helix in the 25.6 ns structure. In the 200 nssnapshot the refolding is evident, the C-terminal halfof the protein has moved back to the other face of thehelix and the correct topology is recovered alongwith the packing of the hydrophobic residues in thecore. The change in the shape and surface of theprotein throughout this process is displayed in thebottom panel of Figure 7.

Discussion

At temperatures above 448 K, the unfolding of CI2is essentially a unidirectional process. The nativestructure expands rapidly, allowing the core to be-come completely solvated and leading to a statisticalcoil-like denatured state.1,5,7 At lower temperatures,

Page 6: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Figure 5. Hydrophobic core interactions in the unfolding and refolding pathways. The side-chains of residues Trp5,Ile20, Leu21, Ile29, Val31, Val47, and Leu49 are shown. Trp5 is colored red and Leu21 is orange. As the protein unfolds, theα-helix twists and Leu21 is buried in the core of the protein and the α-helix unfolds. In the most distorted structure, at25.6 ns, the protein is held together by a non-native cluster consisting of Leu21, Ile29, Ile30, and Leu49. The α-helix twistsback as it refolds, and Ile20 reforms its native contacts in the core.

682 Direct Observation of Microscopic Reversibility

373 K and 398 K, unfolding is less direct. Core con-tacts are lost in a more piecemeal fashion, allowingfor sliding of the helix along the sheet beforecomplete unfolding is achieved.7 These smaller-scalemovements require less solvent rearrangement,and are thus more favorable at lower temperaturesand higher solvent densities. The volume of thecavity required to separate one of the terminalstrands from the core is much smaller than thatrequired to expose the entire cavity to solvent. In thesimulation described here at the Tm of CI2, unfoldingproceeds until only a small non-native hydrophobiccluster remains, then the protein refolds to a stablenative-like state, which we denote N′.The initial unfolding process in this 348 K simula-

tion is similar to the unfolding observed at higher

Figure 6. Core interactions in the denatured state at 348 K.core interactions in the 25.6 ns structure. Side-chain atoms ofand all solvent molecules within 5 Å are displayed. (b) Space-fi

temperatures.7 At the beginning of the 348 Ksimulation there is movement in the termini andloops, as well as a loss of well-defined β-sheetstructure. The core becomes partially solvated as theN and C termini separate. The N-terminal end of theα-helix moves away from the β-sheet and the helixunfolds, aside from its C-terminal turn. The C-terminal turn of the α-helix is also preserved in thedenatured state, as probed by previous MD simula-tions and NMR experiments.5 The remaining hydro-phobic contacts between this segment of the helix(Leu21, which was originally on the surface of thehelix pointing away from the core in the native state)and strand residues (particularly Ile29, Ile30, andLeu49) are non-native, and this hydrophobic clusteris essentially all that remains of the protein's

(a) A stereoview of water molecules around the non-nativeresidues Leu21, Ile29, and Leu49 are rendered as spheres,lling view of the structure with solvating water molecules.

Page 7: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

Figure 7. Changes in structure, packing, topology, shape and surface properties as CI2 unfolds and refolds. (a) Thecrystal structure and MD snapshots are aligned at the α-helix and the same orientation is shown for each image withAla16 and Ile20 of the helix pointing toward the right. (b) The same structures are displayed as in (a) but the protein iscolored from the N terminus to the C terminus, from blue to red. (c) The same structures are shown in the same orientationas in (a) and (b). The rainbow coloring scheme from (b) is shown but in space-filling mode.

683Direct Observation of Microscopic Reversibility

hydrophobic core (Figures 4, 5 and 6). Furthermore,Trp5, the fluorescence probe of folding and unfold-ing, is exposed to the solvent. Consistent with ourfindings, the residues involved in the non-nativecluster have chemical shift deviations distinct fromrandom coil values, indicative of residual structurein the chemically denatured state,5 as well as infragments used to model the denatured state inwater.37

Whereas at high temperatures the unfoldingcontinues from this point with the complete solva-tion of the hydrophobic core, in this simulation theprotein does not unfold completely. Instead, itrefolds in a process that mirrors the initial unfolding.Refolding begins when contacts between strands 1and 2 recover their native register, bringing Leu21

into contact with Val47. The active site loop alsoassumes a more native-like conformation. The α-helix refolds and twists back to its native conforma-tion. Leu21 returns to its position on the surface ofthe protein, and Ile20 comes back into contact withIle29, Val31, Val47, and Leu49 on strands 1 and 2.Trp5 is re-buried in the process (Figure 5). Thus,structure coalesces around a nucleus of hydrophobicresidues in the reverse of the order that it is lost.Furthermore, the protein passes through the sametransition-state ensemble in both the unfolding andfolding processes. That is, the unfolding pathway is,structurally, microscopically reversible to give therefolding pathway.The order of structure loss in unfolding observed

here is similar to the order of structure loss observed

Page 8: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

684 Direct Observation of Microscopic Reversibility

in multiple simulations at 498 K (100 independentruns), where we observe a statistically favored meanunfolding pathway in an ensemble of similar possiblepathways.16 Similarly, the unfolding and refoldingpathways observed in this simulation are likely torepresent only one of a variety of pathways. It is, ofcourse, impossible to show from this single simula-tion that refolding is always the exact mirror ofunfolding, but the partial unfolding and refoldingobserved here do present a concrete model formicroscopic reversibility in proteins. Namely, theorder of structure formation from a given nucleus ofstructure in refolding will be similar to the order ofstructure loss in unfolding to that nucleus of structure.A complete description of the refolding processwould necessarily include all possible paths to allpossible nuclei, but the statistically favored pathwayobserved at 498 K suggests that a smaller set ofpathways will allow a reasonable description of thefolding and unfolding process.The refolding is not a complete mirror of the

unfolding, principally in that the crystal structuredetermined at low temperature,33 is not recovered,nor would we expect it to be. There are stillsignificant differences between N′ and the crystalstructure in the active site loop and the N and Ctermini. Core packing is not as tight in the final N′ensemble, but the residues in the native hydropho-bic core are buried and interacting, and, impor-tantly, Trp5 is buried. The N′ ensemble is very long-lived in the simulation.In crystallographic studies of the structural and

dynamic behavior of ribonuclease A and metmyo-globin, Petsko and co-workers found that increasingtemperature leads to a linear increase in the proteinvolume.38,39 The crystal structure of myoglobinincreases in volume by about 3% over a 120 Ktemperature range, and that of ribonuclease Aincreases by 0.4% per 100 K. The volume of oursimulated N′ state is 4% greater than the native stateat 298 K. This volume increase is probably reason-able, given the difference in the environments: theprotein volume probed experimentally is con-strained in the crystalline state. Regardless, thechange in volume affects the packing interactions;with longer interaction distances observed in highertemperature crystal structures of ribonuclease A.39

Experimentally, there is a shift in the native state tolooser packing and longer intermolecular interac-tions with increasing temperature, consistent withour findings. In addition, a distinct high-tempera-ture folded state has been proposed as an explana-tion for the unfolding behavior of CspA in lasertemperature-jump experiments.40 Similarly, Rhoa-des et al.41 observed a shift from the tightly packednative structure to a more loosely packed nativestate, N′, for folding events at the GndHClequivalent of Tm in single-molecule foldingexperiments. Thus, it is our contention that ourrefolded conformer (N′) represents the native stateat 348 K.In previous temperature-quenched simulations of

CI2 near its unfolding transition state,30 the extent of

core solvation was observed to be the majordeterminant of whether a conformation refoldedquickly. In pre-transition state conformations, watermolecules were not well bound in the protein coreand could be easily expelled to allow rapid refolding.In later structures, water molecules remained in thecore, deterring refolding. In the simulation presentedhere, the protein becomes highly distorted, achievingnearly complete core solvation. Although specificnative core interactions are lost, non-native interac-tions between hydrophobic residues prevent solventfrom totally penetrating between residues compris-ing the α-helix and strands 1 and 2 in the native state,and thus prevents complete unfolding. This non-native cluster facilitates rapid refolding precipitatedby a simple rearrangement of hydrophobic residues,allowing us to observe partial unfolding and refold-ing events, despite the fact that the timescale ofcomplete unfolding and refolding is much longerthan the simulation time. But, we note that this is arare process and we were fortunate to see it. Morerecently, we have been investigating this phenom-enon using an ultrafast folding and unfoldingprotein, the engrailed homeodomain. So far, inseparate simulations of this protein at its Tm(unpublished results) we have observed reversibleunfolding and refolding by the same structuralpathway, which supports our position that MDsimulations can display microscopic reversibility.

Conclusions

In a single, all-atom simulation of CI2 at 348 K, theprotein unfolds and refolds to a stable, native-likestructure. The protein unfolds through a highlydistorted state but unfolding is incomplete, with asmall kernel of non-native hydrophobic contactsremaining in the maximally distorted structure.Folding from this nucleus of non-native structureis the reverse of unfolding: the non-native contactsshift back to their native positions, the helix refolds,and the hydrophobic core is recovered. The sametransition state ensemble is traversed during bothunfolding and refolding. Thus, the order in whichstructure is reformed mirrors the order in which itwas lost under identical conditions, thereby satisfy-ing the principle of microscopic reversibility.

Methods

MD was performed using the program ENCAD.42 Thepotential energy function and the protocols for MD wereas described.43–45 The initial structure used in all simula-tions was the crystal structure solved at 1.7 Å resolution(1YPC).33 The simulation was carried out at neutral pH(Asp and Glu negatively charged, His uncharged, Lys andArg positively charged).The protocols for the MD simulation were as follows.

The starting structure was minimized for 1000 steps. Afterminimization, water molecules (1 g/ml) were added tosolvate the protein in a rectangular box extending at least8 Å in all directions resulting in 2596 water molecules. The

Page 9: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

685Direct Observation of Microscopic Reversibility

water density was set to the experimental value for 348 K,0.975 g/ml, by adjusting the volume of the box.46 Thesolvent water was then subjected to conjugate gradientminimization of 1000 cycles followed by 1000 steps of MD.The water was then minimized again for 1000 cycles.Finally, the protein was minimized for 1000 steps,followed by 1000 steps of minimization of the entireprotein–water system. After these preparatory steps, thesystem was heated to 348 K by initially assigning lowatomic velocities from a Maxwellian distribution. Atomswere allowed to move according to Newton's equations ofmotion, and the velocities of the atoms were adjustedintermittently until the system reached the desiredtemperature. At this point the microcanonical ensemblewas used (NVE), with the number of particles and volumeheld constant. Under such conditions energy should be,and was, conserved. An 8 Å force-shifted, non-bondedinteraction cut-off was used,42 and the non-bonded listwas updated every five cycles. The simulation wasperformed for 200 ns using a 2 fs integration time step,yielding 106 structures for analysis.For comparison with the experimental NOEs,34 r−6

weighted distances between the appropriate hydrogenpairs were averaged over the specified time period.Figures were rendered using PyMOL†.47

Acknowledgement

We are grateful for financial support from theNational Institutes of Health (R01 GM 50789).

References

1. Li, A. & Daggett, V. (1994). Characterization of thetransition state of protein unfolding using moleculardynamics: chymotrypsin inhibitor 2. Proc. Natl Acad.Sci. USA, 91, 10430–10434.

2. Li, A. & Daggett, V. (1996). Identification andcharacterization of the unfolding transition state ofchymotrypsin inhibitor 2 by molecular dynamicssimulations. J. Mol. Biol. 257, 412–429.

3. Daggett, V., Li, A., Itzhaki, L. S., Otzen, D. E. & Fersht,A. R. (1996). Structure of the transition state for foldingof a protein derived from experiment and simulation.J. Mol. Biol. 257, 430–440.

4. Ladurner, A. G., Itzhaki, L. S., Daggett, V. & Fersht,A. R. (1998). Synergy between simulation and experi-ment in describing the energy landscape of proteinfolding. Proc. Natl Acad. Sci. USA, 95, 8473–8478.

5. Kazmirski, S. L., Wong, K.-B., Freund, S. M. V., Tan,Y.-J., Fersht, A. R. & Daggett, V. (2001). Protein foldingfrom a highly disordered denatured state: the foldingpathway of chymotrypsin inhibitor 2 at atomicresolution. Proc. Natl Acad. Sci. USA, 98, 4349–4354.

6. Pan, Y. P. & Daggett, V. (2001). Direct comparison ofexperimental and calculated folding free energies forhydrophobic deletion mutants of chymotrypsin inhibi-tor 2: free energy perturbation calculations usingtransition and denatured states from moleculardynamics simulations of unfolding. Biochemistry, 40,2723–2731.

†http://pymol.sourceforge.net/

7. Day, R., Bennion, B. J., Ham, S. & Daggett, V. (2002).Increasing temperature accelerates protein unfoldingwithout changing the pathway of unfolding. J. Mol.Biol. 322, 189–203.

8. Day, R. & Daggett, V. (2005). Sensitivity of thefolding/unfolding transition state ensemble of chy-motrypsin inhibitor 2 to changes in temperature andsolvent. Protein Sci. 14, 1242–1252.

9. Mayor, U., Johnson, C. M., Daggett, V. & Fersht, A. R.(2000). Protein folding and unfolding in microsecondsto nanoseconds by experiment and simulation. Proc.Natl Acad. Sci. USA, 97, 13518–13522.

10. Mayor, U., Johnson, C. M., Grossmann, J. G., Sato, S.,Jas, G. S., Freund, S. M. V. et al. (2003). The completefolding pathway of a protein from nanoseconds tomicroseconds. Nature, 421, 863–867.

11. Gianni, S., Guydosh, N. R., Khan, F., Caldas, T. D.,Mayor, U., White, G. W. N. et al. (2003). Unifyingfeatures in protein-folding mechanisms. Proc. NatlAcad. Sci. USA, 100, 13286–13291.

12. White, G. W. N., Gianni, S., Grossman, J. G., Jemth, P.,Fersht, A. R., & Daggett, V. (2005). Simulation andexperiment conspire to reveal cryptic intermediatesand the slide from the nucleation-condensation toframework mechanism of folding. J. Mol. Biol., 350,757-775.

13. Ferguson, N., Day, R., Johnson, C. M., Allen, M. D.,Daggett, V. & Fersht, A. R. (2005). Simulation andexperiment at high temperatures: ultrafast folding ofa thermophilic protein by nucleation-condensation.J. Mol. Biol. 347, 855–870.

14. Ferguson, N., Pires, J. R., Toepert, F., Johnson, C. J.,Pan, Y. P., Volkmer-Engert, R. et al. (2001). Usingflexible loop mimetics to extend Φ-value analysis tosecondary structure interactions. Proc. Natl Acad. Sci.USA, 98, 13008–13013.

15. Zhu, Y., Alonso, D. O. V., Maki, K., Huang, C.-Y., Lahr,S. J., Daggett, V. et al. (2003). Ultrafast folding of α3D, ade novo designed three-helix bundle protein. Proc. NatlAcad. Sci. USA, 100, 15486–15491.

16. Day, R. & Daggett, V. (2005). Ensemble versus single-molecule protein unfolding. Proc. Natl Acad. Sci. USA,102, 13445–13450.

17. Beck, D. A. C., White, G. W. N., & Daggett, V. (2006).Exploring the energy landscape of protein folding usingreplica-exchange and conventionalmolecular dynamicssimulations. J. Struct. Biol. In press andavailable on-line.doi:10.1016/j.jsb.2006.10.002

18. Garcia, A. E. & Onuchic, J. N. (2003). Folding a proteinin a computer: an atomic description of the folding/unfolding of protein A. Proc. Natl Acad. Sci. USA, 100,13898–13903.

19. Pitera, J. W. & Swope, W. (2003). Understandingfolding and design: Replica-exchange simulations of“Trp-cage” fly miniproteins. Proc. Natl Acad. Sci. USA,100, 7587–7592.

20. Zhou, R. H. (2003). Trp-cage: folding free energylandscape in explicit water. Proc. Natl Acad. Sci. USA,100, 13280–13285.

21. Minkin, V. I. (1999). Commission on physical organicchemistry. Pure Appl. Chem. 71, 1919–1981.

22. Westheimer, F. H. (1968). Pseudo-rotation in the hy-drolysis of phosphate esters. Accts. Chem. Res. 1,70–78.

23. Lahti, D. W. & Espenson, J. H. (2001). Issues ofmicroscopic reversibility and an isomeric intermediatein ligand substitution reactions of five-coordinateoxorhenium(V) dithiolate complexes. J. Am. Chem.Soc. 123, 6014–6024.

Page 10: Direct Observation of Microscopic Reversibility in Single-molecule Protein Folding

686 Direct Observation of Microscopic Reversibility

24. Fersht, A. R. (1999). Structure and Mechanism in ProteinScience. W.H. Freeman and Company, New York.

25. Fersht, A. R. & Daggett, V. (2002). Protein folding andunfolding at atomic resolution. Cell, 108, 573–582.

26. Daggett, V. & Fersht, A. R. (2003). Is there a unifyingmechanism for protein folding? Trends Biochem. Sci.28, 18–25.

27. Daggett, V. & Fersht, A. R. (2003). The present view ofthe mechanism of protein folding. Nature Rev. Mol.Cell. Biol. 4, 497–502.

28. Alonso, D. O. V. & Daggett, V. (1995). Moleculardynamics simulations of protein unfolding andlimited refolding: characterization of partially unfol-ded states of ubiquitin in methanol and in pure water.J. Mol. Biol. 247, 501–520.

29. Alonso, D. O. V. & Daggett, V. (1998). Moleculardynamics simulations of hydrophobic collapse ofubiquitin. Protein Sci. 7, 860–874.

30. De Jong, D., Riley, R., Alonso, D. O. V. & Daggett, V.(2002). Probing the energy landscape of proteinfolding/unfolding transition states. J. Mol. Biol. 319,229–242.

31. Duan, Y. & Kollman, P. A. (1998). Pathways to aprotein folding intermediate observed in a 1-micro-second simulation in aqueous solution. Science, 282,740–744.

32. Jackson, S. E. & Fersht, A. R. (1991). Folding ofchymotrypsin inhibitor 2. 1. Evidence for a two-statetransition. Biochemistry, 30, 10428–10435.

33. Harpaz, Y., elMasry, N., Fersht, A. R. & Henrick, K.(1994). Direct observation of a better hydration at theN-terminus of an α-helix with glycine rather thanalanine as the N-cap residue. Proc. Natl Acad. Sci. USA,91, 3–15.

34. Ludvigsen, S., Shen, H., Kjaer, M., Madsen, J. C. &Poulsen, F. M. (1991). Refinement of the three-dimensional solution structure of barley serine pro-teinase inhibitor 2 and comparison with the structuresin crystals. J. Mol. Biol. 222, 621–635.

35. Shaw, G. L., Davis, B., Keeler, J. & Fersht, A. R. (1995).Backbone dynamics of chymotrypsin inhibitor 2:effect of breaking the active site bond and itsimplications for the mechanism of inhibition of serineproteases. Biochemistry, 34, 2225–2233.

36. Li, A. & Daggett, V. (1995). Investigation of thesolution structure of chymotrypsin inhibitor 2 using

molecular dynamics: comparison to X-ray and NMRdata. Protein Eng. 8, 1117–1128.

37. de Prat Gay, G., Ruiz-Sanz, J., Davis, B. & Fersht, A. R.(1994). The structure of the transition state for theassociation of two fragments of the barley chymo-trypsin inhibitor 2 to generate native-like protein:implications for mechanisms of protein folding. Proc.Natl Acad. Sci. USA, 91, 10943–10946.

38. Frauenfelder, H., Hartmann, H., Karplus, M., Kuntz,I. D., Kuriyan, J., Parak, F. et al. (1987). Thermalexpansion of a protein. Biochemistry, 26, 254–261.

39. Tilton, R. F., Dewan, J. C. & Petsko, G. A. (1992).Effects of temperature on protein structure anddynamics: X-ray crystallographic studies of the pro-tein ribonuclease-A at nine different temperaturesfrom 98 to 320 K. Biochemistry, 31, 2469–2481.

40. Leeson, D. T., Gai, F., Rodriguez, H. M., Gregoret,L. M. & Dyer, R. B. (2000). Protein folding andunfolding on a complex energy landscape. Proc. NatlAcad. Sci. USA, 97, 2527–2532.

41. Rhoades, E., Gussakovsky, E. & Haran, G. (2003).Watching proteins fold one molecule at a time. Proc.Natl Acad. Sci. USA, 100, 3197–3202.

42. Levitt, M. (1990). ENCAD-Energy Calculations andDynamics, Yeda, Rehovat, Israel and Stanford Uni-versity, Stanford, CA.

43. Levitt, M., Hirshberg, M., Sharon, R. & Daggett, V.(1995). Potential energy function and parameters forsimulations of the molecular dynamics of proteins andnucleic acids in solution. Comput. Phys. Commun. 91,215–231.

44. Levitt, M., Hirshberg, M., Sharon, R., Laidig, K. E. &Daggett, V. (1997). Calibration and testing of a watermodel for simulation of the molecular dynamics ofproteins and nucleic acids in solution. J. Phys. Chem. B,101, 5051–5061.

45. Beck, D. A. C., Armen, R. S., & Daggett, V. (2005).Cutoff size need not strongly influence moleculardynamics results on solvated polypeptides. Biochem-istry, 44, 609–616.

46. Haar, L., Gallagher, J. S. & Kell, G. S. (1984).NBS/NRCSteamTables:Thermodynamic andTransportProperties andComputerPrograms forVapor andLiquid StatesOfWater inSI units.Hemisphere Pub. Corp., Washington, DC.

47. DeLano, W. L. (2002). The PyMOL molecular graphicssystem. DeLano Scientific, San Carlos, CA.

Edited by C. R. Matthews

(Received 28 June 2006; received in revised form 1 November 2006; accepted 9 November 2006)Available online 15 November 2006