
Dynamics of the Hck-SH3 domain: Comparison of experiment with multiple molecular dynamics simulations


DAVID A. HORITA,1,3 WEIXING ZHANG,2,4 THOMAS E. SMITHGALL,2,5 WILLIAM H. GMEINER,2 and R. ANDREW BYRD1,3

1 ABL-Basic Research Program, National Cancer Institute–Frederick Cancer Research and Development Center, Frederick, Maryland 21702-1201

2 Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, Nebraska 68198-6805

(Received June 15, 1999; Final Revision October 24, 1999; Accepted November 5, 1999)

Abstract

Molecular dynamics calculations provide a method by which the dynamic properties of molecules can be explored over timescales and at a level of detail that cannot be obtained experimentally from NMR or X-ray analyses. Recent work (Philippopoulos M, Mandel AM, Palmer AG III, Lim C, 1997, Proteins 28:481–493) has indicated that the accuracy of these simulations is high, as measured by the correspondence of parameters extracted from these calculations to those determined through experimental means. Here, we investigate the dynamic behavior of the Src homology 3 (SH3) domain of hematopoietic cell kinase (Hck) via 15N backbone relaxation NMR studies and a set of four independent 4 ns solvated molecular dynamics calculations. We also find that molecular dynamics simulations accurately reproduce fast motion dynamics as estimated from generalized order parameter ($S^2$) analysis for regions of the protein that have experimentally well-defined coordinates (i.e., stable secondary structural elements). However, for regions where the coordinates are not well defined, as indicated by high local root-mean-square deviations among NMR-determined structural family members or high B-factors/low electron density in X-ray crystallography determined structures, the parameters calculated from a short to moderate length (less than 5–10 ns) molecular dynamics trajectory are dependent on the particular coordinates chosen as a starting point for the simulation.

Keywords: 15N relaxation; conformational sampling; molecular dynamics simulations; order parameters; SH3 domain

Flexibility and dynamics are integral aspects of protein function (Weber, 1975; Karplus & McCammon, 1983). Folding of major structural elements can contribute a large $\Delta S$ from burial of hydrophobic surface area and, hence, drive intermolecular binding (Spolar & Record, 1994). Conformational flexibility at a ligand-binding surface can allow a protein to recognize members of a class of ligands with various degrees of affinity and may be a mechanism by which a single molecule can have pluripotent effects. Changes in conformational flexibility induced by ligand binding may play a role in allostery (Cooper & Dryden, 1984). There is a wide range in both the type and timescale of motions that typically occur in proteins, from fast local backbone and side-chain reorientations (lifetimes of picoseconds) to segmental movement of secondary or tertiary structural elements (lifetimes of microseconds to seconds) to slow folding/unfolding equilibria (lifetimes of hours or more).

It is frequently possible to infer which regions of a protein are conformationally flexible in structures solved either by X-ray crystallography (e.g., high B-factors or missing electron density) or NMR methods (high local RMSDs among structural family members). Solution-state NMR also provides, through relaxation measurements (Palmer, 1997), for explicit determination that local dynamical processes are present. Nonetheless, determination of the specific motional pathways of flexible regions of proteins is currently beyond the capabilities of either the X-ray or NMR structural determination method. To address this latter issue, computational molecular dynamics (MD) methods can be used (Chandrasekhar et al., 1992; Smith et al., 1995; Chatfield et al., 1998). Such methodology is, with rare exceptions (Duan & Kollman, 1998), currently restricted to the study of motions that occur with high ($> 10^9$ Hz) frequency. A significant issue is the validation of experimental and computational dynamics studies. Recent work (Philippopoulos et al., 1997) indicates that there can be good agreement. Our study extends the comparison with an examination of a small protein domain that has flexible regions as an important component of its mechanism of action.

Reprint requests to: R. Andrew Byrd, NCI-FCRDC, Bldg. 538, Box B, Frederick, Maryland 21702-1201; e-mail: [email protected].

3 Present address: Structural Biophysics Laboratory, Program in Structural Biology, Bldg. 538, POB B, National Cancer Institute–Frederick Cancer Research and Development Center, Frederick, Maryland 21702-1201.

4 Present address: Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee 38105.

5 Present address: Department of Molecular Genetics and Biochemistry, University of Pittsburgh School of Medicine, E1240 Biomedical Science Tower, Pittsburgh, Pennsylvania 15261.

Abbreviations: Hck, hematopoietic cell kinase; NOE, nuclear Overhauser effect; MD, molecular dynamics; RMSD, root-mean-square deviation; SH3, Src homology 3.

Protein Science (2000), 9:95–103. Cambridge University Press. Printed in the USA. Copyright © 2000 The Protein Society

The canonical Src homology 3 (SH3) domain spans roughly 65 residues and contains two orthogonal β-sheets, each of which consists of three antiparallel β-strands with one shared strand. The five strands are connected to each other by two large loops (traditionally labeled as the RT and n-Src loops), a turn, and a $3_{10}$ helix (Cohen et al., 1995). Both the n-Src and RT loops have been implicated in directing SH3-ligand specificity and have frequently been observed to be flexible or conformationally disordered. Previously, we reported the solution structure of the Hck–SH3 domain (Horita et al., 1998). In that study, we also found conformational disorder of the n-Src and RT loops and observed an interaction between the n-Src loop and a peptide ligand.

NMR-solved structures are typically presented as a family of between roughly 20 and 50 conformers, all of which are consistent with experimentally derived constraints, and there is often no indication as to whether a particular conformer is to be preferred over another. Hence, choosing a starting conformation for MD simulations is not an unambiguous procedure. For regions of stable secondary structural elements, the atomic RMSD between family members is small [Hck–SH3 strand and helix backbone (all heavy) atoms, RMSD for 25 structures = 0.3 (0.8) Å], but for loop regions, such as the n-Src and RT loops, the RMSD is higher [Hck–SH3 loop and turn backbone (all heavy) atoms, RMSD for 25 structures = 0.7 (1.6) Å] (see below). Because we were particularly interested in the dynamic properties of the loop regions, we wanted to investigate the effect of starting conformation on calculated dynamic behavior.

Here, we report the results of experimental measurements of backbone dynamics for Hck–SH3 as determined through 15N relaxation measurements and compare them with the results of four independent 4 ns solvated MD simulations. These dynamics trajectories differed from each other in the coordinate sets used for the starting conformations. All other computational and methodological parameters were identical. Three of these starting conformations were chosen from the 25 low-energy conformers determined previously in our laboratory, while the fourth conformation was obtained from the crystal coordinates of Hck. Hence, all starting conformations are energetically reasonable. We have also calculated two 1 ns simulations using the same coordinates and conditions as used in the first simulation, but with different random number seeds. For each trajectory, we have calculated time correlation functions and extracted generalized order parameters (Lipari & Szabo, 1982a; Clore et al., 1990), which we have compared to experimentally determined order parameters.

Results and discussion

Determination of motional models and parameters from relaxation data

The canonical SH3 domain has a low degree of rotational asymmetry, and the principal axes of inertia for Hck–SH3$_{81-136}$, which spans the first and last β-strands, have a ratio of 1.0:1.0:0.9. However, the particular construct that we have used for both structural and dynamic studies, Hck–SH3$_{72-143}$, has several unstructured residues at the N- and C-termini. Because the formalism for axially symmetric rotation (Lipari & Szabo, 1982a, 1982b) is functionally equivalent to the extended formalism for isotropic rotation (Clore et al., 1990), it is important to identify the impact of rotational anisotropy on measured relaxation data (Tjandra et al., 1995). Calculation of 40 structures of Hck–SH3$_{72-143}$ yields a family of structures with principal axes ratios typically 0.6:0.8:1.0 (range 0.5–0.6:0.7–1.0:1.0, depending on conformation of N- and C-termini). Because the N- and C-terminal residues are conformationally mobile, as elucidated from 15N relaxation data, it was not immediately apparent whether the isotropic or axially symmetric rotational model was appropriate. Analysis of $R_2/R_1$ ratios for those residues with 15N heteronuclear NOE values $> 0.6$, however, showed a statistically significant ($F_{3,37} = 8.96$, $p < 0.00015$) correlation with the N–H bond angle from the principal axis, $\theta$ (Bevington & Robinson, 1992; Tjandra et al., 1995). Hence, the axially symmetric rotational model is appropriate for Hck–SH3$_{72-143}$. Figure 1 shows a plot of the $R_2/R_1$ ratio vs. $\theta$.

Fig. 1. Experimentally determined values (at 11.7 T) of $R_2/R_1$ vs. $\theta$ (deg) for Hck–SH3. Residues with a substantial contribution from $R_{ex}$ or residues with 1H–15N NOE $< 0.65$ are excluded from this plot. The solid line represents the calculated value of $R_2/R_1$ vs. $\theta$ assuming $\tau_m = 5.7$ ns and $D_\parallel/D_\perp = 1.34$.

15N $R_1$, $R_2$, and NOE relaxation data were used to fit five motional models (1, $S^2$; 2, $S^2$ and $\tau'_s$; 3, $S^2$ and $R_{ex}$; 4, $S^2$, $\tau'_s$, and $R_{ex}$; and 5, $S_f^2$, $S_s^2$, and $\tau'_s$) based on the extended formalism described by Clore et al. (1990) using the program Modelfree v.4 and the axially symmetric rotational model. This model assumes three correlation times: the overall molecular correlation time $\tau_m$, the effective correlation time for fast motions $\tau_f$ (less than several hundred ps), and the effective correlation time for slow motions $\tau_s$ ($\tau_f < \tau_s < \tau_m$). The time constant for slow motions $\tau'_s = \tau_s\tau_m/(\tau_s + \tau_m)$ is associated with $S^2$ (models 2 and 4) or $S_s^2$ (model 5). The time constant for fast motions $\tau'_f = \tau_f\tau_m/(\tau_f + \tau_m)$, associated with $S^2$ in models 1 and 3 and with $S_f^2$ in model 5, is considered to be sufficiently short that the term $\exp(-t/\tau_f)$ approaches zero. The rotational correlation time $\tau_m$ was determined to be 5.7 ns and $D_\parallel/D_\perp$ to be 1.3. Model selection followed the outline described by Mandel et al. (1995). The results are shown in Figure 2 and confirm the mobility of the three loops common to SH3 domains. In our previous study, both the central residues of the RT loop (91–97) and the n-Src loop (110–113) were observed to be disordered in the family of structures determined from NMR data, whereas the distal loop (121–124) was well constrained by 1H NOE data (Fig. 3 in Horita et al., 1998). The absence of several 1H–15N resonances in the RT loop and the low order parameters for the n-Src ($S^2 = 0.42$) and distal ($S^2 = 0.72$) loops indicate that all three loops undergo substantial motion relative to the fixed molecular axes system. Hence, it appears that the NOE data overconstrain the conformation of the distal loop of Hck–SH3.
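The three-timescale model described above can be made concrete with a short numerical sketch. The code below evaluates the extended model-free spectral density in its isotropic-tumbling form, which is a simplifying assumption: the analysis in the text used the axially symmetric rotational model. The function names are hypothetical, the 2/5 prefactor follows the common normalization, and every numerical value except $\tau_m = 5.7$ ns is illustrative rather than taken from the paper.

```python
import math


def effective_time(tau: float, tau_m: float) -> float:
    """Effective time constant tau' = tau * tau_m / (tau + tau_m)."""
    return tau * tau_m / (tau + tau_m)


def lorentzian(omega: float, tau: float) -> float:
    """Single Lorentzian term tau / (1 + (omega * tau)^2)."""
    return tau / (1.0 + (omega * tau) ** 2)


def spectral_density(omega, tau_m, s2_fast, s2_slow, tau_f, tau_s):
    """Extended model-free J(omega), assuming isotropic overall tumbling."""
    s2 = s2_fast * s2_slow  # S^2 = Sf^2 * Ss^2
    tau_f_eff = effective_time(tau_f, tau_m)
    tau_s_eff = effective_time(tau_s, tau_m)
    return 0.4 * (
        s2 * lorentzian(omega, tau_m)
        + (1.0 - s2_fast) * lorentzian(omega, tau_f_eff)
        + (s2_fast - s2) * lorentzian(omega, tau_s_eff)
    )


# tau_m = 5.7 ns comes from the text; all other inputs are made-up examples.
omega_n = 2.0 * math.pi * 50.7e6  # approximate 15N Larmor frequency at 11.7 T, rad/s
j_rigid = spectral_density(omega_n, 5.7e-9, 1.0, 1.0, 20e-12, 1e-9)
j_loop = spectral_density(omega_n, 5.7e-9, 0.8, 0.6, 20e-12, 1e-9)
print(j_rigid, j_loop)
```

Setting both order parameters to 1 recovers the rigid-rotor limit, in which only the $\tau_m$ term survives; models 1 through 4 correspond to fixing $S_s^2 = 1$ or dropping the slow term.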

Consistency of MD simulations

Dynamics parameters, such as order parameters and correlation times, can be extracted from MD trajectories with significantly higher precision than they can be determined from experimental relaxation measurements. However, it is difficult to assess the accuracy of these MD-extracted parameters, and the collection of MD simulations published to date indicates that the accuracy is somewhat lower than the precision. Consequently, the detection of small dynamical differences among residues is difficult. In particular, it is difficult to determine whether any given computed time correlation is best described by one or two exponential decays and order parameters (i.e., the original Lipari–Szabo model vs. the extended Clore model). Figure 3A–H shows plots of the dot products, $\mu(t)\cdot\mu(t)$, and the time correlation functions $C(t)$ for residues 105 (strand βb) and 121 (distal loop) as calculated from simulations MD2 and MD3.
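As a rough illustration of how such quantities are obtained from a trajectory, the sketch below computes a second-rank time correlation function, $C(t) = \langle P_2(\mu(0)\cdot\mu(t))\rangle$, from a list of NH unit vectors, together with the equilibrium-average estimate of $S^2$. Both expressions are standard textbook forms and are assumptions here: the source does not state which estimator the authors used, and the function names are hypothetical. The vectors are assumed to be unit length, equally spaced in time, and taken after overall rotation has been removed.

```python
def p2(x: float) -> float:
    """Second Legendre polynomial P2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


def time_correlation(vectors, max_lag):
    """C(lag) = <P2(mu(i) . mu(i + lag))>, averaged over all time origins i."""
    n = len(vectors)
    result = []
    for lag in range(max_lag + 1):
        total = 0.0
        count = n - lag
        for i in range(count):
            a, b = vectors[i], vectors[i + lag]
            total += p2(a[0] * b[0] + a[1] * b[1] + a[2] * b[2])
        result.append(total / count)
    return result


def order_parameter(vectors):
    """Equilibrium estimate S^2 = 1.5 * sum_ij <mu_i mu_j>^2 - 0.5."""
    n = len(vectors)
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(v[i] * v[j] for v in vectors) / n
            total += mean_ij ** 2
    return 1.5 * total - 0.5


# A frozen bond vector gives C(t) = 1 at every lag and S^2 = 1.
rigid = [(0.0, 0.0, 1.0)] * 50
print(time_correlation(rigid, 3), order_parameter(rigid))
```

For a well-sampled trajectory the plateau of $C(t)$ and the equilibrium estimate should agree; a mismatch is one practical sign of the rare transitions discussed in this section.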

Although an F-test can, in principle, determine whether a time correlation function is best described by a one or two exponential decay, rigorous analysis of time correlation functions is complicated by the dependence of $\chi^2$ on the error ascribed to $C(t)$, the time over which $C(t)$ is computed, and the values of $\tau_f$ and $\tau_s$. Therefore, we have opted for a qualitative analysis. Because $S^2 = S_f^2 \cdot S_s^2$, the ratio $(S^2/S_f^2)$ ranges from 0 to 1. The relative difference between $S^2$ and $S_f^2$ can be readily visualized by examination of the term $[1 - (S^2/S_f^2)]$. Lower values indicate a mono-exponential characteristic of the time correlation function, whereas higher values are evidence for a bi-exponential characteristic. Figure 4 shows a plot of $[1 - (S^2/S_f^2)]$ vs. residue number for each of the trajectories MD1–MD4. The RT, n-Src, and distal loops show the highest evidence for motion on multiple timescales. Of greater significance, however, are substantial differences across the trajectories for particular residues, which implies that the simulations are sampling different dynamical behavior. For instance, the time correlation function for residues in the distal loop is noticeably bi-exponential in MD2 but not MD3 (Fig. 3G,H), and the space sampled by the NH vectors is substantially different between simulations (Fig. 3K,L). Similarly, the bi-exponential character of time correlation functions for residues in the n-Src and RT loops varies markedly across the simulations (Fig. 4).
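The qualitative index used here is simple to compute once $S^2$ and $S_f^2$ are in hand. The sketch below is a minimal illustration with a hypothetical function name and made-up input values; it relies only on the relation $S^2 = S_f^2 \cdot S_s^2$ stated in the text, from which the index equals $1 - S_s^2$.

```python
def biexponential_index(s2: float, s2_fast: float) -> float:
    """Return 1 - S^2/Sf^2; values near 0 suggest a mono-exponential decay."""
    if not 0.0 < s2_fast <= 1.0:
        raise ValueError("Sf^2 must lie in (0, 1]")
    return 1.0 - s2 / s2_fast


# Made-up (S^2, Sf^2) pairs: a strand-like residue and a loop-like residue.
examples = {"strand-like": (0.85, 0.85), "loop-like": (0.40, 0.80)}
for label, (s2, s2_fast) in examples.items():
    print(label, biexponential_index(s2, s2_fast))
```

Because the index is a ratio, it is insensitive to the absolute amplitude of the fast motion, which is why it isolates the contribution of the slower decay.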

Because the occurrence of rare transitions in the MD trajectory can lower the calculated value of $S^2$, the time correlation function should be computed from a section of the trajectory that is free of such transitions (Philippopoulos & Lim, 1995). This can be justified by considering that the time correlation function should be well sampled over the length of the calculation. Transitions that occur only several times in a 4 ns simulation, consequently, should not be considered in calculation of a 500 ps correlation function. Excluding residues in the N- and C-termini, 12 residues in MD1, 11 residues in MD2, 3 residues in MD3, and 5 residues in MD4 required calculation of the time correlation function from a truncated trajectory from which rare transitions were excluded. Most residues that exhibited rare transitions clustered in the RT, n-Src, and distal loops. Only S111 and G112 showed rare transitions in each of the four trajectories. K101 and G102, which are located in the turn between the RT loop and strand βb, exhibited a conformational transition in MD2 only. Consequently, although the presence of rare transitions for a particular residue may be indicative of slower motions (lifetimes in the nanosecond range), the absence of rare transitions is not indicative of an absence of slower motions. These differences between dynamics trajectories further highlight the sensitivity to the starting conformation of dynamic behavior throughout a trajectory.

Fig. 2. $S^2$, $\tau'_s$, and $R_{ex}$ determined for Hck–SH3. Values of 0 for $\tau'_s$ and $R_{ex}$ indicate that these parameters were not used in the relaxation model for that particular residue. Missing values for $S^2$, $\tau'_s$, or $R_{ex}$ indicate that relaxation data could not be obtained for that particular residue.

Figure 5 shows plots of $\Delta S^2 = S^2 - \overline{S^2}$ as a function of residue for each MD simulation. Figure 5A–D shows residues that are in stable secondary structural elements: specifically, β-strands, the $3_{10}$ helix, and residues (85–90, 98–101) in the RT loop that do not typically exhibit flexibility or disorder in the numerous SH3 domain structures solved to date. Overall, $\Delta S^2$ is small ($\sigma = 0.023$, calculated over all simulations), indicating a high degree of agreement, across simulations, of the order parameter calculated for each residue. In contrast, Figure 5E–H shows $\Delta S^2$ calculated for the loop regions of Hck–SH3. The substantially larger deviation of $\Delta S^2$ ($\sigma = 0.072$) for many residues indicates that the dynamical behavior is not consistently described. Figure 6 shows histograms plotted for each of the corresponding subgraphs of Figure 5. Regions where the structure is well defined show narrow, strongly peaked distributions (Fig. 6A–D), while regions where the structure is not well defined show broad distributions (Fig. 6E–H). Residues 82–90, 98–108, 116–119, and 124–135 have an aggregate backbone RMSD (determined from the family of 25 structures in 4HCK) of 0.3 Å and a range of $\sigma$ (over simulations MD1–MD4) for $\Delta S^2$ of 0.019–0.025. In contrast, residues 91–97, 109–115, and 120–124 have a backbone RMSD of 0.7 Å and a range of $\sigma$ for $\Delta S^2$ of 0.047–0.074.
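The consistency measure used in this comparison, the deviation of each simulation's order parameter from the by-residue mean over simulations, can be sketched as follows. The function names and the input numbers are hypothetical, and the use of the population standard deviation is an assumption: the source does not say whether a population or sample estimator was used.

```python
from statistics import mean, pstdev


def delta_s2(s2_by_sim):
    """Per-simulation deviation of S^2 from the by-residue mean over simulations."""
    sims = list(s2_by_sim)
    n_res = len(s2_by_sim[sims[0]])
    residue_means = [mean([s2_by_sim[s][i] for s in sims]) for i in range(n_res)]
    return {s: [s2_by_sim[s][i] - residue_means[i] for i in range(n_res)] for s in sims}


def pooled_sigma(deltas):
    """Standard deviation of the deviations pooled over all simulations."""
    return pstdev([d for values in deltas.values() for d in values])


# Made-up S^2 values for three residues in two simulations.
example = {"MD1": [0.86, 0.84, 0.60], "MD2": [0.85, 0.86, 0.75]}
deltas = delta_s2(example)
print(deltas, pooled_sigma(deltas))
```

Computing the pooled value separately for structured and loop residues gives the two kinds of $\sigma$ contrasted in the text.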

To test the effects of random number seed on calculated order parameter, we calculated two additional 1 ns simulations using the same initial conditions and coordinates as MD1, but with different random number seeds. Several residues in the loop regions did not reach equilibrium within a sufficient time to allow calculation of a time correlation function. Excluding these residues from consideration, the order parameters calculated from these two simulations were generally similar to each other. Analyzed over residues 82–93, 95–108, 116–119, and 123–135, the standard deviation, $\sigma$, of the by-residue difference in calculated order parameter was 0.026. Analogous values of $\sigma$ for MD1 vs. MD2, MD1 vs. MD3, and MD2 vs. MD3 were 0.051, 0.067, and 0.058, respectively. Use of different random number seeds does result in substantial differences in dynamical behavior for residues in, and proximal to, the n-Src loop (109–115). In one of the 1 ns simulations, the n-Src loop does not reach equilibrium during the course of the simulation, and several residues undergo dihedral angle flips. In the other 1 ns simulation, the n-Src loop is relatively quiescent. Although use of different random number seeds will have an effect on the calculated order parameter, in the case of Hck–SH3, this effect is smaller than that observed from use of different starting coordinates.

Using configuration space analysis of MD simulations of myoglobin, Andrews et al. (1998) observed that differences between structural clusters were primarily localized to loop regions connecting myoglobin’s helices. These authors also noted that even minuscule differences in starting coordinates caused trajectories to sample different coordinate subspaces. Taken in concert with our results, these data imply that small differences in starting conditions can lead to different sampling of the transient local conformations of loop regions and concomitant significant differences in the extracted rapid timescale dynamical parameters from these regions.

Our results thus contrast somewhat with those of Philippopoulos et al. (1997), who found the results of MD simulations sufficiently accurate and precise to be of use in distinguishing between experimental relaxation models in an analysis of Escherichia coli RNase HI. We find that the differences among our MD simulations make this impossible with Hck–SH3. It should be noted that the two simulations of RNase HI analyzed by Philippopoulos et al. (1997) differed from each other in computational details, including force fields, solvation methods, and treatment of nonbonded interactions, but utilized the same (1.48 Å resolution) starting structure. In contrast, the four simulations reported here utilized the same computational protocols and force fields, but different starting structures. Hck–SH3 is also a more flexible molecule than RNase HI. Of the 155 residues in RNase HI, only the two C-terminal residues contain backbone amide nitrogen atoms with B-factors greater than 40 Å$^2$. In contrast, our NMR-determined structure family contains 20 residues (excluding N- and C-termini) with backbone RMSD greater than 0.4 Å. Furthermore, the six molecules in the asymmetric unit of the X-ray–determined structure of Hck–SH3 (Arold et al., 1998; Protein Data Bank (PDB) accession code 1BU1) have an average of 13 (range of 2–34) backbone amide nitrogen atoms with B-factors greater than 40 Å$^2$, nearly all of which cluster in the RT, n-Src, and distal loops. Under such conditions, namely flexible regions of a protein where there is no unique set of starting coordinates, the choice of coordinates one uses as a starting point for a short to moderate (less than ~10 ns) MD simulation will apparently have an impact on the values of the dynamics parameters one extracts from the simulation.

Fig. 3. (A–D) Dot product, (E–H) time correlation function, and (I–L) two-dimensional projection of the NH bond vector for residues 105 and 121 in simulations MD2 and MD3, sampled at 20 ps intervals over the last 3 ns of each simulation. NH bond vector projections are presented such that the amide N atom is at the origin with the mean NH vector pointing directly out of the page and the XY projection of the mean N–Cα vector pointing to the right.

Comparison of NMR- and simulation-derived dynamics data

A comparison of $S^2$ determined from experiment and from simulations MD1–MD4 is shown in Figure 7. Regions of stable secondary structure are accurately and consistently described by the simulations. Regions that are not well defined, however, show a high degree of variation. Hence, the four simulations do not yield the same values for order parameters after 4 ns of calculation (Fig. 7B–E). This is especially evident with the RT, n-Src, and distal loops of Hck–SH3. These calculations indicate that the RT loop undergoes substantial motion on the nanosecond-to-picosecond timescale. The absence, presumably because of exchange broadening, of observable 1H–15N resonances for residues I92, H93, and H94 further indicates motion on the millisecond-to-microsecond timescale. Figure 7F shows a comparison of the experimentally derived order parameters with the means of the MD-derived order parameters. Analysis of multiple MD simulations can yield calculated values of $S^2$ that are in closer agreement with experimentally determined values, especially for ordered regions of a protein. An example is found in the behavior of residues W114 and W115. Although W114 shows little mobility in MD1, MD2, or MD4, it shows a considerable degree of mobility in MD3. This behavior is evident in the calculated order parameters for W114 (0.78) and W115 (0.62) in MD3 and can be readily visualized from direct analysis of the coordinates (Fig. 8). These order parameters differ substantially from the calculated order parameters from MD1, MD2, and MD4 for W114 (0.86, 0.85, and 0.86) and W115 (0.87, 0.83, and 0.87). Averaging $S^2$ over the four simulations brings the mean values closer to the experimentally measured values for W114 (mean, 0.84; experimental, 0.85) and W115 (mean, 0.80; experimental, 0.83). [1H–1H NOE data are also not consistent with the substantial $\chi_2$ rotation seen for W114 in MD3 (Fig. 8C), and we feel that the behavior of this residue in MD3 is not representative of actual dynamic behavior.] Averaging over the four simulations we have performed does not substantially improve the fit between experimentally and computationally determined order parameters for the more flexible regions of the protein. Specifically, the flexibility of the n-Src and distal loops is consistently underestimated in the MD simulations.
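The averaging quoted for W114 and W115 can be checked directly from the per-simulation values given in the text. The short sketch below uses only those numbers; the variable and function names are hypothetical. The unrounded means are 0.8375 for W114 and 0.7975 for W115, which round to the 0.84 and 0.80 reported.

```python
from statistics import mean

# Order parameters quoted in the text, in the order MD1, MD2, MD3, MD4.
s2_md = {
    "W114": [0.86, 0.85, 0.78, 0.86],
    "W115": [0.87, 0.83, 0.62, 0.87],
}
s2_exp = {"W114": 0.85, "W115": 0.83}  # experimentally measured values


def mean_and_gap(residue):
    """Return the four-simulation mean S^2 and its distance from experiment."""
    avg = mean(s2_md[residue])
    return avg, abs(avg - s2_exp[residue])


for residue in s2_md:
    avg, gap = mean_and_gap(residue)
    md3_gap = abs(s2_md[residue][2] - s2_exp[residue])
    print(f"{residue}: mean {avg:.2f}, mean vs. exp {gap:.2f}, MD3 vs. exp {md3_gap:.2f}")
```

For both residues the four-simulation mean lies closer to experiment than the MD3 value alone, which is the point made in the paragraph above.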

As has been noted in other studies, computationally derived backbone NH bond vector motions in proteins typically fall into two classes. Residues within stable secondary structural elements show a rapid initial decay in $C(t)$ (time constant approximately 0.04–0.1 ps) to a value of 0.8–0.9. No further departures from this value of the correlation function are observed. Residues within loops frequently show a bi-exponential decay. The time constant for the faster decay also ranges from 0.04 to 0.1 ps, whereas the time constant for the slower decay is frequently 50 ps or longer (see Fig. 3E–H). We have observed this type of behavior with Hck–SH3. As shown in Figure 4, the loop regions are those with higher values of $[1 - (S^2/S_f^2)]$, i.e., more bi-exponential character. However, the differences in $[1 - (S^2/S_f^2)]$ for any given loop residue vs. simulation are substantial, and the time constants associated with these residues also differ substantially. This implies that at the outset of the simulation, these regions are trapped in conformationally distinct substates that are not interconverted by the extensive energy minimization and 4 ns of MD simulations that we have applied. Hence, under situations where conformational averaging is substantial (i.e., in flexible loops) and no unique starting structure is appropriate, the dynamic properties sampled by any given MD simulation will be biased by the choice of starting structure and may not adequately describe the actual dynamic properties of the molecule.

Fig. 4. $[1 - (S^2/S_f^2)]$ for each residue in Hck–SH3 as calculated for trajectories MD1–MD4 (A–D, respectively).

Conclusions

Through a combination of NMR-based structural and dynamics analyses, we can determine regions of a protein that are flexible and establish limits on the timescales of these motions. We are, however, unable to determine the motional pathways or readily visualize the extent of motion through experimental means. Our MD calculations indicate that the n-Src and RT loops of Hck–SH3 are highly mobile; these calculations also depict the fast timescale ranges of motion of the various elements of this domain. By computing four MD trajectories, using identical computational parameters but different starting conformations, we probed the impact of initial conformation on the observed dynamical behavior. That our four simulations yielded different values of order parameters after 4 ns indicates that the choice of starting structure can have a significant impact on the results of MD calculations. For regions of proteins that are determined with high resolution, the effect is small. However, for regions that are less well defined, such as turns and loops, the initial simulation conditions, including random number seed and starting coordinates, can have a significant effect on the frequency and amplitude of observed motions. Because these flexible regions are frequently involved in defining the specificity of protein interactions, care must be taken in interpreting such MD data.

Fig. 5. Difference ΔS² of S² from the mean S̄² by residue. (A–D) Residues involved in stable structural elements (82–90, 98–108, 116–119, and 125–135) for MD1–MD4, respectively. (E–H) Residues not involved in stable structural elements (91–97, 109–115, and 120–124) for MD1–MD4, respectively.

Fig. 6. (A–H) Histograms of ΔS² corresponding to each subgraph in Figure 4.
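The per-residue deviation ΔS² plotted in Figures 5 and 6 can be illustrated with a minimal sketch. The function name `delta_s2`, the dictionary layout, and the numerical values below are assumptions made for illustration only; they are not taken from the simulations reported here.

```python
from statistics import mean


def delta_s2(s2_by_sim):
    """Deviation of each trajectory's S^2 from the cross-trajectory mean.

    s2_by_sim maps a trajectory label to {residue number: S^2}.
    Only residues present in every trajectory are compared.
    """
    labels = sorted(s2_by_sim)
    shared = sorted(set.intersection(*(set(s2_by_sim[k]) for k in labels)))
    mean_s2 = {res: mean(s2_by_sim[k][res] for k in labels) for res in shared}
    return {
        k: {res: s2_by_sim[k][res] - mean_s2[res] for res in shared}
        for k in labels
    }


# Hypothetical order parameters for one residue in a stable element (85)
# and one n-Src loop residue (111); these are NOT values from the paper.
example = {
    "MD1": {85: 0.86, 111: 0.40},
    "MD2": {85: 0.84, 111: 0.70},
    "MD3": {85: 0.85, 111: 0.55},
    "MD4": {85: 0.85, 111: 0.75},
}

deltas = delta_s2(example)
for label in sorted(deltas):
    print(label, {res: round(d, 2) for res, d in deltas[label].items()})
```

With inputs of this kind, a narrow spread of ΔS² would be expected for well-defined residues and a wide spread for loop residues, which is the pattern the histograms are meant to display.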

Materials and methods

Sample preparation

¹⁵N-labeled Hck–SH3 (residues 72–143; numbering corresponds to p60hck, SWISS-PROT AC P08631) was expressed and purified as described previously (Horita et al., 1998). Samples were prepared in 90% H₂O/10% D₂O containing 20 mM sodium phosphate, 50 mM NaCl, pH 6.25. The protein concentration was approximately 1 mM.

NMR spectroscopy and data analysis

Backbone ¹⁵N R₁, R₂, and heteronuclear NOEs were measured at 298 K at 11.7 T on a Varian Unity+ spectrometer using sequences obtained from Lewis Kay (Farrow et al., 1994). R₁ and R₂ spectra were collected with 1.5 s relaxation delays. NOE spectra were collected with 3.5 s relaxation delays. Time points were collected nominally at 50, 100, 200, 400, 800, 1,200, and 1,600 ms for R₁ and 16, 32, 48, 80, 128, 192, and 256 ms for R₂. The precise relaxation delays were measured using the in-line pulse monitor (Byrd & Digennaro, 1995). ¹H saturation for the measurement of the heteronuclear NOE was accomplished by a 3 s train of 120° pulses separated by 5 ms. Spectra were collected with 1,024 complex points in the acquisition dimension and either 64 or 80 complex points in the indirect dimension. Indirect data were extended once with linear prediction. Both dimensions were apodized with a Lorentz–Gauss transformation window and zero filled prior to Fourier transformation. Data were processed and analyzed using NMRPipe and its associated programs (Delaglio et al., 1995). R₁, R₂, and the NOE were fit to the extended Lipari–Szabo model using ModelFree v.4 (Palmer et al., 1991).
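As an illustration of how a relaxation rate can be obtained from a series of delays such as those listed above, the following sketch fits a single-exponential decay, I(t) = I₀ exp(−Rt), by linear least squares on ln I. The text does not state how peak intensities were converted to rates, so the decay model, the function name `fit_decay_rate`, and the synthetic intensities are assumptions for illustration.

```python
import math


def fit_decay_rate(delays_s, intensities):
    """Least-squares fit of ln(I) = ln(I0) - R*t.

    Returns (R in 1/s, I0). Intensities must be positive.
    """
    n = len(delays_s)
    logs = [math.log(i) for i in intensities]
    t_mean = sum(delays_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in delays_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Nominal R1 delays from the text, converted to seconds.
r1_delays = [0.05, 0.1, 0.2, 0.4, 0.8, 1.2, 1.6]

# Synthetic, noise-free intensities for a hypothetical R1 of 2.0 1/s.
synthetic = [100.0 * math.exp(-2.0 * t) for t in r1_delays]

rate, i0 = fit_decay_rate(r1_delays, synthetic)
print(f"R1 = {rate:.3f} 1/s, I0 = {i0:.1f}")
```

A log-linear fit weights noisy late time points differently from a direct nonlinear fit of the exponential, so with real data a nonlinear fit with intensity uncertainties would usually be preferred.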

MD calculations

Structures of Hck–SH3 residues 78–138 (NMR) or Hck–SH3 residues 81–135 (X-ray) were solvated by overlaying an equilibrated 45 × 45 × 45 Å cube filled with TIP3P water molecules (Jorgensen et al., 1983) and removing all waters less than 2.6 Å or greater than 6.0 Å from protein atoms (Steinbach & Brooks, 1993; Chatfield et al., 1998). This ensemble was minimized and subjected to 4 ns of MD using the CHARMm (Brooks et al., 1983) version 22 all-atom force field (MacKerell et al., 1992) as implemented in X-PLOR 3.851 (Brünger, 1987). Simulations were carried out at 298 K with a 1 fs step size. Temperature was regulated by coupling to a 298 K bath (FBETA = 1). A constant dielectric (ε = 1) was used with a shifted potential (8–12 Å) for nonbonded interactions. The nonbonded cutoff was 13 Å. Water molecules were allowed to evaporate but were constrained to the X-PLOR coordinate system by application of stochastic boundary conditions at 9,000 Å from the origin. Typically, zero to two water molecules evaporated per trajectory. Trajectories were written at 0.1 ps intervals. The three initial starting coordinates chosen for Hck–SH3 residues 78–136 were numbers 1, 10, and 20 from the 25 low energy structures calculated as described previously (PDB accession code 4HCK; Horita et al., 1998). The fourth trajectory used the coordinates for Hck–SH3 residues 81–135 as extracted from the X-ray structure of Hck (PDB accession code 1AD5; Sicheri et al., 1997). These trajectories are referred to as MD1–MD4, respectively. In all cases, the same random number seed was used. Restart schedules were identical for MD1–MD3 through the entire simulation. Computer system instability resulted in a different restart schedule for MD4 starting at 2 ns.

Fig. 7. A: Order parameter S² calculated from experimental data. B–E: Difference between the experimentally determined S² and the order parameter calculated from MD simulations MD1–MD4, respectively. F: Difference between the experimentally determined S² and the mean of the order parameters calculated from MD1–MD4.
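The solvation step described above (retaining only waters between 2.6 and 6.0 Å from the protein) can be sketched as a simple distance filter. This is not the X-PLOR procedure itself: the function name `select_shell_waters` is hypothetical, and measuring the distance from each water oxygen to the nearest protein atom is an assumption, since the text does not specify which water atom was used.

```python
import math


def select_shell_waters(protein_xyz, water_oxygen_xyz, d_min=2.6, d_max=6.0):
    """Return indices of waters whose oxygen lies d_min..d_max (in Å)
    from the nearest protein atom.

    Waters closer than d_min overlap the protein; waters beyond d_max
    fall outside the solvation shell. Both groups are discarded.
    """
    kept = []
    for idx, water in enumerate(water_oxygen_xyz):
        nearest = min(math.dist(water, atom) for atom in protein_xyz)
        if d_min <= nearest <= d_max:
            kept.append(idx)
    return kept


# Toy coordinates (Å): two "protein" atoms and four candidate waters.
protein = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
waters = [
    (0.5, 1.0, 0.0),   # overlaps the protein, removed
    (4.5, 0.0, 0.0),   # 3.0 Å from the second atom, kept
    (0.0, 5.0, 0.0),   # 5.0 Å from the first atom, kept
    (12.0, 0.0, 0.0),  # too far from any atom, removed
]

print(select_shell_waters(protein, waters))
```

The loop is quadratic in the number of atoms, which is adequate for a small domain in a 45 Å box; a spatial grid or neighbor list would be the usual choice for larger systems.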

Rotation and translation were removed from the trajectories by reorienting each structure on the basis of minimized RMSD to the stable secondary structure elements of the first structure in each trajectory. Time correlation functions were calculated for N–H bond vectors as

$$C(t) = \left\langle \frac{P_2\left(\mu(\tau)\cdot\mu(\tau + t)\right)}{r^3(\tau)\,r^3(\tau + t)} \right\rangle \qquad (1)$$

where P₂(x) = (3x² − 1)/2 is the second-order Legendre polynomial (Chandrasekhar et al., 1992). C(t) was calculated for t = 0 to 500 ps over the last 3 ns of each trajectory. For bond vectors that underwent rare transitions during the simulation time course, the time correlation function was calculated over that part of the trajectory from which such transitions were absent. Rare transitions were identified visually from plots of μ(1,000 ps)·μ(x), where x spanned 1,000 to 4,000 ps. Order parameters (S² and S_f²) and internal correlation times (τ_s and τ_f) were calculated by nonlinear least-squares fitting (Press et al., 1988) of the calculated correlation function to

$$C(t) = S^2 + (1 - S_f^2)\exp(-t/\tau_f) + (S_f^2 - S^2)\exp(-t/\tau_s). \qquad (2)$$

A two-parameter fit (S_f² fixed at 1.0) corresponds to the original model of Lipari and Szabo (1982a), whereas a three-parameter fit [assuming exp(−t/τ_f) approaches 0] corresponds to the extended model of Clore et al. (1990).
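The calculation in Equations 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the N–H bond length is treated as constant, so the r⁻³ factors in Equation 1 reduce to a constant that cancels on normalization and only unit vectors are needed; lags are counted in trajectory frames (0.1 ps per frame for the trajectories described above); and the function names `p2`, `correlation`, and `model_c` are hypothetical.

```python
import math


def p2(x):
    """Second-order Legendre polynomial."""
    return (3.0 * x * x - 1.0) / 2.0


def correlation(unit_vectors, max_lag):
    """C(lag) for lag = 0..max_lag frames, averaged over all time origins."""
    n = len(unit_vectors)
    result = []
    for lag in range(max_lag + 1):
        total = 0.0
        for i in range(n - lag):
            a, b = unit_vectors[i], unit_vectors[i + lag]
            total += p2(a[0] * b[0] + a[1] * b[1] + a[2] * b[2])
        result.append(total / (n - lag))
    return result


def model_c(t, s2, tau_s, sf2=1.0, tau_f=None):
    """Equation 2. With sf2 = 1 this is the two-parameter form; with
    tau_f = None the fast term is taken to have decayed to zero, which
    gives the three-parameter form."""
    fast = 0.0 if tau_f is None else (1.0 - sf2) * math.exp(-t / tau_f)
    return s2 + fast + (sf2 - s2) * math.exp(-t / tau_s)


# A rigid bond vector gives C = 1 at every lag.
rigid = [(0.0, 0.0, 1.0)] * 6
print(correlation(rigid, 2))

# A vector alternating between z and x every frame: perpendicular pairs
# contribute P2(0) = -0.5, parallel pairs contribute P2(1) = 1.
flipping = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)] * 3
print(correlation(flipping, 2))

# Model values for hypothetical parameters (times in ps).
print(model_c(0.0, s2=0.8, tau_s=50.0))             # two-parameter form at t = 0
print(model_c(0.0, s2=0.8, tau_s=50.0, sf2=0.9))    # three-parameter form at t = 0
```

In the three-parameter form the modeled curve starts at S_f² rather than 1, reflecting fast motion that has already decayed before the first fitted point. Fitting `model_c` to the output of `correlation` would require a nonlinear least-squares routine, which is omitted here.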

Acknowledgments

This research was sponsored in part by the National Cancer Institute, DHHS, under contract with ABL, and in part by NIH Grant CA81398. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

References

Andrews BK, Romo T, Clarage JB, Pettitt BM, Phillips GN Jr. 1998. Characterizing the global substates of myoglobin. Structure 6:587–594.

Arold S, O’Brien R, Franken P, Strub M-P, Hoh F, Dumas C, Ladbury JE. 1998. RT loop flexibility enhances the specificity of Src family SH3 domains for HIV-1 Nef. Biochemistry 37:14683–14691.

Bevington PR, Robinson DK. 1992. Data reduction and error analysis for the physical sciences. Boston, Massachusetts: WCB/McGraw-Hill.

Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. 1983. CHARMm: A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem 4:187–217.

Brünger AT. 1987. X-PLOR version 3.1. A system for X-ray crystallography and NMR. New Haven, Connecticut: Yale University Press.

Byrd RA, Digennaro FS. 1995. A real-time in-line RF pulse sequence and acquisition monitor for NMR spectrometers. J Magn Reson Ser A 112:250–254.

Chandrasekhar I, Clore GM, Szabo A, Gronenborn AM, Brooks BR. 1992. A 500 ps molecular dynamics simulation study of interleukin-1β in water. Correlation with nuclear magnetic resonance spectroscopy and crystallography. J Mol Biol 226:239–250.

Chatfield DC, Szabo A, Brooks BR. 1998. Molecular dynamics of staphylococcal nuclease: Comparison of simulation with ¹⁵N and ¹³C NMR relaxation data. J Am Chem Soc 120:5301–5311.

Clore GM, Szabo A, Bax A, Kay LE, Driscoll PC, Gronenborn AM. 1990. Deviations from the simple two-parameter model-free approach to the interpretation of nitrogen-15 nuclear magnetic relaxation in proteins. J Am Chem Soc 112:4989–4991.

Cohen GB, Ren R, Baltimore D. 1995. Modular binding domains in signal transduction proteins. Cell 80:237–248.

Cooper A, Dryden DTF. 1984. Allostery without conformational change. Eur Biophys J 11:103–109.

Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A. 1995. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 6:277–293.

Duan Y, Kollman PA. 1998. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 282:740–744.

Farrow NA, Muhandiram R, Singer AU, Pascal SM, Kay CM, Gish G, Shoelson SE, Pawson T, Forman-Kay JD, Kay LE. 1994. Backbone dynamics of a free and a phosphopeptide-complex Src homology 2 domain studied by ¹⁵N NMR relaxation. Biochemistry 33:5984–6003.

Horita DA, Baldisseri DM, Zhang W, Altieri AS, Smithgall TE, Gmeiner WH, Byrd RA. 1998. Solution structure of the human Hck SH3 domain and identification of its ligand binding site. J Mol Biol 278:253–265.

Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. 1983. Comparison of simple potential functions for simulating liquid water. J Chem Phys 79:926–935.

Karplus M, McCammon JA. 1983. Dynamics of proteins: Elements and function. Annu Rev Biochem 53:263–300.

Lipari G, Szabo A. 1982a. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. J Am Chem Soc 104:4546–4559.

Lipari G, Szabo A. 1982b. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. Analysis of experimental results. J Am Chem Soc 104:4559–4570.

MacKerell AD Jr, Bashford D, Bellott M, Dunbrack RL Jr, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph D, et al. 1992. Self-consistent parameterization of biomolecules for molecular modeling and condensed phase simulations. FASEB J 6:A143.

Mandel AM, Akke M, Palmer AG III. 1995. Backbone dynamics of Escherichia coli ribonuclease HI: Correlations with structure and function in an active enzyme. J Mol Biol 246:144–163.

Fig. 8. (A–D) Overlay of the n-Src loop for structures taken at 50 ps intervals between 1 and 4 ns from MD1–MD4, respectively. E109, E110, and E113 are colored cyan, S111 is colored yellow, G112 is colored red, and W114 and W115 are colored green.

Palmer AG III. 1997. Probing molecular motion by NMR. Curr Opin Struct Biol 7:732–737.

Palmer AG III, Rance M, Wright PE. 1991. Intramolecular motions of a zinc finger DNA-binding domain from Xfin characterized by proton-detected natural abundance ¹³C heteronuclear NMR spectroscopy. J Am Chem Soc 113:4371–4380.

Philippopoulos M, Lim C. 1995. Molecular dynamics simulation of E. coli ribonuclease H1 in solution: Correlation with NMR and X-ray data and insights into biological function. J Mol Biol 254:771–792.

Philippopoulos M, Mandel AM, Palmer AG III, Lim C. 1997. Accuracy and precision of NMR relaxation experiments and MD simulations for characterizing protein dynamics. Proteins 28:481–493.

Press WH, Flannery BP, Teukolsky SA, Vetterling WT. 1988. Numerical recipes in C. The art of scientific computing. Cambridge, Massachusetts: Cambridge University Press.

Sicheri F, Moarefi I, Kuriyan J. 1997. Crystal structure of the Src family tyrosine kinase Hck. Nature 385:602–609.

Smith PE, Schaik RC, Szyperski T, Wüthrich K, van Gunsteren WF. 1995. Internal mobility of the basic pancreatic trypsin inhibitor in solution: A comparison of NMR spin relaxation measurements and molecular dynamics simulations. J Mol Biol 246:356–365.

Spolar RS, Record MT Jr. 1994. Coupling of local folding to site-specific binding of proteins to DNA. Science 263:777–784.

Steinbach PJ, Brooks BR. 1993. Protein hydration elucidated by molecular dynamics simulation. Proc Natl Acad Sci USA 90:9135–9139.

Tjandra N, Feller SE, Pastor RW, Bax A. 1995. Rotational diffusion anisotropy of human ubiquitin from ¹⁵N NMR relaxation. J Am Chem Soc 117:12562–12566.

Weber G. 1975. Energetics of ligand binding to proteins. Adv Protein Chem 29:2–83.
