6
Normal mode refinement of anisotropic thermal parameters for a supramolecular complex at 3.42-Å crystallographic resolution Billy K. Poon , Xiaorui Chen , Mingyang Lu § , Nand K. Vyas § , Florante A. Quiocho § , Qinghua Wang § , and Jianpeng Ma †‡§¶ Department of Bioengineering, Rice University, Houston, TX 77005; Graduate Program of Structural and Computational Biology and Molecular Biophysics and § Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, TX 77030 Edited by William N. Lipscomb, Harvard University, Cambridge, MA, and approved March 27, 2007 (received for review February 8, 2007) Here we report a normal-mode-based protocol for modeling aniso- tropic thermal motions of proteins in x-ray crystallographic refine- ment. The foundation for this protocol is a recently developed elastic normal mode analysis that produces much more accurate eigenvec- tors without the tip effect. The effectiveness of the procedure is demonstrated on the refinement of a 3.42-Å structure of formimino- transferase cyclodeaminase, a 0.5-MDa homooctameric enzyme. Us- ing an order of magnitude fewer adjustable thermal parameters than the conventional isotropic refinement, this protocol resulted in a decrease of the values of Rcryst and R free and improvements of the density map. Several poorly resolved regions in the original isotro- pically refined structure became clearer so that missing side chains were fitted easily and mistraced backbone was corrected. Moreover, the distribution of anisotropic thermal ellipsoids revealed functionally important structure flexibility. This normal-mode-based refinement is an effective way of describing anisotropic thermal motions in x-ray structures and is particularly attractive for the refinement of very large and flexible supramolecular complexes at moderate resolutions. anisotropic temperature factor conformational flexibility elastic normal mode analysis tip effect x-ray crystallographic refinement S tandard x-ray crystallographic refinement assumes that atomic motions are isotropic, harmonic, and independent of each other. These assumptions are reasonable for many small globular proteins, but they become inadequate for large com- plexes that carry highly flexible structural components (1, 2). These mobile components often undergo orientation-specific (anisotropic) movements, such as hinge-bending motion be- tween two rigid bodies, that are highly collective and interde- pendent (3). To properly represent those motions in refinement, ideally, anisotropic thermal parameters should be used. How- ever, large complexes containing highly flexible components often yield crystals that only diffract to moderate resolutions. Thus, limited by the relatively small number of observed dif- fraction intensities, a full-scale anisotropic refinement that re- quires three positional and six thermal parameters for each atom is impractical for many systems. Consequently, the structures usually are refined by using one isotropic thermal parameter for each atom instead. These isotropically refined models carry large errors because of the inability to accurately model large-scale anisotropic motions. The convergence of refinement also is slow. Because normal modes are collective variables, a small set of low-frequency modes can effectively approximate the overall aniso- tropic motions of structures (for reviews, see refs. 4 and 5). The concept of using low-frequency normal modes as x-ray crystallo- graphic refinement parameters to model thermal factors dates back to the early 1990s (6–11). In subsequent years, several attempts have been made to use normal modes in combination with x-ray data for molecular replacement (12, 13), positional refinement (14), and protein dynamics (15–17). Despite the obvious advantage, normal-mode-based refinement in x-ray crystallography has never been popularized for a number of reasons. One is that the tradi- tional normal mode analysis is based on a molecular mechanics force field. It requires an initial energy minimization that shifts the atomic coordinates, especially those of residues on the protein surface (18, 19). Such positional shifts are detrimental to x-ray refinement because they very often move the atomic coordinates out of the densities. In principle, the problem with this initial minimization could be bypassed with the elastic normal mode analysis (eNMA) that treats the starting structure as the minimized structure (20). However, in the conventional eNMA, such as the anisotropic network model (21), the so-called tip effect in eigen- vectors always contaminates the low-frequency modes, making them difficult to use as a continuous basis set, as required in normal-mode-based refinement. This tip effect is particularly se- vere in large protein complexes. Another reason is that for normal- mode refinement to be effective, the system has to be intrinsically f lexible, such as hinged domain motions that are substantially larger than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant until recent years. Another issue related to the effectiveness of normal- mode-based refinement is the resolution of diffraction data. As shown in our study, the method is much more effective to diffrac- tion data at moderate resolution, e.g., 3–4 Å. The systems previously attempted were neither in that resolution range nor large complex structures. To improve the quality of low-frequency modes, we recently have developed an eNMA that can produce modes with much more accurate eigenvectors without the tip effect (23). This eNMA made it possible for the effective normal-mode-based refinement of large protein complexes with regions of inherent high flexibility. In this article, we present a refinement protocol using a very small set of low-frequency normal modes as refinement parameters to model anisotropic motions. The significantly reduced number of adjust- able thermal parameters, even fewer than the conventional isotro- pic refinement methods, results in a significant increase in the observation/parameter ratio, thus minimizing the risk of overfitting. The refinement protocol was tested on an isotropically refined 3.42-Å structure of mammalian formiminotransferase cyclodeami- nase (FTCD) (24), a 0.5-MDa homooctameric enzyme (25), which plays important roles in coupling histidine catabolism with folate metabolism (26) and integrating the Golgi complex with the vimentin intermediate filament cytoskeleton (27–29). It also is Author contributions: J.M. designed research; B.K.P., X.C., and Q.W. performed research; M.L., N.K.V., and F.A.Q. contributed data; B.K.P., X.C., Q.W., and J.M. analyzed data; and Q.W. and J.M. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. Abbreviations: eNMA, elastic normal mode analysis; FTCD, formiminotransferase cyclo- deaminase; TLS, translation, libration, and screw. Data deposition: The atomic coordinates have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 2PFD). To whom correspondence should be addressed. E-mail: [email protected]. © 2007 by The National Academy of Sciences of the USA www.pnas.orgcgidoi10.1073pnas.0701204104 PNAS May 8, 2007 vol. 104 no. 19 7869 –7874 BIOPHYSICS Downloaded by guest on May 12, 2020

Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

Normal mode refinement of anisotropic thermalparameters for a supramolecular complex at3.42-Å crystallographic resolutionBilly K. Poon†, Xiaorui Chen‡, Mingyang Lu§, Nand K. Vyas§, Florante A. Quiocho§, Qinghua Wang§, and Jianpeng Ma†‡§¶

†Department of Bioengineering, Rice University, Houston, TX 77005; ‡Graduate Program of Structural and Computational Biology and Molecular Biophysicsand §Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125,Houston, TX 77030

Edited by William N. Lipscomb, Harvard University, Cambridge, MA, and approved March 27, 2007 (received for review February 8, 2007)

Here we report a normal-mode-based protocol for modeling aniso-tropic thermal motions of proteins in x-ray crystallographic refine-ment. The foundation for this protocol is a recently developed elasticnormal mode analysis that produces much more accurate eigenvec-tors without the tip effect. The effectiveness of the procedure isdemonstrated on the refinement of a 3.42-Å structure of formimino-transferase cyclodeaminase, a 0.5-MDa homooctameric enzyme. Us-ing an order of magnitude fewer adjustable thermal parameters thanthe conventional isotropic refinement, this protocol resulted in adecrease of the values of Rcryst and Rfree and improvements of thedensity map. Several poorly resolved regions in the original isotro-pically refined structure became clearer so that missing side chainswere fitted easily and mistraced backbone was corrected. Moreover,the distribution of anisotropic thermal ellipsoids revealed functionallyimportant structure flexibility. This normal-mode-based refinement isan effective way of describing anisotropic thermal motions in x-raystructures and is particularly attractive for the refinement of verylarge and flexible supramolecular complexes at moderate resolutions.

anisotropic temperature factor � conformational flexibility � elastic normalmode analysis � tip effect � x-ray crystallographic refinement

S tandard x-ray crystallographic refinement assumes thatatomic motions are isotropic, harmonic, and independent of

each other. These assumptions are reasonable for many smallglobular proteins, but they become inadequate for large com-plexes that carry highly flexible structural components (1, 2).These mobile components often undergo orientation-specific(anisotropic) movements, such as hinge-bending motion be-tween two rigid bodies, that are highly collective and interde-pendent (3). To properly represent those motions in refinement,ideally, anisotropic thermal parameters should be used. How-ever, large complexes containing highly flexible componentsoften yield crystals that only diffract to moderate resolutions.Thus, limited by the relatively small number of observed dif-fraction intensities, a full-scale anisotropic refinement that re-quires three positional and six thermal parameters for each atomis impractical for many systems. Consequently, the structuresusually are refined by using one isotropic thermal parameter foreach atom instead. These isotropically refined models carry largeerrors because of the inability to accurately model large-scaleanisotropic motions. The convergence of refinement also is slow.

Because normal modes are collective variables, a small set oflow-frequency modes can effectively approximate the overall aniso-tropic motions of structures (for reviews, see refs. 4 and 5). Theconcept of using low-frequency normal modes as x-ray crystallo-graphic refinement parameters to model thermal factors dates backto the early 1990s (6–11). In subsequent years, several attemptshave been made to use normal modes in combination with x-raydata for molecular replacement (12, 13), positional refinement (14),and protein dynamics (15–17). Despite the obvious advantage,normal-mode-based refinement in x-ray crystallography has neverbeen popularized for a number of reasons. One is that the tradi-

tional normal mode analysis is based on a molecular mechanicsforce field. It requires an initial energy minimization that shifts theatomic coordinates, especially those of residues on the proteinsurface (18, 19). Such positional shifts are detrimental to x-rayrefinement because they very often move the atomic coordinatesout of the densities. In principle, the problem with this initialminimization could be bypassed with the elastic normal modeanalysis (eNMA) that treats the starting structure as the minimizedstructure (20). However, in the conventional eNMA, such as theanisotropic network model (21), the so-called tip effect in eigen-vectors always contaminates the low-frequency modes, makingthem difficult to use as a continuous basis set, as required innormal-mode-based refinement. This tip effect is particularly se-vere in large protein complexes. Another reason is that for normal-mode refinement to be effective, the system has to be intrinsicallyflexible, such as hinged domain motions that are substantially largerthan the overall rigid-body motions (22). Those structures usuallyare the large complexes and have not been very abundant untilrecent years. Another issue related to the effectiveness of normal-mode-based refinement is the resolution of diffraction data. Asshown in our study, the method is much more effective to diffrac-tion data at moderate resolution, e.g., �3–4 Å. The systemspreviously attempted were neither in that resolution range nor largecomplex structures.

To improve the quality of low-frequency modes, we recently havedeveloped an eNMA that can produce modes with much moreaccurate eigenvectors without the tip effect (23). This eNMA madeit possible for the effective normal-mode-based refinement of largeprotein complexes with regions of inherent high flexibility. In thisarticle, we present a refinement protocol using a very small set oflow-frequency normal modes as refinement parameters to modelanisotropic motions. The significantly reduced number of adjust-able thermal parameters, even fewer than the conventional isotro-pic refinement methods, results in a significant increase in theobservation/parameter ratio, thus minimizing the risk of overfitting.

The refinement protocol was tested on an isotropically refined3.42-Å structure of mammalian formiminotransferase cyclodeami-nase (FTCD) (24), a 0.5-MDa homooctameric enzyme (25), whichplays important roles in coupling histidine catabolism with folatemetabolism (26) and integrating the Golgi complex with thevimentin intermediate filament cytoskeleton (27–29). It also is

Author contributions: J.M. designed research; B.K.P., X.C., and Q.W. performed research;M.L., N.K.V., and F.A.Q. contributed data; B.K.P., X.C., Q.W., and J.M. analyzed data; andQ.W. and J.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Abbreviations: eNMA, elastic normal mode analysis; FTCD, formiminotransferase cyclo-deaminase; TLS, translation, libration, and screw.

Data deposition: The atomic coordinates have been deposited in the Protein Data Bank,www.pdb.org (PDB ID code 2PFD).

¶To whom correspondence should be addressed. E-mail: [email protected].

© 2007 by The National Academy of Sciences of the USA

www.pnas.org�cgi�doi�10.1073�pnas.0701204104 PNAS � May 8, 2007 � vol. 104 � no. 19 � 7869–7874

BIO

PHYS

ICS

Dow

nloa

ded

by g

uest

on

May

12,

202

0

Page 2: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

linked to two human diseases, autoimmune hepatitis (30) andglutamate formiminotransferase deficiency (31). The FTCD struc-ture determined by x-ray crystallography and cryo-EM revealed asquare doughnut structure of two layers (24).

The overall structure of an FTCD octamer is shown in Fig. 1a,with each subunit composed of distinct FT and CD domains (Fig.1b). The FT domain is divided further into N- and C-subdomains.Some of the domains and subdomains are very flexible, especiallythe N-subdomain in the FT domain, which from an isolatedstructure is known to undergo a hinged rigid-body rotation of �30°upon binding by folic acid (32) (shown in Fig. 1c). The intrinsicflexibility is illustrated by the rainbow-colored isotropic B factors inone of the subunits of the ligand-free structure (Fig. 1d) (24).

In the original structure, although 95% of residues were fitted inthe density maps, the large size of the enzyme and relatively lowresolution of diffraction data made the structure refinement ofFTCD slow and difficult. By applying our refinement protocol, theresulting electron density maps allowed improvements of theatomic coordinates, especially in several previously poorly resolvedregions. The final model has lower values for Rcryst and Rfree thanthe isotropically refined model did. Moreover, the distribution ofanisotropic thermal ellipsoids revealed functionally importantstructure flexibility. Thus, the normal-mode-based refinement is aneffective way to improve the description of anisotropic thermalmotions with drastically fewer adjustable parameters than theconventional isotropic refinement. We expect the method to beparticularly useful in refining very large and flexible supramolecu-lar complexes with limited diffraction data.

ResultsAlthough FTCD is an octamer, it was crystallized with only foursubunits in the asymmetric unit (24). To obtain the normalmodes that represent the intrinsic f lexibility of the biological unitof FTCD, we first constructed the octameric FTCD by applyingthe symmetry operators to the four subunits in the asymmetricunit (23). Then, the new eNMA was used to calculate the normalmodes. In the normal-mode-based refinement, the first 50nonzero modes were combined to replace the isotropic thermalB factors with anisotropic B factors to represent the internalmotions. The translation, libration, and screw (TLS) method (33)was used to represent the external motions by treating all of theprotein atoms in the asymmetric unit as a rigid body. Iterationsof normal-mode-based anisotropic B factor refinement, TLS,positional refinement using REFMAC5 (34) in the CCP4 Suite(35), and manual adjustments using O (36) were carried out untilconvergence.

The starting point of our refinement was an unpublished all-atommodel of FTCD. Its C� positions were deposited into the ProteinData Bank (PDB ID code 1TT9). The values for Rcryst and Rfree of

the original structure refined in CNS (37) were 24.6% and 28.8%,respectively (24). They became 23.5% and 28.7%, respectively, afterthe original model was reminimized by REFMAC5. At the begin-ning of the refinement, the contributions for all modes wereartificially scaled to a large value. The first cycle of anisotropic Bfactor refinement with our normal-mode-based method and posi-tional refinement with REFMAC5 caused a decrease in Rfree to26.6%. The rapid improvement of Rfree mainly was from theintroduction of anisotropic B factors. At the conclusion of thenormal-mode-based refinement, the values for Rcryst and Rfreeconverged to 24.0% and 24.9%, respectively. In the refinement, thesame 5% of the data set used in the original CNS refinement wasused for the Rfree calculation. The improvement in the normal-mode refinement also was reflected in the �26° average phaseangle shift. Interestingly, the improvement is larger in the high-resolution range (Fig. 2). The final model had an rmsd of 0.012 Åfor bond length and 1.552° for bond angle, which are comparableto values for the original structure. The residues in the final modelwere distributed in the Ramachandran plot as 78.1%, 17.7%, 3.1%,and 1.1% for most favored, additionally allowed, generously al-lowed, and disallowed regions, respectively, as calculated byPROCHECK (38), similar to the original model.

A more important outcome of the normal-mode refinement wasthe improvement in the electron density maps, which in turnallowed improvement of the structural model. Between the originaland the new models (Fig. 3), each subunit had a total of 23 residues

Fig. 1. Structure of FTCD. (a) The square doughnut structure of an FTCD octamer. Two subunits are shown in red and blue, respectively. (b) The subunit structureof ligand-free FTCD. Backbone trace color ramped from the N terminus to the C terminus. (c) Superposition of the FT domain of human ligand-free FTCD (red)with the structure of the same domain in isolation (cyan) with the product analog, folinic acid (CPK mode), bound in the groove. (d) Rainbow-colored isotropicB factor in the original model. The hotter the color, the larger the B factors. The high flexibility of the N-subdomain, the linker region, and the lower half of theCD domain is evident.

Fig. 2. Phase shifts of the normal-mode-refined model with respect to theoriginal model. The results are shown as a function of resolution shells. Largerphase shifts are observed for higher resolution shells. Data were calculated bythe SFTOOLS program in CCP4 suite.

7870 � www.pnas.org�cgi�doi�10.1073�pnas.0701204104 Poon et al.

Dow

nloa

ded

by g

uest

on

May

12,

202

0

Page 3: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

with rmsd �0.5 Å in main chains (Fig. 3a) and 47 residues in sidechains (Fig. 3b). The overall rmsd of the entire subunit was 0.43 Åfor backbone and 0.57 Å for all atoms. The improvements wereespecially significant in three regions, appearing as spikes in thegraphs of rmsd between the two models for both main chains (Fig.3a) and side chains (Fig. 3b). The same structural changes wereobserved in all four subunits (data not shown) because of anoncrystallographic symmetry constraint. All these regions werelocated in highly mobile regions.

The first spike in rmsd was in the region of residues 13–15 (Fig.3). The electron density in this region of the original structure wasweak and fragmented in the original map (Fig. 4a), making thefitting highly uncertain. This region in the normal-mode-refinedomit 2Fo � Fc map was improved to an extent that it allowedstraightforward retracing of the backbone and refitting of theGlu-13 side chain (Fig. 4a�). The main-chain atoms in the originalstructure were misplaced in the density that belonged to the sidechains in the refitting.

The second spike in rmsd was in the region of residues 147–150(Fig. 3). Although the electron density of the original structure inthis region was good enough for tracing (Fig. 4a), the normal-moderefinement caused a large shift in electron density (Fig. 4b�).Consequently, the backbone was retraced, and side chains wererebuilt for this region.

The third spike was in the region of residues 426–427 (Fig. 3)because of a retracing of the backbone and refitting of the side chainof Lys-427 (Fig. 4 c and c�). Although the original map hadreasonable densities for the backbone and side chain of Lys-427, theelectron density map in the normal-mode refinement had a large

shift in this region, which suggests a flipping of the side chain ofLys-427 to an opposite position (Fig. 4c�).

Besides the structural adjustments, for each subunit of 540residues in total, there were �55 residues whose missing or incom-plete densities in the original map had become visible in thenormal-mode-refined omit 2Fo � Fc map (with respect to the 1.0�contour level). Most of them were clustered in very mobile regions.One such example was residue Met-132. The density for the sidechain of Se-Met-132 was incomplete at the 1.5� level in the originalmap (Fig. 4d), whereas the new omit map showed clear and strongdensity for it, allowing the placement of its side-chain atoms withease (Fig. 4d�).

One of the most important outcomes of the refined FTCDstructure is the excellent correlation of the anisotropically refinedthermal ellipsoids with function. This finding is illustrated clearly inFig. 5. The regions in the structure of the subunit that areextensively engaged in octamer assembly are significantly lessmobile, which is apparent for the C-subdomain of the FT domainand the ‘‘upper’’ portion of the CD domain (Fig. 5a), two of theregions that make the most intersubunit contacts (Fig. 1a; see alsoref. 24). In contrast, the regions most associated with catalyticfunction show very significant collective motion, which is bestexemplified by the FT N-subdomain that exhibits the greatestcollective motion in the FTCD structure (Fig. 5 a and b). In Fig. 5b,the rough locations of the two reaction centers are indicated by thetwo asterisks. It is consistent with prior findings that the closure ofthe active site cleft after binding of an inhibitor in the FT domainis achieved by the movement of the N-subdomain toward the C-subdomain (32) (see also Fig. 1c). The ends of the local two-foldrelated long CD–CD pair of bundle of helices, which harbors thedeaminase active site (24), exhibit the second greatest motion (Fig.5a). The coupling of the two motions could play a role in the twosequential enzyme activities of FTCD, including channeling of theproduct of the FT-catalyzed reaction in the FT active site to the CDsite (24). The distribution of the thermal ellipsoids suggests thatsuch a shuttling, in addition to the swinging of the anchoredpolyglutamate tail of the substrate, could be facilitated by theensemble of conformational flexibility.

We would like to underscore that, although the FTCD structurerefined in this study is the ligand-free form, the distribution ofanisotropically refined thermal ellipsoids clearly reveals function-ally important motions in the absence of ligands. This observationindicates that the functionally important conformational changesupon ligand binding are facilitated by the inherent structuralflexibility encoded in the structure itself and that x-ray crystallo-graphic data of a single ‘‘static’’ structure do contain functionallyrelevant motional information.

DiscussionThis refinement protocol is expected to be very effective on low-or moderate-resolution x-ray diffraction data (e.g., close to 3.0Å or lower) of very large and flexible complexes. For thesesystems, using a smaller set of collective variables as thermalparameters becomes more advantageous as a result of a muchbetter observation/parameter ratio. In this study, only 50 low-frequency normal modes (a total of 1,275 anisotropic thermalparameters) were used in the refinement of FTCD structure thatcontains 16,520 nonhydrogen heavy atoms in the asymmetricunit. Thus, there was more than an order of magnitude ofreduction in the number of thermal parameters than the con-ventional isotropic refinement. Given the total number of uniquereflections of 34,749, the structural refinement of FTCD isfundamentally underdetermined. The use of 10 times fewerthermal parameters evidently is advantageous.

The small and local conformational disordering of smallglobular proteins, such as a single loop disordering or a side chaintrapped in two distinct local minima, generally are not suitablefor normal-mode-based refinement. For usefulness and limita-

Fig. 3. Structural shifts of the normal-mode-refined model with respect tothe original model. The rmsd (angstroms) along the chain of a single subunitis shown for both main chains (a) and side chains (b). Three large spikes areevident in both graphs. The results for other subunits are very similar becauseof a noncrystallographic symmetry constraint.

Poon et al. PNAS � May 8, 2007 � vol. 104 � no. 19 � 7871

BIO

PHYS

ICS

Dow

nloa

ded

by g

uest

on

May

12,

202

0

Page 4: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

tion of normal-mode analysis, please see the recent comprehen-sive review in ref. 5. In our experience, this refinement protocolis best suited for large, multidomain complexes with distinctf lexible domains at relatively limited resolutions because, amongthe three assumptions of protein motions in x-ray crystallogra-phy (isotropic, harmonic, and independent), normal-mode-based x-ray refinement already takes care of the anisotropy andinterdependence, but the assumption of harmonicity remainsvalid. It is well known that the overall deformations of large,multidomain complexes are more harmonic than those ofsmaller proteins (5).

In x-ray crystallographic refinement, a related and more com-monly used tool for modeling large-scale rigid-body movement isthe TLS method (33). The normal-mode-based method does notconflict with the TLS method. Rather, they are complementary. Infact, in this study, the TLS method was used to model the overallrigid-body motion of the entire molecule, the so-called ‘‘external’’motion (6, 7). The difference between the two methods is that thenormal-mode-based method treats the ‘‘internal’’ motion of themolecule collectively, whereas TLS treats different rigid-bodygroups independently. Thus, in normal-mode refinement, the de-formation of the entire protein complex along the low-frequencymodes keeps the system in an energetically most favorable fashion,whereas TLS cannot maintain the contiguous rigid bodies if theprotein chain is partitioned into many TLS groups. Moreover, theTLS group boundaries are not always consistent with hinge pointsor domain boundaries for multidomain proteins, whereas normalmode analysis is best known for illustrating the hinge points andcollective domain movements [an excellent example is found in astudy of the allosteric mechanism of GroEL using normal modes(3)]. Thus, in TLS refinement, if the protein chain is partitioned intomany TLS groups, the insertion of breaking points into the middleof large structural units may obscure the nature and extent of thecollective motions of large groups (39).

In summary, the field of structural biology has entered into a newera in which a dramatically increasing number of supramolecularcomplexes are being investigated. These complexes always areorganized hierarchically, containing multiple subunits or domains,

and almost all of them are highly flexible. Consequently, for manyof those complexes, only moderate-resolution x-ray diffraction datausually can be obtained. When refined with conventional isotropicmethods, the final structural models frequently contain unresolvedor poorly modeled regions because of high flexibility. The inabilityto accurately model the flexible structural components in turnaffects the overall quality of the structural models. Thus, thenormal-mode-based refinement protocol described here is verytimely in providing a state-of-the-art computational tool for moreaccurate representation of structural flexibility/disordering in x-raycrystallographic refinement. The results on FTCD have demon-strated an unprecedented ability to improve structural refinementof large, mobile biomolecular complexes at moderate resolutions.

MethodseNMA Without Tip Effect. The concept of normal-mode analysis (18,19, 40) is to express the dynamics of a protein as a linear combi-nation of a set of harmonic modes, known as normal modes, eachof which describes the motion of the protein structure in a com-pletely collective fashion. In a recent study (23), a type of eNMAwas developed to avoid the so-called tip effect. The method wasdesigned to calculate the normal modes with C� traces or a subsetof C� traces. For a system with N C� atoms (i � 1, 2, . . . , N), thedegrees of freedom in internal coordinates are N � 2 bond angles{�i} and N � 3 pseudodihedral angles{�i}. The chosen potentialfunction has an extra term in addition to that in conventionaleNMA (21, 41):

V � ���2� �i

�j

hij��rij� � �rij0��2 � ��2� �

�� � �0 �2

[1]

hij � �1 �rij0� � rc

0 �rij0� � rc

� min(H��0 ) ,

where rij and rij0 are the instantaneous and equilibrium

values (or initial values from the coordinates) of pairwisedistance between the ith and the jth C� atoms, respectively. The

Fig. 4. Examples of large structural adjustments in normal-mode refinement. (Upper) The original models. (Lower) Normal-mode model. (a and a�) RegionGlu-13–Asn-15 superimposed with omit 2Fo � Fc map contoured at 1.5�. (b and b�) Region Glu-147–Pro-150 superimposed with omit 2Fo � Fc map contoured at 1.0�.In both b and b�, the original model (pink) and the new model (yellow) are superimposed to highlight the structural shifts. (c and c�) Region Pro-426–Lys-427superimposed with omit 2Fo � Fc map contoured at 1.0�. (d and d�) Residue Se-Met-132 superimposed with omit 2Fo � Fc map contoured at 1.5�.

7872 � www.pnas.org�cgi�doi�10.1073�pnas.0701204104 Poon et al.

Dow

nloa

ded

by g

uest

on

May

12,

202

0

Page 5: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

value of the Heaviside step function, hij, specifies the effect of thecutoff distance, rc. The absolute value of the force constant, �,is irrelevant to the calculation and often set to 1.0, and H��

0 arethe diagonal elements of Hessian matrix from the conventionaleNMA potential in internal coordinate space. Note that thesummation in the second term goes over all of the elements inthe set of {�} (all of � and �). A scaling factor, �, is chosen sothat only the smallest diagonal term is changed dramatically.This term can be adjusted for the amount of stiffness one wishesto introduce into the angular terms. In practice, its value canchange from 3 to 30; the larger its value, the stiffer the angularterms. In this study, because FTCD is an extraordinarily f lexiblesystem, this term was set to 30 to effectively suppress the tipeffect.

With normal modes, one can express the instantaneous displace-ment of atom j from its equilibrium position in terms of the normalmode variables � as

�r � E� . [2]

The matrix E is the eigenvector matrix where each column vectorrepresents a pattern of collective motion of atoms in a normalmode. The dimension of matrix E is 3N by M, M � (1, 3N), whereM is the number of low-frequency modes.

Crystallographic Refinement of Thermal B Factors. X-ray diffractiondata from a protein crystal provide information about the dynamicsof a nonhydrogen atom j in the form of its thermal factor Bj, whichis related to the mean-square fluctuation of that atom, �r j

2, byBj � (8�2/3)�rj

2. The B factors usually are optimized by minimiz-ing the weighted sum of the residual

�h

w�h�� �Fobs(h� � � �Fcal�h� �]2, [3]

by using a least-squares method. Here, h � (h, k, l)T are reciprocal-lattice points of the crystal, w(h) is a weight function for eachreflection, and Fobs and Fcal are the amplitudes of the observedand calculated structure factors, respectively. Fcal has

Fcal(q) � �j

f j�q�exp� iqTrj�exp� iqT�rj� , [4]

where q � 2��Th denotes a reciprocal lattice vector at a latticepoint h, � � (a*, b*, c*)T is a 3 3 matrix that converts Cartesiancoordinates to fractional coordinates with a*, b*, and c* beingthe reciprocal unit-cell vectors of the crystal. Here, fj is theatomic structure factor usually given by four Gaussian functionsand rj is the average position (. . . denotes time average). Thedynamic part of the structure factor exp(iqT�rj) is referred toas the Debye–Waller factor,

exp(iqT�rj� � exp��12

(qT�rj)2�, [5]

and in conventional isotropic refinement, it usually is approxi-mated by the isotropic B factor in the form of

exp��12

(qT�rj)2� � exp��16

q2�r j2� � exp[�q2Bj/16�2].

[6]

The assumptions in the above equation are that the atomicmotions are harmonic, isotropic, and independent.

Normal-Mode-Based Crystallographic Refinement. In normal-mode-based crystallographic refinement (6, 7), the Debye–Waller factoris expanded in terms of the effective normal modes. The directionsof atomic motions are defined by the theoretical eigenvectors ofnormal modes, and the amplitudes of the displacements of themodes (implicitly contained in the variances and covariances) areoptimized against experimental structure factors, rather than byusing the theoretical thermal-averaged displacements (40).

If we express the instantaneous atomic displacement of atom j,�rj, in terms of normal modes (Eq. 2), the Debye–Waller factor inEq. 5 becomes

exp��12

�qT�rj�2� � exp�[qTEj��TEj

Tq], [7]

with Ej being a 3 M matrix representing the portion of matrixE for atom j, �mn ' ��m�n being the variances �mm andcovariances �mn (m � n). To assure the positive semidefinitenessof matrix �, it is expressed by a lower triangular matrix � (�{�mn}, m, n � 1, 2, . . . , M; �mn � 0, if m � n) as:

� ��T. [8]

Here, {�mn} are the independent parameters to be optimizedagainst experimentally determined amplitudes in the refine-ment. For M low-frequency modes used in refinement, one hasM(M � 1)/2 parameters, which usually is much smaller than thenumber of heavy atoms in proteins.

In our protocol, the low-frequency modes first are calculatedbased on the C� trace (regardless of whether there were any missingC� atoms along the chain). The remaining atoms of every residueare assumed to move along the same direction as its C� atom. ThisC�-based mode calculation allows applications on extremely largesystems.

Fig. 5. Anisotropic deformations of FTCD revealed by the normal-moderefinement. (a) Thermal ellipsoids for a single subunit of FTCD in the same viewas Fig. 1b. It is evident that the N-subdomain of the FT domain and the lowerhalf of the CD domain are highly flexible. (b) Side view of the square doughnutof octameric FTCD. For clarity, only the ellipsoids of two FT domains and twoCD domains in a neighboring subunit are shown. The two catalytic reactioncenters are indicated by the asterisks.

Poon et al. PNAS � May 8, 2007 � vol. 104 � no. 19 � 7873

BIO

PHYS

ICS

Dow

nloa

ded

by g

uest

on

May

12,

202

0

Page 6: Normal mode refinement of anisotropic thermal parameters ... · than the overall rigid-body motions (22). Those structures usually are the large complexes and have not been very abundant

TLS Refinement, Overall Anisotropic B Factor, and Bulk Solvent Mod-eling. With the internal motion accounted for by the normal modes,the TLS method (33) was used to describe the external motion ofthe molecules. REFMAC5 can calculate the TLS parameters for asystem, and TLSANL (42) can take those parameters and convertthem into individual anisotropic B factors for each atom. ForFTCD, the rigid body was defined to be the four subunits in theasymmetric unit. A coordinate file with ANISOU records contain-ing the TLS contributions was output after running REFMAC5and TLSANL.

Finally, there was some anisotropy that resulted from the crystaland not from the position of the atoms. Normally, most of thisanisotropy is corrected during the data processing stage, but someanisotropic effects remain in the data. We accounted for thisadditional anisotropy by having an overall anisotropic B factor thatwas common to all of the atoms.

If we assume that the possible sources of anisotropy areindependent of one another, we can construct the anisotropic Bfactor, U, as

U � Unm � stlsUtls � Uoverall, [9]

where Unm is a 3 3 matrix calculated from Eq. 7 (Unm �Ej��TEj

T), Utls is from REFMAC5 and TLSANL, stls is a scalingfactor, and Uoverall is the overall anisotropic B factor. The scalingfactor was introduced because the TLS refinement was doneexternally and independent of the minimization of the otherparameters in Unm and Uoverall. In total, we have 20 parametersfor TLS, 1 parameter for TLS scaling, and 6 parameters for theoverall B factor.

To aid the minimization process, a bulk solvent correction similarto that implemented in CNS was used (37). The partial structurefactors for the bulk solvent were scaled according to

Fcalc � Fprotein � F solvent k solvent e�Bsolvent�sin2(�)�2 �, [10]

where Fprotein is the calculated structure factor for the proteinatoms, Fsolvent is the partial structure factor calculated from thesolvent mask, ksolvent is like an average solvent density, andBsolvent is a B factor for the solvent that smoothes the transitionfrom the bulk solvent region to the atomic region. Optimal valuesfor ksolvent and Bsolvent are found by minimizing the residual fromEq. 3.

Positional Refinement and Manual Model Adjustment. After thenormal-mode-based anisotropic B factor refinement, REFMAC5(34) was used to refine the Cartesian coordinates. From the newlyrefined B factors and positional parameters in the model, a newomit 2Fo � Fc map was calculated and averaged with the fourfoldnoncrystallographic symmetry. The map and model were visuallyinspected in O (36), and manual adjustments were made asnecessary. Three iterations of normal-mode-based anisotropic re-finement, REFMAC5 refinement, and manual adjustment wereneeded before the structure convergence.

Our code was implemented in C/C�� using the CLIPPERlibraries for crystallographic-related functions (43) and the Com-putational Crystallography Toolbox (CCTBX) for the Limited-memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) mini-mizer (44). Furthermore, minimization of the parameters for theanisotropic B factors is accomplished by Brent’s principal axismethod (45).

This work was supported by National Institutes of Health GrantsR01-GM067801 and R01-GM068826; the Rice Terascale Cluster fundedby the National Science Foundation under Grant EIA-0216467; and apartnership among Rice University, Intel, and HP. B.K.P. is supportedby a predoctoral fellowship from the Houston Area Molecular Biophys-ics Predoctoral Training Program (HAMBP), M.L. is partially supportedby a predoctoral fellowship from the W. M. Keck Foundation of the GulfCoast Consortia through the Keck Center for Computational andStructural Biology, and J.M. and F.A.Q. are supported by a Welch grant.

1. Vitkup D, Ringe D, Karplus M, Petsko GA (2002) Proteins 46:345–354.2. Gerstein M, Lesk AM, Chothia C (1994) Biochemistry 33:6739–6749.3. Ma J, Karplus M (1998) Proc Natl Acad Sci USA 95:8502–8507.4. Ma J (2004) Curr Protein Pept Sci 5:119–123.5. Ma J (2005) Structure (London) 13:373–380.6. Diamond R (1990) Acta Crystallogr A 46:425–435.7. Kidera A, Go N (1990) Proc Natl Acad Sci USA 87:3718–3722.8. Kidera A, Go N (1992) J Mol Biol 225:457–475.9. Kidera A, Inaka K, Matsushima M, Go N (1992) J Mol Biol 225:477–486.

10. Kidera A, Inaka K, Matsushima M, Go N (1992) Biopolymers 32:315–319.11. Kidera A, Matsushima M, Go N (1994) Biophys Chem 50:25–31.12. Suhre K, Sanejouand YH (2004) Acta Crystallogr D Biol Crystallogr 60:796–799.13. Lindahl E, Azuara C, Koehl P, Delarue M (2006) Nucleic Acids Res 34:W52–W56.14. Delarue M, Dumas P (2004) Proc Natl Acad Sci USA 101:6957–6962.15. Kundu S, Melton JS, Sorensen DC, Phillips GN, Jr (2002) Biophys J 83:723–732.16. Kondrashov DA, Cui Q, Phillips GN, Jr (2006) Biophys J 91:2760–2767.17. Kondrashov DA, Van Wynsberghe AW, Bannen RM, Cui Q, Phillips GN, Jr

(2007) Structure (London) 15:169–177.18. Levitt M, Sander C, Stern PS (1985) J Mol Biol 181:423–447.19. Brooks BR, Janezic D, Karplus M (1995) J Compt Chem 16:1522–1542.20. Joti Y, Nakasako M, Kidera A, Go N (2002) Acta Crystallogr D Biol Crystallogr

58:1421–1432.21. Atilgan AR, Durell SR, Jernigan RL, Demirel MC, Keskin O, Bahar I (2001)

Biophys J 80:505–515.22. Kuriyan J, Weis WI (1991) Proc Natl Acad Sci USA 88:2773–2777.23. Lu M, Poon B, Ma J (2006) J Chem Theor Comp 2:464–471.24. Mao Y, Vyas NK, Vyas MN, Chen DH, Ludtke SJ, Chiu W, Quiocho FA (2004)

EMBO J 23:2963–2971.25. Murley LL, MacKenzie RE (1995) Biochemistry 34:10358–10364.26. Shane B, Stokstad ELR (1984) in Folates and Pterins, eds Blakley RL, Benkovic

SJ (Wiley, New York), pp 433–455.

27. Bashour AM, Bloom GS (1998) J Biol Chem 273:19612–19617.28. Gao YS, Alvarez C, Nelson DS, Sztul E (1998) J Biol Chem 273:33825–33834.29. Gao YS, Vrielink A, MacKenzie R, Sztul E (2002) Eur J Cell Biol 81:391–401.30. Lapierre P, Hajoui O, Homberg JC, Alvarez F (1999) Gastroenterology 116:643–

649.31. Rosenblattt D (1995) in The Metabolic and Molecular Bases of Inherited Diseases,

eds Scriver C, Beaudet A, Sly W, Valle D (McGraw-Hill, New York), pp3111–3128.

32. Kohls D, Sulea T, Purisima EO, MacKenzie RE, Vrielink A (2000) Structure(London) 8:35–46.

33. Schomaker V, Trueblood KN (1968) Acta Crystallogr B 24:63–76.34. Murshudov GN, Vagin AA, Dodson EJ (1997) Acta Crystallogr D 55:240–255.35. Collaborative Computational Project Number 4 (1994) Acta Crystallogr D Biol

Crystallogr 50:760–763.36. Jones TA, Zou JY, Cowan SW, Kjeldgaard M (1991) Acta Crystallogr A 47:110–

119.37. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW,

Jiang JS, Kuszewski J, Nilges M, Pannu NS, et al. (1998) Acta Crystallogr D54:905–921.

38. Laskowski RA, Macarthur MW, Moss DS, Thornton JM (1993) J Appl Crystallogr26:283–291.

39. Painter J, Merritt EA (2006) Acta Crystallogr D Biol Crystallogr 62:439–450.40. Brooks CL, III, Karplus M, Pettitt BM (1988) Adv Chem Phys 71:1–249.41. Tirion MM (1996) Phys Rev Lett 77:1905–1908.42. Howlin B, Butler SA, Moss DS, Harris GW, Driessen HPC (1993) J Appl

Crystallogr 26:622–624.43. Cowtan K (2003) IUCr Computing Commission Newsletter 2:4–9.44. Grosse-Kunstleve RW, Sauter NK, Moriarty NW, Adams PD (2002) J Appl

Crystallogr 35:126–136.45. Brent RP (2002) Algorithms for Minimization Without Derivatives (Dover

Publications, Mineola, NY).

7874 � www.pnas.org�cgi�doi�10.1073�pnas.0701204104 Poon et al.

Dow

nloa

ded

by g

uest

on

May

12,

202

0