

SuperBiHelix method for predicting the pleiotropic ensemble of G-protein–coupled receptor conformations

Jenelle K. Bray¹,², Ravinder Abrol¹,³,⁴, William A. Goddard III⁴, Bartosz Trzaskowski⁵, and Caitlin E. Scott

Materials and Process Simulation Center, California Institute of Technology, Pasadena, CA 91125

Contributed by William A. Goddard III, November 18, 2013 (sent for review July 9, 2013)

There is overwhelming evidence that G-protein–coupled receptors (GPCRs) exhibit several distinct low-energy conformations, each of which might favor binding to different ligands and/or lead to different downstream functions. Understanding the function of such proteins requires knowledge of the ensemble of low-energy configurations that might play a role in this pleiotropic functionality. We earlier reported the BiHelix method for efficiently sampling the (12)^7 ≈ 35 million conformations resulting from 30° rotations about the axis (η) of all seven transmembrane helices (TMHs), showing that the experimental structure is reliably selected as the best conformation from this ensemble. However, various GPCRs differ sufficiently in the tilts of the TMHs that this method need not predict the optimum conformation starting from any other template. In this paper, we introduce the SuperBiHelix method in which the tilt angles (θ, φ) are optimized simultaneously with rotations (η) efficiently enough that it is practical and sufficient to sample (5 × 3 × 5)^7 ≈ 13 trillion configurations. This method can correctly identify the optimum structure of a GPCR starting with the template from a different GPCR. We have validated this method by predicting known crystal structure conformations starting from the template of a different protein structure. We find that the SuperBiHelix conformational ensemble includes the higher energy conformations associated with the active protein in addition to those associated with the more stable inactive protein. This methodology was then applied to design and experimentally confirm structures of three mutants of the CB1 cannabinoid receptor associated with different functions.

protein structure prediction | A2A adenosine receptor | β2-adrenergic receptor | constitutive activity | functional selectivity

Significance

It is known that G-protein–coupled receptors exhibit several distinct low-energy conformations, each of which might favor binding to different ligands and/or lead to different downstream functions. Understanding the function of such proteins requires knowledge of the ensemble of low-energy configurations that might play a role in this pleiotropic functionality. We present the SuperBiHelix methodology aimed at identifying all the low-energy structures that might play a role in binding, activation, and signaling. SuperBiHelix uses overall template information from all available experimental or theoretical structures and then does very complete (∼13 trillion configurations) sampling of the helix rotations and tilts to predict the ensemble of low-energy structures. We find that different mutations and ligands stabilize different conformations.

Author contributions: J.K.B., R.A., and W.A.G. designed research; J.K.B., R.A., and C.E.S. performed research; B.T. contributed new reagents/analytic tools; J.K.B., R.A., W.A.G., B.T., and C.E.S. analyzed data; and J.K.B., R.A., W.A.G., and C.E.S. wrote the paper.

The authors declare no conflict of interest.

¹J.K.B. and R.A. contributed equally to this work.

²Present address: Department of Structural Biology, Stanford University, Stanford, CA 94305.

³Present address: Departments of Medicine and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048.

⁴To whom correspondence may be addressed. E-mail: [email protected] or abrolr@csmc.edu.

⁵Present address: Centre of New Technologies, University of Warsaw, 02-089 Warszawa, Poland.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1321233111/-/DCSupplemental.

E72–E78 | PNAS | Published online December 16, 2013 | www.pnas.org/cgi/doi/10.1073/pnas.1321233111

G-protein–coupled receptors (GPCRs) [also referred to as seven-transmembrane receptors (7TMRs)] are integral membrane proteins that play a central role in transmembrane (TM) signal transduction. This largest superfamily in the human genome, with ∼800 receptors identified (1, 2), is activated by a variety of bioactive molecules, including biogenic amines, peptides, and hormones that modulate the activity of 7TMRs to effect regulation of essential physiological processes (e.g., neurotransmission, cellular metabolism, secretion, cell growth, immune defense, and differentiation) through G-protein–coupled and/or β-arrestin–coupled signaling pathways (3) (Fig. 1A). A structural understanding of their pleiotropic function will have a tremendous and broad impact, as the dysregulation of these receptors often plays an important role in major disease pathologies (1, 4). Malfunctions in GPCRs play a part in diseases such as ulcers, allergies, migraines, anxiety, psychosis, schizophrenia, hypertension, asthma, congestive heart failure, Parkinson's disease, and glaucoma. GPCRs are of great interest pharmacologically, as the targets of 50% of recently released drugs and 25 of the top 100 best-selling drugs (5).

The pleiotropic nature of these receptors arises because ∼10–20 conformations of the wild-type (WT) GPCR have sufficiently close energies that one or another can be stabilized by interactions with various ligands, which in turn can lead to ligand-dependent functionality. Moreover, it has been shown (6–8) that even a single mutation can change the relative ordering of these low-lying conformations sufficiently to dramatically change the affinity to various ligands and indeed even the functionality in terms of G-protein coupling and downstream signaling (3). The current structural, thermodynamic, and functional knowledge of GPCRs suggests an emerging conformational-ensemble picture like the one shown in Fig. 1B (9), which provides a schematic of GPCR conformations in different scenarios using WT as a reference. The relative thermodynamic ordering of this pleiotropic ensemble of low-energy GPCR conformations can change depending on the conditions: mutations, presence of ligands, and/or presence of other proteins like G proteins, which in turn modifies the physiological function.

This pleiotropic ensemble also has profound implications for the ability to control receptor pharmacology, especially associated with receptor target-induced side effects, when the targeted receptor activates both beneficial and undesirable signaling pathways. An example is the niacin receptor GPR109A. Niacin is therapeutically beneficial as an antilipolytic agent (via G-protein–mediated pathways), but it also causes cutaneous flushing, which has been directly linked to the activation of β-arrestin 1 pathways (10). An analog of this molecule that does not affect the G-protein pathways but destabilizes the coupling to β-arrestin 1 and blocks that pathway (shown with red arrow in Fig. 1A) will be highly desirable. This requires a structural understanding of how distinct conformations selectively couple to specific pathways under various conditions.

Despite the great interest in GPCRs, progress in obtaining experimental atomic-level structures essential for understanding the nature of activation and for design and optimization of drug candidates has been slow, due to challenges involved in GPCR expression, purification, and crystallization. Breakthroughs in membrane protein structural biology techniques (11) are accelerating GPCR structure determinations and have resulted in structures (crystal/NMR) for ∼19 GPCRs, where a majority have been solved in one inactive form of the receptor. Three of these receptors have also been crystallized in functionally distinct conformations [β2 with Gs (12), meta II rhodopsin (13), and the partially active form of A2A receptor (14)], whose comparison with the respective inactive conformations provides structural insight into their activation. One of these receptors (neurotensin receptor 1) has been crystallized only in the putatively active conformation (15). Indeed, Kim et al. (16) found for the A3 adenosine receptor that the optimum conformation for binding four selective agonists was the 15th in the hierarchy of conformations for the apo protein, whereas four selective antagonists preferred either the second or the third conformation. Thus, each receptor putatively has multiple active conformations (Fig. 1B), making it essential to obtain a method of predicting GPCR structures that does not require homologous experimental structures and that can determine the ensemble of low-lying structures.

To provide the means for determining this ensemble of low-lying seven-helix bundle conformations, we propose and validate here the SuperBiHelix method, which aims to select from a complete set of seven-helix packings or conformations the ensemble of low-energy GPCR structures that could play a role in ligand binding and/or activation. The SuperBiHelix method builds upon the BiHelix method (17). We showed earlier that, starting with the known X-ray structure, the BiHelix method identifies correctly the optimum packing from the (12)^7 ≈ 35 million conformations differing by independent 30° rotations about the axes of the seven transmembrane helices. However, starting with some template (previous experimental or predicted structure) the BiHelix method does not necessarily identify the correct tilts of the helices (7, 9, 18) in the membrane. In this paper, we generalize the BiHelix concept to optimize the helical tilts (θ, φ) simultaneously with the helical rotations (η), which we refer to as SuperBiHelix. We demonstrate here that, starting with the X-ray template, say for A2A, the SuperBiHelix procedure outlined below correctly predicts the structure for β2-adrenergic receptor and vice versa. Also, the method is predictive as exemplified by the design of CB1 receptor mutants with altered G-protein coupling efficacy, which were later confirmed experimentally by GTPγS assays (7).

Methodology

SuperBiHelix and SuperComBiHelix. The SuperBiHelix procedure starts with a GPCR bundle template, obtained either from X-ray experiments or from previously predicted and validated GPCR structures. This template is defined by the 6 × 7 = 42 degrees of freedom: x, y, h, θ, ϕ, and η values for each of the seven TM helices, as described earlier (17). Because some helices may have kinks and bends, the helical axis is defined as its axis of least moment of inertia. The hydrophobic center (HPC) residue h is the residue that crosses the z = 0 plane, which is defined as the plane that runs through the center of the lipid bilayer. This center is calculated from the hydrophobic profile obtained either by our PredicTM procedure (19) or by homology to the template structure prealigned to an implicit membrane, as in the OPM database (20). The x axis is defined along the vector pointing from the HPC of TM3 (which is near the middle of the bundle) to the HPC of TM2. These definitions of the x axis and z axis implicitly define the y axis. The (x, y) position of the helix HPCs in the z = 0 reference plane is defined by the template and is not currently sampled by SuperBiHelix (this could be included but at substantially increased cost). The helix HPC residue (h) could be optimized by translating the helix along its helical axis, but we do not currently do this for the standard SuperBiHelix method. The coordinates (x, y, h) are expected to be optimized during atomistic membrane bilayer molecular dynamics of the predicted ensemble of conformations. This leaves 3 degrees of freedom to be sampled for each of the seven helices: θ, the tilt angle of the TM helix axis from the z axis (that is, the axis perpendicular to the membrane plane); ϕ, the sweep (or azimuthal) angle of the helix axis about the z axis; and η, the rotation of the helix around the helical axis.
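To make the roles of θ and ϕ concrete, the following Python sketch converts a tilt and a sweep angle into a unit vector for the helix axis. It assumes the standard spherical-polar convention with ϕ measured from the x axis, which the text does not state explicitly; the function names are hypothetical and are not part of the SuperBiHelix code.

```python
import math

def helix_axis(theta_deg, phi_deg):
    """Unit vector of a helix axis tilted by theta from the z axis
    (the membrane normal) and swept by phi about the z axis.
    Assumption: phi = 0 points along the x axis."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def angle_between(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))

# A helix with zero tilt lies along the membrane normal, whatever its sweep.
upright = helix_axis(0.0, 45.0)
# A helix tilted by 10 degrees makes a 10 degree angle with the membrane normal.
tilted = helix_axis(10.0, 30.0)
tilt_check = angle_between(tilted, (0.0, 0.0, 1.0))
```

The rotation η does not change the axis direction; it spins the helix about the vector returned here, so it would be applied as a separate rotation of the atoms around that axis.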

The SuperBiHelix procedure starts with an input GPCR bundle determined by 42 template variables defined earlier and would normally be called a homology model. Just as in the BiHelix method (17), we approximate the energy as the sum over pairwise interactions between each of the 12 pairs of interacting helices. For each of these interacting pairs, we sample θ, ϕ, and η, while ignoring the other five helices, leading to the BiHelix energies. SCREAM (21) is used to predict the side-chain placements, which are then minimized for 10 steps with the backbone fixed (to resolve any bad contacts). This procedure is illustrated in Fig. 2A.
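As a rough illustration of the sampling grid for one helix pair, the Python sketch below enumerates the (θ, ϕ, η) offsets using the values quoted for the Validation runs (three θ values, five ϕ values, five η values). The names are hypothetical, and the energy evaluation itself (side-chain placement plus minimization) is not modeled.

```python
from itertools import product

# Offsets in degrees relative to the template, as quoted for the Validation runs.
THETA_OFFSETS = (-10, 0, 10)
PHI_OFFSETS = (-30, -15, 0, 15, 30)
ETA_OFFSETS = (-30, -15, 0, 15, 30)

def helix_grid():
    """All (theta, phi, eta) offset combinations for a single helix."""
    return list(product(THETA_OFFSETS, PHI_OFFSETS, ETA_OFFSETS))

def pair_grid():
    """All combinations of conformations for the two helices of one pair."""
    grid = helix_grid()
    return product(grid, grid)

n_per_helix = len(helix_grid())      # 3 * 5 * 5 conformations per helix
n_per_pair = n_per_helix ** 2        # both helices of the pair are sampled
n_all_pairs = 12 * n_per_pair        # 12 interacting helix pairs
```

Under these assumptions each helix has 75 conformations, each pair requires 5,625 energy evaluations, and the 12 pairs together require 67,500, which is why the pairwise step is cheap compared with enumerating full bundles.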

Once the 12 pairs of BiHelix energies have been determined for all possible combinations of θ, ϕ, and η, we combine these energies to predict a mean field energy of the entire bundle for effectively (3 × 5 × 5)^7 ∼ 10^13 conformations. These BiHelix energies are partitioned into intrahelical and interhelical components to avoid multiple counting of the intrahelical contributions, and the energy of the entire complex is then calculated, as described in SI Text (17). Although the calculation of the energy of a complex based on its BiHelix energies is very fast, the calculation of all possible configurations is still computationally expensive. In practice, we generally sample three values of θ, five values of ϕ, and five values of η, leading to (3 × 5 × 5)^7 ∼ 10^13 total bundle conformations. To minimize the number of total bundle energies to be calculated, we developed a procedure to determine which conformations for each helix are most favorable.
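The combination step can be sketched as a table lookup. The Python example below assumes one simple reading of the partition: each helix contributes its intrahelical energy once, and each interacting pair contributes its interhelical energy. The exact partition is given in SI Text and may differ; the function name, the data layout, and the toy three-helix energies are all hypothetical.

```python
from itertools import product

def bundle_energy(conf, intra, inter):
    """Mean-field bundle energy from precomputed tables.

    conf  : dict helix -> index of its chosen conformation
    intra : dict helix -> list of intrahelical energies, one per conformation
    inter : dict (i, j) -> table with inter[(i, j)][ci][cj] the interhelical
            energy of helix i in conformation ci and helix j in conformation cj
    """
    total = sum(intra[h][c] for h, c in conf.items())  # each helix counted once
    for (i, j), table in inter.items():
        total += table[conf[i]][conf[j]]
    return total

# Toy example: 3 helices, 2 conformations each, 2 interacting pairs.
intra = {1: [0.0, 1.0], 2: [0.5, 0.0], 3: [0.0, 2.0]}
inter = {
    (1, 2): [[-3.0, -1.0], [-2.0, -4.0]],
    (2, 3): [[-1.0, 0.0], [-2.0, -5.0]],
}

helices = sorted(intra)
best = min(
    product(range(2), repeat=len(helices)),
    key=lambda cs: bundle_energy(dict(zip(helices, cs)), intra, inter),
)
best_energy = bundle_energy(dict(zip(helices, best)), intra, inter)

# Size of the full search space for seven helices with 75 conformations each.
n_full = 75 ** 7
```

Each bundle energy is only a handful of lookups and additions, but with 75 conformations per helix the full enumeration would cover 13,348,388,671,875 bundles, which motivates the pruning step described next.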

Fig. 1. Illustration of the pleiotropic ensemble and its effects. (A) Balanced signaling by GPCRs. (B) Functional/thermodynamic view of GPCR conformations.

To determine the conformations for each helix likely to lead to low-energy bundles, we partitioned the seven-helix bundle into three QuadHelix bundles, as shown in Fig. 2B. It is feasible to estimate the total BiHelix energies of the three QuadHelix bundles because it requires only 3 × (3 × 5 × 5)^4 ∼ 10^8 bundle energies. The 2,000 structures with the lowest energy for each QuadHelix are listed by increasing energy. Then, the best 36 conformations for each helix are selected. This list of 36 conformations for a specific helix depends on how many QuadHelix bundles contain the helix. For helices 1, 4, and 6, which only exist in one QuadHelix bundle, the 36 unique single-helix conformations are selected that correspond to the lowest energy QuadHelix conformations. For helices 2, 5, and 7, which exist in two QuadHelix bundles, the top 18 unique single-helix conformations are selected independently from both QuadHelix bundles and then combined. Finally, for helix 3, which exists in all three QuadHelix bundles, the top 12 unique single-helix conformations are selected independently and combined. If a helix conformation is the same in two QuadHelix bundles, then the next configuration in the latter QuadHelix (going in the order TM1-TM2-TM3-TM7, TM2-TM3-TM4-TM5, and TM3-TM5-TM6-TM7) is chosen. At the end, 36 unique configurations for each TM helix are selected. Last, from each individual helical conformation list, the 36 conformations for each helix are used to calculate the energy of 36^7 ∼ 8 × 10^10 full bundles. We then output the 2,000 best energy structures from this procedure.
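The per-helix selection rule can be sketched in a few lines of Python. This is an illustrative reading of the paragraph above, not the authors' code: the function names and the input layout (for each QuadHelix containing the helix, the helix's conformation in each ranked structure, lowest energy first) are assumptions.

```python
# QuadHelix bundles in the order given in the text.
QUADS = ((1, 2, 3, 7), (2, 3, 4, 5), (3, 5, 6, 7))

def quads_containing(helix):
    """QuadHelix bundles that include the given helix, in processing order."""
    return [q for q in QUADS if helix in q]

def select_conformations(ranked_by_quad, n_keep=36):
    """Pick n_keep unique conformations for one helix.

    ranked_by_quad holds one ranked list per QuadHelix containing the helix.
    Each QuadHelix contributes an equal quota (36, 18, or 12 when n_keep is 36).
    A conformation already chosen is skipped and the next one in the current
    QuadHelix list is taken instead.
    """
    quota = n_keep // len(ranked_by_quad)
    chosen = []
    for ranked in ranked_by_quad:
        taken = 0
        for conf in ranked:
            if taken == quota:
                break
            if conf not in chosen:
                chosen.append(conf)
                taken += 1
    return chosen

# Toy example with n_keep=4 and a helix that sits in two QuadHelix bundles.
toy = select_conformations([["a", "a", "b", "c"], ["a", "b", "d", "e"]], n_keep=4)

# Counts quoted in the text.
quad_evaluations = len(QUADS) * 75 ** 4   # three QuadHelix bundles
full_bundles = 36 ** 7                    # pruned seven-helix enumeration
```

In the toy case the first list supplies "a" and "b", and the second list skips both duplicates and supplies "d" and "e". If a ranked list runs out before its quota is met, this sketch simply returns fewer conformations; how the actual procedure handles that case is not described. The counts evaluate to 94,921,875 QuadHelix energies and 78,364,164,096 full bundles, consistent with the ∼10^8 and ∼8 × 10^10 figures above.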

In the SuperComBiHelix step, these top 2,000 helical bundles (from SuperBiHelix) are built and the side chains are reassigned with SCREAM (21), so they will likely have different conformations than in the BiHelix limit. Then the structure is minimized for 10 steps. This energy ranking for SuperComBiHelix will be different (and more accurate) than that for SuperBiHelix because all seven helices are present instead of just two. This procedure results in the ensemble of low-energy conformations most likely to play a role in binding of ligands and activation of the GPCR.

Extensive testing of these methods, shown below in Validation, led to several improvements in the procedure. During the side-chain prediction steps in SuperBiHelix and SuperComBiHelix, SCREAM must be used with a 0.5-Å resolution library instead of the 1.0-Å resolution library that is the default for SCREAM. Additionally, for the best results we mutate the final two residues of the C and N termini to alanine for each helix during the SuperBiHelix step. Then, before the SuperComBiHelix step, these alanine residues are mutated back to their original residues for the building of the full bundles. This step reduces artificial long-range electrostatic interactions between charged groups that would be located in the polar head group region of the lipid bilayer.

Binding Site Prediction. For the ligand docking step, we use techniques developed as part of the DarwinDock/GenDock protocols (22), which aim at sampling all possible poses before evaluating energies and then group the poses into families ordered by the energy of the family head to minimize the number of poses used for energy evaluation. The docking methodology is described in SI Text.

Fig. 2. (A) Diagram of the SuperBiHelix method, showing how the seven-helix TM bundle is partitioned into 12 independent helix pairs. The θ, ϕ, and η values for each helix in the pair are sampled with the other helices not present. (B) To efficiently determine a subset of conformations for each helix most likely to lead to the lowest energy bundles, we partition the seven-helix bundle into three QuadHelix bundles: TM1-TM2-TM3-TM7, TM2-TM3-TM4-TM5, and TM3-TM5-TM6-TM7.

Validation

SuperBiHelix on Crystal Helices in the Correct Template. The SuperBiHelix and SuperComBiHelix procedures were developed by testing them on helices from one crystal structure in the template of another crystal structure. During these procedures, the extracellular/intracellular loops and the N/C termini were not present, assuming that they do not exert a significant effect on the TM helix bundle conformations. It is also important to determine whether performing SuperBiHelix and SuperComBiHelix on a crystal structure itself returns the original structure. Therefore, SuperBiHelix and SuperComBiHelix were run on the A2A adenosine receptor (23) and the β2-adrenergic crystal structure (24), sampling θ with values of −10°, 0°, and 10°; ϕ with values of −30°, −15°, 0°, 15°, and 30°; and η with values of −30°, −15°, 0°, 15°, and 30°. BiHelix was not rerun beforehand because we had previously determined (9) that the η values would be less than 30° from the crystal structure.

We first tested how well the QuadHelix protocol worked. For the original crystal structure to show up among the best energy SuperBiHelix and SuperComBiHelix structures, the crystal conformation for each helix must be in that helix's top 36 conformations. The ranking of the crystal conformation for each helix for the A2A adenosine receptor and the β2-adrenergic receptor is shown in Table S1. The crystal structure conformation for each helix is in the top 36 helical conformations. In fact, for many helices, the crystal structure is the best conformation. Indeed, the worst ranking for a helix is for TM6 in the β2-adrenergic receptor, which ranked number 12. This validates that the QuadHelix protocol works well for the A2A adenosine receptor and the β2-adrenergic receptor crystal structures. It also suggests that it might have been sufficient to have limited the selection to, say, the best 24 rather than the best 36.

We next determined whether the ranking of the top 2,000 SuperBiHelix structures is improved by SuperComBiHelix. The SuperBiHelix and SuperComBiHelix results are shown for the β2-adrenergic receptor crystal structure in Fig. 3. In the SuperBiHelix structures, the crystal structure is rank 78, and after SuperComBiHelix, it is rank 25. Not only does SuperComBiHelix cause significant improvement in the rank of the crystal structure, it also slightly improves the backbone root-mean-squared deviation (rmsd) values of the top 10 structures. Another observation to note is that these backbone rmsd values are very low (<0.6 Å), showing that we are capturing near-native conformations.

Fig. 3. SuperBiHelix and SuperComBiHelix results for the β2-adrenergic receptor crystal structure. The top 10 structures for both SuperBiHelix and SuperComBiHelix are shown, along with the crystal structure, which is outlined by dashed lines. The crystal shows up as rank 78 for SuperBiHelix and rank 25 for SuperComBiHelix.

The SuperBiHelix and SuperComBiHelix results for the A2A adenosine receptor crystal structure are in Fig. 4. The rank of the crystal structure goes from second in SuperBiHelix to sixth in SuperComBiHelix. Although SuperComBiHelix makes the crystal structure rank slightly worse than in SuperBiHelix, SuperComBiHelix significantly improves the backbone rmsd values of the top 10 structures (≤1.0 Å).

The SuperComBiHelix results of both the β2-adrenergic and A2A adenosine crystal structures show that the largest variation is in the sweep angles (φ) of the helices. Neither receptor shows any variation in TM1 and TM2. The sweep angle of TM3 differs from the crystal structure in the A2A adenosine receptor, but not in the β2-adrenergic receptor. The sweep angles of TM4, TM5, TM6, and TM7 vary for both receptors. Finally, TM5 is the only helix whose η value changes from the crystal structure, for both receptors. Thus, it seems that TM5 is most flexible in both receptors. Overall, SuperBiHelix and SuperComBiHelix are successful for both the β2-adrenergic and A2A adenosine crystal structures in returning near-native conformations, when starting from the correct template.
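For readers unfamiliar with the metric, a backbone rmsd between a predicted bundle and a reference structure can be computed as in the Python sketch below. It assumes the two structures list the same backbone atoms in the same order and already share a coordinate frame, so no superposition is performed; whether the reported values involve a prior alignment is not stated in the text. The function name and the toy coordinates are hypothetical.

```python
import math

def backbone_rmsd(coords_a, coords_b):
    """Root-mean-squared deviation between two equally long lists of
    (x, y, z) backbone atom coordinates, in the units of the input."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Toy check: shifting every atom by 1 Å along x gives an rmsd of exactly 1 Å.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
rmsd_shifted = backbone_rmsd(reference, shifted)
rmsd_same = backbone_rmsd(reference, reference)
```

Because the sampled conformations differ from the template only by rigid tilts and rotations of each helix on a discrete grid, the rmsd to a crystal structure has a floor set by the closest grid point, which is the sense of the "lowest possible rmsd, given the angles sampled" quoted in the next subsection.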

SuperBiHelix on Crystal Helices in an Incorrect Template. The SuperBiHelix and SuperComBiHelix methods were tested on the β2-adrenergic receptor (24) in the A2A adenosine receptor (23) template, and vice versa. The differences between the x, y, θ, ϕ, and η values for the two templates are given in Table S2. x and y vary only slightly between templates, supporting the SuperBiHelix procedure, which does not sample x and y. Additionally, ϕ and η vary more among the templates than θ, so more ϕ and η values need to be sampled than θ values.

To test how well SuperBiHelix predicts structures when helices are in the incorrect template, β2-adrenergic crystal helices were given the x, y, ϕ, θ, and η values of the A2A adenosine template, and SuperBiHelix/SuperComBiHelix steps were run, sampling θ with values of −10°, 0°, and 10°; ϕ with values of −30°, −15°, 0°, 15°, and 30°; and η with values of −30°, −15°, 0°, 15°, and 30°. BiHelix was not run beforehand because η only varies by −18.8° to +4.8° between the two templates. For all of the validation runs, each helix in the original crystal structure was minimized without the other helices present, so that the procedure was not biased toward the crystal structure. The results in Fig. 5 show that the SuperBiHelix and SuperComBiHelix procedures cause the backbone rmsd of predicted conformations (to the β2-adrenergic receptor crystal structure) to go from 2.0 to 1.6 Å, a modest improvement. The 2.0-Å rmsd to the β2-adrenergic crystal structure of the original structure in the incorrect template, represented by all yellow in Fig. 5, is the rmsd achieved by using homology modeling. The lowest possible rmsd, given the angles sampled, is 1.2 Å. The conformation of TM1 is predicted quite poorly, most likely because TM1 has the fewest interactions with other helices: unlike the other TM helices, it does not interact directly with TM3. Additionally, Table S2 shows that, for the x and y values, which are not sampled by SuperBiHelix, TM1 has larger deviations between the β2-adrenergic and A2A adenosine templates than any other helix. Docking of the ligand carazolol (used to crystallize the β2-adrenergic receptor) to structures before and after this conformational sampling procedure shows that this sampling is able to improve the pharmacophore of the protein–ligand interactions (see SI Text for results).

For A2A adenosine helices with the x, y, ϕ, θ, and η values of the β2-adrenergic template, SuperBiHelix and SuperComBiHelix cause the backbone rmsd of the predicted conformations (to the A2A adenosine receptor crystal structure) to go from 2.1 to 1.4 Å. This is a good improvement, given that the best rmsd possible, given the angles sampled, is 1.2 Å. The 2.1-Å rmsd to the A2A adenosine crystal structure of the original structure in the incorrect template, represented by all yellow in Fig. 5, is the rmsd achieved by using homology modeling. Docking of the ligand ZM241385 (used to crystallize the A2A adenosine receptor) to structures before and after this conformational sampling procedure shows an improvement in the pharmacophore (see SI Text for results), similar to the case of carazolol docking to the predicted β2-adrenergic receptor structure.

We emphasize here that the ligand is not present during the SuperBiHelix and SuperComBiHelix procedures. The crystal structures are typically determined with the ligand bound, so the ligand-free crystal structure need not be the lowest energy structure. The presence of the ligand could change the energy ordering of the structures, as could the presence of the loops. However, even ignoring such factors, the SuperBiHelix procedure provides useful predictions of new GPCR structures.
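The size of the sampling grid used in these validation runs can be checked with a few lines of Python. This is only an illustrative sketch: the angle lists are taken from the text, whereas the function names are hypothetical and the count assumes that every helix is sampled independently on the same grid.

```python
from itertools import product

# Angle offsets (degrees) relative to the starting template, as listed in the text.
THETA = (-10, 0, 10)
PHI = (-30, -15, 0, 15, 30)
ETA = (-30, -15, 0, 15, 30)
N_HELICES = 7


def per_helix_states(theta=THETA, phi=PHI, eta=ETA):
    """Return every (theta, phi, eta) offset combination for a single helix."""
    return list(product(theta, phi, eta))


def bundle_size(n_states, n_helices=N_HELICES):
    """Number of helix bundles if each helix independently takes any state."""
    return n_states ** n_helices


states = per_helix_states()
print(len(states))               # 3 * 5 * 5 = 75 states per helix
print(bundle_size(len(states)))  # 75**7 = 13348388671875, about 13 trillion
```

The all-zero offset `(0, 0, 0)` is part of the grid, so the starting (homology-like) structure is always one of the sampled conformations.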

Fig. 4. SuperBiHelix and SuperComBiHelix results for the A2A adenosine receptor crystal structure. The top 10 structures for both SuperBiHelix and SuperComBiHelix are shown, which include the crystal structure, outlined by dashed lines. The color scale is shown in Fig. 3.

Fig. 5. SuperComBiHelix results for β2-adrenergic receptor helices in the A2A adenosine template (Upper) and A2A adenosine receptor helices in the β2-adrenergic template (Lower). The rmsd is the backbone rmsd to the target crystal structure. The structure outlined by dashed lines is the structure closest to the crystal structure, given the angles sampled. The structure in all yellow (representing all 0° angles) is the original structure in the incorrect template. The color scale is shown in Fig. 3.

Bray et al. PNAS | Published online December 16, 2013 | E75


The Effect of SuperBiHelix on Binding Site Predictions. Although rmsd is a reasonable metric for testing SuperBiHelix, it does not take ligand binding into account. One of the main purposes of predicting GPCR structures is for drug design, so it is important to measure how well ligand binding can be predicted in structures predicted by SuperBiHelix. Thus, ZM241385 was docked into the A2A adenosine structures predicted from the β2-adrenergic template before and after SuperBiHelix and SuperComBiHelix, and carazolol was docked into the β2-adrenergic structures predicted from the A2A templates before and after SuperBiHelix and SuperComBiHelix. Then these docked results were compared with the ligand-bound crystal structures to see whether SuperBiHelix improved docking. The ligands were also docked into the ligand-free crystal structure for purposes of comparison.

For ZM241385 docked into the ligand-free A2A adenosine crystal structure, the contact rmsd is 2.4 Å. For the A2A adenosine helices in the β2-adrenergic template before SuperBiHelix, the lowest contact rmsd in the final 13 docked structures is 4.6 Å, and after SuperComBiHelix it is 3.4 Å. Thus, SuperBiHelix makes the binding site much more like that of the crystal structure. As seen in Fig. 6, the docked ligand in the best energy SuperComBiHelix structure is very similar to the pose in the crystal structure. They both make strong hydrogen bonds with N253(TM6). The docked ligand in the structure before SuperBiHelix takes a different pose and does not form any hydrogen bonds with N253(TM6).

We discuss the results for the carazolol docking in SI Text, with the docked structures seen in Fig. S1. SuperBiHelix improves the binding-site predictions for both β2-adrenergic helices in the A2A adenosine template and A2A adenosine helices in the β2-adrenergic template, but it has more effect on the A2A adenosine helices in the β2-adrenergic template. This agrees with the rmsd calculations for SuperBiHelix, in which there is a larger effect on the A2A adenosine helices in the β2-adrenergic template than the β2-adrenergic helices in the A2A adenosine template.

Can SuperBiHelix Predict Inactive and Active Conformations of a Receptor? The conformation of an activated GPCR is expected to have higher energy than the inactive conformation, making it a challenge to identify these higher energy conformations because, without the agonist and without a nearby G protein, these states might be too high in energy for SuperBiHelix/SuperComBiHelix to identify. We rely on energy ordering the final set of conformations without ligand or G protein, and there could be too many nonactive states in between. However, for a case in which a receptor mutant is known to be easily activated, we might expect that our predicted ensemble of low-energy configurations starting with that template might include the active conformation of that mutant receptor (discussed below). We also describe the rhodopsin case in SI Text, in which using helices from an inactive state (rhodopsin) in the active-like ligand-free opsin template (seen in Fig. S2) allowed us to test this approach.

To determine how well SuperBiHelix can recognize active and inactive forms of the same receptor, we looked at the mutant of the cannabinoid receptor CB1 that is constitutively active. Kendall and coworkers identified a single-point mutant T3.46A that was completely inactive and one (T3.46I) that was more active than the WT receptor based on ligand-binding profiles (6) and recently confirmed by the GTPγS assays (7). To determine the origin of these major changes in activity, the SuperBiHelix/SuperComBiHelix methods were applied to the three receptor forms (WT and two single-point mutants T3.46A and T3.46I) to predict the ensemble of low-energy seven-helix bundle conformations. We found substantially different TM helix packings among the WT and mutant receptors that lead to markedly different coupling of the charged residues near each receptor's cytoplasmic region (Fig. 7 A–C). Both WT and T3.46A exhibited TM3+TM6 coupling, known to be critical to keep GPCRs inactive. The fully inactive T3.46A mutant constrained TM6 further through a TM2+TM6 coupling (R2.37+D6.30), explaining its full inactivity. In contrast, the highly constitutively active T3.46I mutant showed no coupling of TM6 to TM2 or TM3, but rather had a TM5+TM6 coupling, similar to that observed in active GPCR crystal structures. This shows that SuperBiHelix is able to sample and capture conformations with structural differences that explain the binding and activation assays, leading to concepts consistent with those extracted from GPCR crystal structures as well.

To further validate these findings, we designed double mutants to reverse the activity of the single mutants. Thus, we predicted that the two double mutants (T3.46A/R2.37A and T3.46A/R2.37Q) would regain WT constitutive activity. This prediction was subsequently confirmed experimentally (see SI Text for details) by the GTPγS assays (Fig. 7D), validating the structures predicted with the SuperBiHelix methodology (7). The structures lead to an activation mechanism for CB1 that explains all experiments and that may play a role in other GPCRs. Additional mutants have also been designed and then tested in GTPγS assays (8), lending strong support to this activation mechanism.

The CB1 example shows that the SuperBiHelix methodology can be used to predict and design (based on testable hypotheses) active conformations. SuperBiHelix was able to guide experiments by predicting gain-of-function experiments (like making the inactive T3.46A mutant constitutively active by adding a well-chosen mutation). Thus, coupled to experiments that provide functional validation, the SuperBiHelix methodology can be used to provide very specific tests of specific hypotheses probing the structural basis of GPCR activation.

Fig. 6. (A) The ZM241385-bound A2A adenosine crystal structure. (B) ZM241385 docked into the A2A adenosine helices in the β2-adrenergic template before SuperBiHelix. (C) ZM241385 docked into the A2A adenosine helices in the β2-adrenergic template after SuperBiHelix.

Fig. 7. (A–C) Predicted structures of the T3.46A, WT, and T3.46I CB1 GPCRs, showing the salt bridges and hydrogen bonds formed on the cytoplasmic side. (D) Comparison of basal GTPγS binding to HEK293 cell membranes expressing the CB1 receptors including the double mutants (8).

E76 | www.pnas.org/cgi/doi/10.1073/pnas.1321233111 Bray et al.

Discussion

In addition to CB1 (7, 8), SuperBiHelix has been applied successfully to the adenosine A3 receptor (16) and other adenosine receptors (22), the serotonin 5-HT2B and 5-HT2C receptors (25), the histamine H3 receptor (26), the CCR5 receptor (27, 28), the TAS2R38 bitter taste receptor (29), and the V2 vasopressin receptor (30). The predicted GPCR structures in these studies were validated by predicting the binding sites and energies for known series of ligands and comparing with experimental mutagenesis, binding, and/or functional data. These predicted ligand–protein structures provided molecular-level interpretations for structural or functional observations in the CCR5 and V2 receptors. Indeed, for the A3 adenosine receptor (16), it was found that all four selective agonists preferred the 15th WT conformation, whereas the four selective antagonists all preferred the second or the third conformations. Moreover, all of the agonists caused the "trigger" Trp in TM6 (6.48) to switch from vertical before binding to horizontal after binding.

We showed in Validation that SuperBiHelix does better than homology modeling. β2-Adrenergic crystal helices in the A2A adenosine template have a 2.0-Å homology model rmsd, which SuperBiHelix improves to 1.6-Å rmsd. Similarly, A2A adenosine crystal helices in the β2-adrenergic template have a 2.1-Å homology model rmsd, which SuperBiHelix improves to 1.4 Å. In community-wide assessments of structure prediction methods (31, 32) aimed at GPCRs, the SuperBiHelix method has performed well at predicting the receptor structures. Prediction of ligand binding sites (without using prior mutagenesis data on the ligand or similar ligands) has not performed as well because docking of ligands to predicted protein structures depends highly on the accuracy of the protein structure. Homology-based methods have not led to the prediction of multiple receptor conformations like that possible with the SuperBiHelix method. Only a handful of methods (18, 32) have been able to predict multiple conformations based on some level of rigorous conformational sampling.

SuperBiHelix and SuperComBiHelix allow for the efficient sampling of GPCR conformational space. This makes it possible to predict structures of receptors that are dissimilar to any experimental crystal structure. The procedure also predicts an ensemble of low-lying structures, mirroring the flexibility of GPCR structures. When helices from one crystal structure are placed into the template of another structure, SuperBiHelix and SuperComBiHelix successfully move the experimental helices closer to their original template. The procedure also improves binding-site predictions and makes ligand-binding calculations more accurate. The success of SuperBiHelix and SuperComBiHelix on experimental crystal structures can now lead to better predictions of GPCR structures and binding sites, and therefore more successful rational drug design. The computational methodology can also be used to probe GPCR activation, as was highlighted by designing a constitutively active mutant based on an inactive CB1 receptor mutant. This shows the strength of the methodology to complement and guide experiments in exploring many structural hypotheses of GPCR activation and function.

ACKNOWLEDGMENTS. This work was financially supported by funds donated to the Materials and Process Simulation Center. J.K.B. acknowledges the Department of Energy Computational Science Graduate Fellowship. The computers used were funded by grants from the Defense University Research Instrumentation Program and by the National Science Foundation (equipment part of the Materials Research Science and Engineering Center).

1. Lagerström MC, Schiöth HB (2008) Structural diversity of G protein-coupled receptors and significance for drug discovery. Nat Rev Drug Discov 7(4):339–357.

2. Abrol R, Goddard WA, 3rd (2011) G protein-coupled receptors: Conformational "gatekeepers" of transmembrane signal transduction and diversification. Extracellular and Intracellular Signalling, eds Adams JD, Parker KK (RSC, Cambridge, UK), pp 188–229.

3. Kenakin T, Miller LJ (2010) Seven transmembrane receptors as shapeshifting proteins: The impact of allosteric modulation and functional selectivity on new drug discovery. Pharmacol Rev 62(2):265–304.

4. Lundstrom K (2006) Latest development in drug discovery on G protein-coupled receptors. Curr Protein Pept Sci 7(5):465–470.

5. Klabunde T, Hessler G (2002) Drug design strategies for targeting G-protein-coupled receptors. ChemBioChem 3(10):928–944.

6. D'Antona AM, Ahn KH, Kendall DA (2006) Mutations of CB1 T210 produce active and inactive receptor forms: Correlations with ligand affinity, receptor stability, and cellular localization. Biochemistry 45(17):5606–5617.

7. Scott CE, Abrol R, Ahn KH, Kendall DA, Goddard WA, 3rd (2013) Molecular basis for dramatic changes in cannabinoid CB1 G protein-coupled receptor activation upon single and double point mutations. Protein Sci 22(1):101–113.

8. Ahn KH, Scott CE, Abrol R, Goddard WA, Kendall DA (2013) Computationally-predicted CB1 cannabinoid receptor mutants show distinct patterns of salt-bridges that correlate with their level of constitutive activity reflected in G protein coupling levels, thermal stability, and ligand binding. Proteins 81(8):1304–1317.

9. Abrol R, Kim SK, Bray JK, Griffith AR, Goddard WA, 3rd (2011) Characterizing and predicting the functional and conformational diversity of seven-transmembrane proteins. Methods 55(4):405–414.

10. Walters RW, et al. (2009) beta-Arrestin1 mediates nicotinic acid-induced flushing, but not its antilipolytic effect, in mice. J Clin Invest 119(5):1312–1321.

11. Blois TM, Bowie JU (2009) G-protein-coupled receptor structures were not built in a day. Protein Sci 18(7):1335–1342.

12. Rasmussen SG, et al. (2011) Crystal structure of the β2 adrenergic receptor-Gs protein complex. Nature 477(7366):549–555.

13. Choe HW, et al. (2011) Crystal structure of metarhodopsin II. Nature 471(7340):651–655.

14. Xu F, et al. (2011) Structure of an agonist-bound human A2A adenosine receptor. Science 332(6027):322–327.

15. White JF, et al. (2012) Structure of the agonist-bound neurotensin receptor. Nature 490(7421):508–513.

16. Kim S-K, Riley L, Abrol R, Jacobson KA, Goddard WA, 3rd (2011) Predicted structures of agonist and antagonist bound complexes of adenosine A3 receptor. Proteins 79(6):1878–1897.

17. Abrol R, Bray JK, Goddard WA, 3rd (2011) Bihelix: Towards de novo structure prediction of an ensemble of G-protein coupled receptor conformations. Proteins 80(2):505–518.

18. Bhattacharya S, et al. (2013) Critical analysis of the successes and failures of homology models of G protein-coupled receptors. Proteins 81(5):729–739.

19. Abrol R, Griffith AR, Bray JK, Goddard WA, 3rd (2012) Structure prediction of G protein-coupled receptors and their ensemble of functionally important conformations. Methods Mol Biol 914:237–254.

20. Lomize MA, Lomize AL, Pogozheva ID, Mosberg HI (2006) OPM: Orientations of proteins in membranes database. Bioinformatics 22(5):623–625.

21. Kam VWT, Goddard WA (2008) Flat-bottom strategy for improved accuracy in protein side-chain placements. J Chem Theory Comput 4(12):2160–2169.

22. Goddard WA, 3rd, et al. (2010) Predicted 3D structures for adenosine receptors bound to ligands: Comparison to the crystal structure. J Struct Biol 170(1):10–20.

23. Jaakola VP, et al. (2008) The 2.6 angstrom crystal structure of a human A2A adenosine receptor bound to an antagonist. Science 322(5905):1211–1217.

24. Cherezov V, et al. (2007) High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318(5854):1258–1265.

25. Kim SK, Li Y, Abrol R, Heo J, Goddard WA, 3rd (2011) Predicted structures and dynamics for agonists and antagonists bound to serotonin 5-HT2B and 5-HT2C receptors. J Chem Inf Model 51(2):420–433.

26. Kim SK, Fristrup P, Abrol R, Goddard WA, 3rd (2011) Structure-based prediction of subtype selectivity of histamine H3 receptor selective antagonists in clinical trials. J Chem Inf Model 51(12):3262–3274.


27. Berro R, et al. (2013) Use of G-protein-coupled and -uncoupled CCR5 receptors by CCR5 inhibitor-resistant and -sensitive human immunodeficiency virus type 1 variants. J Virol 87(12):6569–6581.

28. Grunbeck A, et al. (2012) Genetically encoded photo-cross-linkers map the binding site of an allosteric drug on a G protein-coupled receptor. ACS Chem Biol 7(6):967–972.

29. Tan J, Abrol R, Trzaskowski B, Goddard WA, 3rd (2012) 3D structure prediction of TAS2R38 bitter receptors bound to agonists phenylthiocarbamide (PTC) and 6-n-propylthiouracil (PROP). J Chem Inf Model 52(7):1875–1885.

30. Carpentier E, et al. (2012) Identification and characterization of an activating F229V substitution in the V2 vasopressin receptor in an infant with NSIAD. J Am Soc Nephrol 23(10):1635–1640.

31. Michino M, et al. (2009) Community-wide assessment of GPCR structure modelling and ligand docking: GPCR Dock 2008. Nat Rev Drug Discov 8(6):455–463.

32. Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan R (2011) Status of GPCR modeling and docking as reflected by community-wide GPCR Dock 2010 assessment. Structure 19(8):1108–1126.


Supporting Information

Bray et al. 10.1073/pnas.1321233111

SI Text

BiHelix Energy Calculations

We use the BiHelix energy calculation method as described in ref. 1. The BiHelix sampling is done for all 12 nearest-neighbor helix pairs, resulting in (12) × (144) = 1,728 energies, which can be combined to estimate the energy of all possible $12^7 \approx 35$ million conformations in our sampling demonstration as follows. Each of the 1,728 energies corresponds to a specific helix pair i-j, for a specific combination of $\eta_i$ and $\eta_j$, and is reported in the form of three energy components:

i) $e^{\mathrm{inter}}_{ij}(\eta_i, \eta_j)$: This component corresponds to the interhelical energy of the helix pair, which describes the total interaction energy between the helices. It is calculated by subtracting the internal energy of the individual helices from the total energy of the two interacting helices and captures side chain–side chain, side chain–backbone, and backbone–backbone interactions across the two helices.

ii) $e^{i,\mathrm{intra}}_{ij}(\eta_i, \eta_j)$: This component refers to the intrahelical energy of helix i while it is interacting with helix j in the i-j pair.

iii) $e^{j,\mathrm{intra}}_{ij}(\eta_i, \eta_j)$: This component refers to the intrahelical energy of helix j while it is interacting with helix i in the i-j pair.

The interhelical component of the energy is additive and can be summed to give the total interhelical energy of a seven-helix bundle. The intrahelical component of helix 3, for example, in the 2–3 helix pair will in general be different from that in the 3–4 helix pair. To accommodate this, the intrahelical energy of a helix in the seven-helix bundle is approximated as the average of that energy from all helix pairs involving that helix. Using this "mean-field" approximation, the energy of the ∼35 million conformations for the seven-helix bundle can be estimated. The corresponding equations are shown in Eqs. S1–S3, where $N_i$ is the number of helix pairs involving helix i, and $J_1$ through $J_{N_i}$ are the helix partners of helix i in those $N_i$ pairs:

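$$e^{\mathrm{intra}}_{\mathrm{total}}(\eta_1, \eta_2, \eta_3, \eta_4, \eta_5, \eta_6, \eta_7) = \sum_{i=1}^{7} \frac{1}{N_i} \sum_{j=J_1}^{J_{N_i}} \left[ e^{\mathrm{intra}}_{ij}(\eta_i, \eta_j) \right], \tag{S1}$$

$$e^{\mathrm{inter}}_{\mathrm{total}}(\eta_1, \eta_2, \eta_3, \eta_4, \eta_5, \eta_6, \eta_7) = \sum_{i=1}^{6} \sum_{j=J_1}^{J_{N_i}} \left[ e^{\mathrm{inter}}_{ij}(\eta_i, \eta_j) \right], \tag{S2}$$

$$E(\eta_1, \eta_2, \eta_3, \eta_4, \eta_5, \eta_6, \eta_7) = e^{\mathrm{intra}}_{\mathrm{total}} + e^{\mathrm{inter}}_{\mathrm{total}}. \tag{S3}$$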
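The mean-field combination in Eqs. S1–S3 can be sketched in a few lines of Python. This is an illustrative reading of the equations, not the authors' code: the function name and the dictionary layout are hypothetical, each helix pair is assumed to be stored (and its interhelical energy counted) exactly once, and the intrahelical term averaged for helix i is taken to be that helix's own intrahelical energy in each of its pairs.

```python
def bundle_energy(eta_idx, inter, intra):
    """Estimate the bundle energy E = e_intra_total + e_inter_total.

    eta_idx : sequence of rotation-state indices, one per helix.
    inter   : {(i, j): table} with table[a][b] the interhelical energy of
              pair i-j when helix i is in state a and helix j in state b.
    intra   : {(i, j): (table_i, table_j)} with the intrahelical energies
              of helix i and of helix j in that same pair.
    """
    n_helices = len(eta_idx)
    e_inter = 0.0
    intra_sum = [0.0] * n_helices  # summed intrahelical energy per helix
    n_pairs = [0] * n_helices      # N_i: number of pairs involving helix i

    for (i, j), table in inter.items():
        a, b = eta_idx[i], eta_idx[j]
        e_inter += table[a][b]                 # additive interhelical term (S2)
        table_i, table_j = intra[(i, j)]
        intra_sum[i] += table_i[a][b]
        n_pairs[i] += 1
        intra_sum[j] += table_j[a][b]
        n_pairs[j] += 1

    # Mean-field average of each helix's intrahelical energy over its pairs (S1).
    e_intra = sum(s / n for s, n in zip(intra_sum, n_pairs) if n)
    return e_intra + e_inter                   # total energy (S3)


# Toy example: three helices, two states each, pairs 0-1 and 1-2.
toy_inter = {(0, 1): [[1.0, 2.0], [3.0, 4.0]],
             (1, 2): [[10.0, 20.0], [30.0, 40.0]]}
toy_intra = {(0, 1): ([[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]),
             (1, 2): ([[4.0, 4.0], [4.0, 4.0]], [[8.0, 8.0], [8.0, 8.0]])}
print(bundle_energy((0, 1, 1), toy_inter, toy_intra))  # 54.0
```

Because only pairwise tables are looked up, scoring one full bundle costs one lookup per helix pair, which is what makes it feasible to rank all $12^7$ combinations from 1,728 precomputed pair energies.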

Binding Site Prediction Methodology

For the ligand-docking step, we use techniques developed as part of the GenDock and DarwinDock procedure. First, each structure was prepared for docking by mutating all of the bulky, nonpolar residues (phenylalanine, isoleucine, leucine, methionine, tyrosine, valine, and tryptophan) to alanine, a procedure we call alanization. Then a sphere set representing the binding region was formed by taking a 2.0-Å radius around the coordinates of the ligand bound to the template. The ligands were assigned Mulliken charges from quantum mechanics (B3LYP flavor of DFT using the 6-31G** basis set, calculated with Jaguar). Then GenDock was used to dock the ligand with the DarwinDock method.

DarwinDock generates a large number of poses (no energy calls) using Dock6, and clusters them into families by seeding the families with pose pairs closest to each other in heavy-atom rmsd and expanding the families or merging them as long as each family member pose is within 2.0-Å rmsd of all other family members. The program then adds 5,000 new ligand poses (again using Dock6). If the fraction of new families is less than 1/20, then we consider that completeness is achieved. Generally, this leads to 30,000–50,000 poses partitioned into 2,000–5,000 families. The family head pose is defined as the centroid pose in each family (based on smallest average rmsd to all other poses in the family); for rare families of just two poses, the family head is picked randomly. DarwinDock then scores these family heads and chooses the best 10% of family heads and their family members for further analysis. Then, these remaining families are scored completely. Finally, the top 100 structures are passed on to the next step.

For each of these 100 ligand configurations, the bulky nonpolar residues (that underwent alanization) are restored, and their side chains along with others in a 4-Å unified binding site (defined as the union of all residues within a 4-Å radius of the docked ligand in all of the complexes) are optimized with SCREAM (2). This procedure leads to a different set of optimum binding-site residue side-chain positions for each of the 100 poses. The complexes are ranked by total energy, and the best 50% are kept for neutralization.

During the neutralization step, which uses the method described in ref. 3, we neutralize all charged residues of the system and the ligand by transferring the hydrogen of each salt bridge from the acceptor back to the donor and by adding a proton to each exposed Asp or Glu and removing one from each Lys and Arg. We use a modified Dreiding force field that includes special hydrogen-bonding parameters chosen to reproduce the binding for dimers of analogous residues found from quantum mechanics. The neutral residue scheme is an improvement over the charged residue scheme for the binding-energy calculations because it decreases the large variations between complexes caused by exaggerated long-range coulombic interactions between charged groups. These exaggerated interactions arise because the charges are fixed, so that the charge screening present in the experimental system is not present to damp the long-range interactions. The neutralization procedure is carried out for the binding-energy calculations and not the docking procedure because the large coulombic interactions are important to ensure that binding modes with a salt bridge are selected.

After the neutralization, the complexes are ranked by total energy, and the best 50% are kept for minimization. The 4.0-Å unified binding site is then minimized for 50 steps. The complexes are again ranked by total energy, and the best 50% are kept for the final step, resulting in 13 final structures. Then, for each of these 13 structures, we minimize the full complex for 500 steps (or until the RMS force reaches 0.25 kcal/mol·Å). Finally, these 13 structures are put through a quench-anneal cycle (50 K to 600 K and back over 11.5 ps) using the nonneutralized model, selecting the configuration with the lowest potential energy. Then the structures are reneutralized and reminimized.

The 13 final docked structures for each structure were all compared with the ligand-bound crystal structure. The similarity to the ligand-bound crystal structure was measured with the contact rmsd. This is calculated by first determining the closest contacts for the ligand on each residue for the crystal structure, and finding their contact distances. Then the same contact distances are determined for the predicted ligand-bound complex. Finally, the rmsd is calculated between the two sets of distances.
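The contact rmsd described above can be illustrated with a short Python sketch. It is one plausible reading of the definition rather than the authors' implementation: the function names are hypothetical, atoms are assumed to appear in the same order in the crystal and predicted complexes, and every residue passed in is treated as a binding-site residue (the text does not specify a selection cutoff).

```python
import math


def closest_contact(ligand, residue):
    """Return (distance, ligand_atom_index, residue_atom_index) for the
    closest ligand-residue atom pair; coordinates are (x, y, z) tuples."""
    return min(
        (math.dist(a, b), i, j)
        for i, a in enumerate(ligand)
        for j, b in enumerate(residue)
    )


def contact_rmsd(lig_ref, res_ref, lig_pred, res_pred):
    """rmsd between contact distances in the reference (crystal) complex and
    the same atom-pair distances in the predicted complex.

    res_ref / res_pred map a residue label to its list of atom coordinates.
    """
    squared = []
    for label, atoms_ref in res_ref.items():
        # Closest ligand contact for this residue in the crystal structure.
        d_ref, i, j = closest_contact(lig_ref, atoms_ref)
        # Same atom pair, measured in the predicted complex.
        d_pred = math.dist(lig_pred[i], res_pred[label][j])
        squared.append((d_pred - d_ref) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Toy example: a one-atom ligand and two one-atom "residues".
ligand = [(0.0, 0.0, 0.0)]
crystal = {"D113": [(3.0, 0.0, 0.0)], "N312": [(0.0, 4.0, 0.0)]}
predicted = {"D113": [(4.0, 0.0, 0.0)], "N312": [(0.0, 4.0, 0.0)]}
print(contact_rmsd(ligand, crystal, ligand, predicted))
```

Unlike a backbone rmsd, this measure needs no structural superposition, because it compares internal ligand–residue distances rather than absolute coordinates.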

Bray et al. www.pnas.org/cgi/content/short/1321233111 1 of 3


The Effect of SuperBiHelix on Binding Site Predictions

For carazolol docked into the ligand-free β2-adrenergic crystal structure, the contact rmsd is 2.4 Å. For the β2-adrenergic helices in the A2A adenosine template before SuperBiHelix, the lowest contact rmsd in the final 13 docked structures is 4.5 Å, and after SuperComBiHelix it is 4.4 Å. These bound structures are seen in Fig. S1. Although there is very little improvement in the contact rmsd from SuperBiHelix, inspection of the bound structures shows marked improvement. In the crystal structure, carazolol forms strong hydrogen bonds with D113(TM3) and N312(TM7). In the structure before SuperBiHelix, carazolol only has a hydrogen bond with D113(TM3). However, after SuperComBiHelix, carazolol makes strong hydrogen bonds with both D113(TM3) and N312(TM7). So, SuperBiHelix does make the binding site more similar to the crystal structure.

Predicting the Inactive Form of Rhodopsin Starting with the Active Form

For a second case to validate how well SuperBiHelix can recognize active and inactive forms of the same receptor, we placed the helices from rhodopsin [Protein Data Bank (PDB) ID 1u19] in the opsin (PDB ID 3cap) template, which we expect to have a low-energy conformation that is active when opsin helices are used, but which would have a low-energy state that is inactive when rhodopsin helices are used. The reasoning for this is as follows. Opsin is the ligand-free active-like conformation of rhodopsin and it was crystallized without the G protein. The helix shapes have adapted to the active-like state of opsin in the absence of any other cocrystal component. Because we do not perform helix shape optimization during SuperBiHelix, for these helix shapes the opsin conformation is expected to have a lower energy than the rhodopsin conformation, due to better interhelical energies.

We carried out SuperBiHelix, sampling θ with values of −10°, 0°, and 10°; ϕ with values of −45°, −30°, −15°, 0°, 15°, 30°, and 45°; and η with values of −30°, −15°, 0°, 15°, and 30°. Indeed, the lowest energy conformation from SuperComBiHelix is within 0.6-Å rmsd of the X-ray rhodopsin structure. In fact, the only degree of freedom that deviates from the rhodopsin template is the sweep angle of helix 4, which is off by 30°. The results are shown in Fig. S2.

CB1 Binding Experimental Methods

The level of [35S]GTPγS binding to CB1 was measured in the absence of ligand for the wild-type, T3.46I, T3.46A, T3.46A/R2.37A, and T3.46A/R2.37Q receptors. The level of [35S]GTPγS binding for the mock-transfected membrane sample is shown for comparison. Data are presented as specific binding of GTPγS to the membrane preparation. Nonspecific binding was determined in the presence of 10 μM unlabeled GTPγS. Each data point represents the mean ± SEM of at least three independent experiments performed in duplicate. The dashed line indicates the level of non-CB1–mediated GTPγS binding obtained from [35S]GTPγS binding to the mock-transfected membrane sample.

1. Abrol R, Bray JK, Goddard WA, 3rd (2011) Bihelix: Towards de novo structure prediction of an ensemble of G-protein coupled receptor conformations. Proteins 80(2):505–518.

2. Kam VWT, Goddard WA (2008) Flat-bottom strategy for improved accuracy in protein side-chain placements. J Chem Theory Comput 4(12):2160–2169.

3. Bray JK, Goddard WA, 3rd (2008) The structure of human serotonin 2c G-protein-coupled receptor bound to agonists and antagonists. J Mol Graph Model 27(1):66–81.

Fig. S1. (A) The carazolol-bound β2-adrenergic crystal structure. (B) Carazolol docked into the β2-adrenergic helices in the A2A adenosine template before SuperBiHelix. (C) Carazolol docked into the β2-adrenergic helices in the A2A adenosine template after SuperBiHelix.

Fig. S2. SuperComBiHelix results for rhodopsin helices in the opsin template. The rmsd is the backbone rmsd to the rhodopsin crystal structure. The structure outlined by dashed lines is the structure closest to the crystal structure, given the angles sampled. The structure in all yellow (representing all 0° angles) is the original structure in the incorrect template.


Table S1. The ranking of the crystal structure conformation for each helix after the QuadHelix protocol

TM     1    2    3    4    5    6    7
β2     1    1    1    2    3   12    1
A2A    1    1    4    7    4    2    1

Results are for both the A2A adenosine and the β2-adrenergic receptors.

Table S2. The differences between the A2A adenosine receptor and β2-adrenergic receptor templates

TM    x, Å    y, Å    θ, °    ϕ, °    η, °
1      2.0     1.7     6.3    17.9     4.8
2      0.2     0.0     7.7     2.4   −18.8
3      0.0     0.0     4.9     2.0   −13.2
4     −1.5    −0.4    −4.1    −3.0    −9.2
5      0.1    −0.6    −1.2    14.9     0.0
6      0.8     0.0    10.8   −24.8     0.9
7     −0.1     1.1     4.1     3.1     2.6

The system of coordinates is described in Fig. 2 in the manuscript.
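As a quick consistency check on how the main text uses Table S2, the following Python sketch computes the largest absolute template difference in each column and the helix with the largest in-plane (x, y) displacement. The values are transcribed from the table; the variable and function names are hypothetical and serve only this illustration.

```python
import math

# Table S2 rows: TM -> (x, y, theta, phi, eta); x and y in Å, angles in degrees.
TABLE_S2 = {
    1: (2.0, 1.7, 6.3, 17.9, 4.8),
    2: (0.2, 0.0, 7.7, 2.4, -18.8),
    3: (0.0, 0.0, 4.9, 2.0, -13.2),
    4: (-1.5, -0.4, -4.1, -3.0, -9.2),
    5: (0.1, -0.6, -1.2, 14.9, 0.0),
    6: (0.8, 0.0, 10.8, -24.8, 0.9),
    7: (-0.1, 1.1, 4.1, 3.1, 2.6),
}
COLUMNS = ("x", "y", "theta", "phi", "eta")


def max_abs_by_column(table):
    """Largest absolute difference between the two templates per column."""
    return {
        name: max(abs(row[k]) for row in table.values())
        for k, name in enumerate(COLUMNS)
    }


def largest_xy_shift(table):
    """Helix whose (x, y) position differs most between the two templates."""
    return max(table, key=lambda tm: math.hypot(table[tm][0], table[tm][1]))


spread = max_abs_by_column(TABLE_S2)
print(spread)
print(largest_xy_shift(TABLE_S2))
```

With these values, the ϕ and η differences (up to 24.8° and 18.8° in magnitude) exceed the largest θ difference (10.8°), and TM1 has the largest (x, y) displacement, in line with the statements made about Table S2 in the main text.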
