15
The Role of Interfacial Water in ProteinLigand Binding: Insights from the Indirect Solvent Mediated Potential of Mean Force Di Cui, Bin W. Zhang, Nobuyuki Matubayasi,* ,,§ and Ronald M. Levy* ,Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan § Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan ABSTRACT: Classical density functional theory (DFT) can be used to relate the thermodynamic properties of solutions to the indirect solvent mediated part of the solutesolvent potential of mean force (PMF). Standard, but powerful numerical methods can be used to estimate the solutesolvent PMF from which the indirect part can be extracted. In this work we show how knowledge of the direct and indirect parts of the solutesolvent PMF for water at the interface of a protein receptor can be used to gain insights about how to design tighter binding ligands. As we show, the indirect part of the solutesolvent PMF is equal to the sum of the 1-body (energy + entropy) terms in the inhomogeneous solvation theory (IST) expansion of the solvation free energy. To illustrate the eect of displacing interfacial water molecules with particular direct/indirect PMF signatures on the binding of ligands, we carry out simulations of protein binding with several pairs of congeneric ligands. We show that interfacial water locations that contribute favorably or unfavorably at the 1-body level (energy + entropy) to the solvation free energy of the solute can be targeted as part of the ligand design process. Water locations where the indirect PMF is larger in magnitude provide better targets for displacement when adding a functional group to a ligand core. INTRODUCTION It is widely believed that the displacement of water from binding sites at protein receptor surfaces plays a key role in determining proteinligand binding anity. 115 The loosely expressed idea is that by displacing a water molecule whose thermodynamic signature is unfavorablerelative to the bulk, a ligand may gain extra binding anity as compared with displacing a water molecule with a favorablesignature. While this suggestion appears frequently in the literature, the idea is not clearly formulated when expressed in this way, as it is well- known that the free energy (chemical potential) of the solvent is constant throughout the solution. 16,17 However, the excess chemical potential of the solvent in solution, which is closely related to the solutesolvent potential of mean force (PMF) varies throughout the solution, and knowledge of the direct and indirect parts of the solutesolvent PMF can be used to inform the ligand design process, as we illustrate in this work. Earlier eorts to take the thermodynamic signatures of interfacial waters into account as part of the protein ligand design process have been based on inhomogeneous solvation theory (IST). 1832 In the IST formulation of the problem, the position dependent excess energies and entropies of the solution are the objects for analysis. There are 1-body and 2- body contributions to the excess energy. The excess entropy is expressed as an innite series expansion in the thermodynamic limit; in the IST analysis of binding, only the 1-body entropy term is usually retained. Friesner, Berne, and co-workers developed a practical method called WaterMap based on IST to analyze hydration sites around the protein surface where the water density is signicantly denser than the bulk uid. 5,22 Lazaridis and co-workers applied the method of solvation thermodynamics of ordered water (STOW) to compute the thermodynamics of specic water molecules in protein cavities. 28 Further, a grid-based implementation of the IST equations was developed by Gilson, Kurtzman, and co-workers called GIST. 26,27 In the IST approach, the excess energy and excess entropy at each location are calculated separately, and the local free energyis estimated by integrating the energy and entropy densities over a small volume at the interface with the receptor. Besides the approaches based on IST, one is able to quantify the thermodynamic properties of water in proteinligand binding from several other methods such as 3D- RISM, 3335 SZMAP, 36 and GRID. 37 Classical density functional theory (DFT) 3849 can also be used as a framework to identify locations at the interface of a protein receptor which might be targets for ligand design; the goal being to displace a solvent molecule at these locations by a ligand functional group. Both IST and DFT have been formulated in a way that uses two end pointsimulations, one of the pure solvent and the other of the protein in solution. We have written recently about the relationship between end Received: October 24, 2017 Published: December 20, 2017 Article pubs.acs.org/JCTC Cite This: J. Chem. Theory Comput. 2018, 14, 512-526 © 2017 American Chemical Society 512 DOI: 10.1021/acs.jctc.7b01076 J. Chem. Theory Comput. 2018, 14, 512526

The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

The Role of Interfacial Water in Protein−Ligand Binding: Insightsfrom the Indirect Solvent Mediated Potential of Mean ForceDi Cui,† Bin W. Zhang,† Nobuyuki Matubayasi,*,‡,§ and Ronald M. Levy*,†

†Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science,Temple University, Philadelphia, Pennsylvania 19122, United States‡Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan§Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan

ABSTRACT: Classical density functional theory (DFT) can beused to relate the thermodynamic properties of solutions to theindirect solvent mediated part of the solute−solvent potential ofmean force (PMF). Standard, but powerful numerical methodscan be used to estimate the solute−solvent PMF from which theindirect part can be extracted. In this work we show howknowledge of the direct and indirect parts of the solute−solventPMF for water at the interface of a protein receptor can be usedto gain insights about how to design tighter binding ligands. As weshow, the indirect part of the solute−solvent PMF is equal to the sum of the 1-body (energy + entropy) terms in theinhomogeneous solvation theory (IST) expansion of the solvation free energy. To illustrate the effect of displacing interfacialwater molecules with particular direct/indirect PMF signatures on the binding of ligands, we carry out simulations of proteinbinding with several pairs of congeneric ligands. We show that interfacial water locations that contribute favorably or unfavorablyat the 1-body level (energy + entropy) to the solvation free energy of the solute can be targeted as part of the ligand designprocess. Water locations where the indirect PMF is larger in magnitude provide better targets for displacement when adding afunctional group to a ligand core.

■ INTRODUCTION

It is widely believed that the displacement of water frombinding sites at protein receptor surfaces plays a key role indetermining protein−ligand binding affinity.1−15 The looselyexpressed idea is that by displacing a water molecule whosethermodynamic signature is “unfavorable” relative to the bulk, aligand may gain extra binding affinity as compared withdisplacing a water molecule with a “favorable” signature. Whilethis suggestion appears frequently in the literature, the idea isnot clearly formulated when expressed in this way, as it is well-known that the free energy (chemical potential) of the solventis constant throughout the solution.16,17 However, the excesschemical potential of the solvent in solution, which is closelyrelated to the solute−solvent potential of mean force (PMF)varies throughout the solution, and knowledge of the direct andindirect parts of the solute−solvent PMF can be used to informthe ligand design process, as we illustrate in this work. Earlierefforts to take the thermodynamic signatures of interfacialwaters into account as part of the protein ligand design processhave been based on inhomogeneous solvation theory(IST).18−32 In the IST formulation of the problem, theposition dependent excess energies and entropies of thesolution are the objects for analysis. There are 1-body and 2-body contributions to the excess energy. The excess entropy isexpressed as an infinite series expansion in the thermodynamiclimit; in the IST analysis of binding, only the 1-body entropyterm is usually retained. Friesner, Berne, and co-workers

developed a practical method called WaterMap based on IST toanalyze hydration sites around the protein surface where thewater density is significantly denser than the bulk fluid.5,22

Lazaridis and co-workers applied the method of solvationthermodynamics of ordered water (STOW) to compute thethermodynamics of specific water molecules in proteincavities.28 Further, a grid-based implementation of the ISTequations was developed by Gilson, Kurtzman, and co-workerscalled GIST.26,27 In the IST approach, the excess energy andexcess entropy at each location are calculated separately, andthe “local free energy” is estimated by integrating the energyand entropy densities over a small volume at the interface withthe receptor. Besides the approaches based on IST, one is ableto quantify the thermodynamic properties of water in protein−ligand binding from several other methods such as 3D-RISM,33−35 SZMAP,36 and GRID.37

Classical density functional theory (DFT)38−49 can also beused as a framework to identify locations at the interface of aprotein receptor which might be targets for ligand design; thegoal being to displace a solvent molecule at these locations by aligand functional group. Both IST and DFT have beenformulated in a way that uses two “end point” simulations,one of the pure solvent and the other of the protein in solution.We have written recently about the relationship between end

Received: October 24, 2017Published: December 20, 2017

Article

pubs.acs.org/JCTCCite This: J. Chem. Theory Comput. 2018, 14, 512−526

© 2017 American Chemical Society 512 DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

Page 2: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

point formulations of IST and DFT.50 In contrast to the ISTapproach where the excess energy and entropy are the objectsof analysis, the DFT formulation focuses on the solute−solventPMF, a free energy difference, and its direct and indirect parts.As illustrated in the results described below, we show that inorder to improve ligand binding affinity, interfacial watermolecules located at positions where the indirect solute−solvent PMF is large in magnitude, regardless of the sign, areprime candidates for displacement by a ligand functional group,when considering congeneric ligand pairs or de novo design.The first target protein we chose is coagulation factor Xa

(FXa; see Figure 1), which has been the focus of previous

studies using the WaterMap and GIST approaches based onIST.5,26 We compare the binding of pairs of congeneric ligandswhich share structural similarity. In the cases illustrated here,the difference in the binding affinity between congenericligands has a large component that can be related to theproperties of the displaced water. Two other examples are alsoinvestigated: the binding of biotin to streptavidin to illustratethe effect of displacing water at a hydrophobic interface,22,51

and binding to a dry region (cavity) of the mouse major urinaryprotein (MUP)29 binding site.

■ METHODSDensity Functional Theory. In this section, we review a

derivation of the classical DFT equations related to thepotential distribution theorem.52 In a solution containing onesolute molecule, the solvation free energy Δμ, representing thefree energy change for turning on the intermolecular interactionof the solute with the solvent, can be expressed using theKirkwood charging formula as53

∫ ∫μ λλ

ρΔ =∂

∂λ

λu

xx

xd d( )

( )0

1

(1)

Here x represents the complete set of position and orientationvariables of a solvent molecule relative to the solute molecule.uλ(x) is the solute−solvent interaction energy that is graduallyturned on with the coupling parameter λ (0 ≤ λ ≤ 1). ρλdenotes the solvent density distribution around the solute at

the coupling parameter λ. A density functional is introducedthrough partial integration of eq 1:

∫ ∫ ∫μ ρ λρ

λΔ = −

∂∂λ

λu ux x x xx

xd ( ) ( ) d d( )

( )0

1

(2)

where ρ(x) represents ρλ(x) at λ = 1 and u(x) represents uλ(x)at λ = 1. The first term of eq 2 is an integration of the densityweighted direct part of the solute−solvent PMF. The secondterm of eq 2 can be approximated by a free energy densityfunctional as discussed in previous papers.46,50

The indirect part of the solute−solvent PMF ωλ can beexpressed46,50

ωρρ

= − −λλ

λk T uxx

xx( ) ln

( )

( )( )B

0 (3)

where kB is the Boltzmann constant and T is the temperature.ρ0(x) denotes ρλ(x) at λ = 0, indicating the density in the puresolvent system. In this formula, the total PMF corresponds tothe first term of the right-hand side of eq 3 and can bedecomposed into the direct solute−solvent interaction plus theremaining indirect solvent-mediated term. Rearranging terms ineq 3 to separate uλ(x), then replacing uλ(x) and u(x) in eq 2,the solute excess chemical potential can be written (see refs 46and 50 for a detailed derivation)

∫ ∫

∫ ∫

μ ρ ρ ρ ω

λ ωρ

λ

Δ = − − + −

+∂

∂λλ

k T x x x x x x

x xx

d [ ( ) ( )] d ( )[ ( )]

d d ( )( )

B 0

0

1

(4)

where ω(x) represents ωλ(x) at λ = 1, meaning the indirect partof PMF in the solution when the solute is fully coupled to thesolvent. Equation 4 can be further rewritten as

∫∫

∫ ∫

μ ρ ρ

ρ ω ω

λ ωρ

λ

Δ = − − ∞

− − ∞

+∂

∂λλ

k T d

d

d d

x x

x x x

x xx

[ ( ) ( )]

( )[ ( ) ( )]

( )( )

B

0

1

(5)

where ρ(∞) and ω(∞) denote the value of the solvent densityρ(x) and the indirect PMF ω(x) in the bulk far from the solutecalculated in the canonical ensemble. The second term in eq 5is an integration of −[ω(x) − ω(∞)] weighted by the densitydistribution ρ(x) in solution. On the basis of the definition ofω(x), −[ω(x) − ω(∞)] is the change of the indirect part of thePMF by displacing one water molecule from the configuration xto the bulk. When this displaced water located at x interactsmore favorably with the other solvent molecules than those inthe bulk ω(x) < ω(∞), the positive value of the second term ofthe integrand in eq 5 contributes unfavorably to the solvationfree energy of the solute, Δμ. When the displaced water issubject to an unfavorable interaction with the other solventmolecule compared with bulk ω(x) > ω(∞), water at thislocation makes a favorable contribution to Δμ through thesecond term in eq 5. Furthermore, the total PMF WT(x) totransfer a water molecule from the pure solvent to location x atthe interfacial region can be expressed as

ρρ

= − =WT k T Tsxxx

x( ) ln( )( )

( )B0

(1)

(6)

Figure 1. Representative structure of protein FXa with ligand XLCbound. For the ligand molecule, H1 and H2 in the label are the twohydrogen atoms which are modified to fluorine atoms.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

513

Page 3: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

where s(1)(x) is the one-body term of the space-resolvedentropy at location x in solution from the IST expression.50,54,55

We note that the difference between using the bulk solventdensity ρ(∞) far from the solute and the pure solvent densityρ0(x) as the reference for the PMF vanishes in thethermodynamic limit. When integrating the PMF over theentire volume V however, care must be taken in how thereference is treated.50 Since the direct part of the PMF u(x) isequivalent to the one-body term (solute−solvent term) of thespace-resolved energy in the IST expression e(1)(x),50 theindirect PMF corresponds to the one-body term in the IST(energy + entropy) expansion for the solvent excess chemicalpotential:

ω = − = − −WT u e Tsx x x x x( ) ( ) ( ) [ ( ) ( )](1) (1) (7)

When expressed in this way, it is now clear from eqs 4, 5, and 7why water molecules at the interface with a repulsive indirectPMF make a favorable (stabilizing) contribution at the 1-bodylevel to the excess chemical potential of the solute, while watermolecules with an attractive indirect PMF make an unfavorable(destabilizing) contribution at the 1-body level to the excesschemical potential of the solute. When the indirect PMF ω(x)at x is positive, the 1-body contribution to the free energy tomove a water to x from the bulk (or pure solvent) is attractiveand vice versa. We show in the Results section, that regardlessof whether interfacial water molecules make a stabilizing ordestabilizing contribution to the excess chemical potential ofthe solute at the 1-body level, when the indirect solute−solventPMF term is large in magnitude (i.e., |ω(x)| is large), displacingthese waters by an appropriately hydrophilic or hydrophobicligand functional group, depending on the sign of ω(x), cansignificantly increase ligand binding affinity.In approximate implementations of IST formulas for the

solute chemical potential and for the analysis of interfacialsolvent effects on protein−ligand binding, the 2-body energyreplaces the third term of eq 5, and the first term is dropped.Two-body and higher order terms in the IST entropyexpansion however are usually not included in the IST analysisof the thermodynamic signatures of interfacial waters, althoughthere have been a few reports which include two bodyentropies.21,24,56,57 A key distinction between IST and DFT isthat IST includes two body energies in the analysis while DFTdoes not. Our ansatz is that the DFT formulas which are basedon the analysis of the indirect solute−solvent PMF, andtherefore include the IST 1-body energy and 1-body entropyterms but not the 2-body energy, provides a well balancedapproximation, especially for strongly associating liquids such aswater for which cancellation between the second order energyand the second and higher order entropy terms is more likely.For example, the strengthened hydrogen bonding around ahydrophobic solute decreases both the two-body energy andthe two-body and higher entropies; so that including the two-body energy in the IST analysis while dropping the excessentropy terms beyond first order may well be unbalanced. Inorder to further motivate why the analysis of the indirect PMFω holds information about the role of interfacial water inprotein−ligand binding, we have designed a thermodynamiccycle to show how the relative binding free energy ΔΔGbindbetween a pair of congeneric ligands can be expressed in a waythat includes the contribution ω from displaced water explicitly.The thermodynamic cycle is described in the Appendix, andconsidered in the Discussion.

Simulation Method. All MD simulations were performedin GROMACS package version 5.1.2.58−60 To evaluate thedifference of binding affinity between FXa with a pair of ligands,we calculated the standard binding free energy of FXa with eachof the ligands using the double decoupling method in which athermodynamic cycle was designed.61 In the first step of thecycle, the ligand was decoupled from the protein−ligandcomplex in aqueous solution. The PDB code of the complex is1MQ5, which is the crystal structure of the ligand withidentification XLC bound to FXa.62 Chain A of the protein,ligand XLC and the coordinated cation Ca2+ were extractedfrom the crystal structure and solvated in a cubic TIP3P63,64

water box with dimensions 8.0 nm × 8.0 nm × 8.0 nm usingthe genbox tool in GROMACS. This ensures that each atom ofthe complex was at least 1.0 nm away from the edge of the box.In this way, a total of 15 919 water molecules were contained inthe box with 3 additional Cl− to neutralize the system. TheAMBER99SB-ILDN force field65 was applied as the potentialwith ligand parameters assigned from Amber Tool.66 Theexternal coordinates of the ligand relative to the proteindenoted as (r, θ, ϕ, Θ, Φ, Ψ) were restrained to theirequilibrium values in the bound state as described pre-viously67,68 so that the ligand remains in the binding pocketas the interactions between the protein and ligand wereprogressively removed. The other degrees of freedom of theprotein and ligand were allowed to move freely. The effect ofthe restraints on the ligand when unbound in solution can beaccounted for by an analytical term67 in the final standardbinding free energy calculation. This set of decouplingsimulations involved a total of 41 simulation windows.Among these windows, different lambda states for applyingthe restraints, and removing the Coulombic interactions andvan der Waals interactions were assigned. The positionalrestraints were first turned on with the values of couplingparameter lambda as follows: 0.0, 0.01, 0.025, 0.05, 0.075, 0.1,0.2, 0.35, 0.5, 0.75, and 1.0. Then the Coulombic interaction ofthe ligand with the protein and water was turned off withlambda values as follows: 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8,0.9, and 1.0. Finally, the van der Waals interaction of the ligandwith the protein and water was turned off with lambda values asfollows: 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5,0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, and 1.0. For eachwindow, energy minimization was performed first, followed by100 ps NPT equilibration and 10 ns NVT production sampling.Temperature was maintained at 300 K using the Berendsencoupling scheme with time constant of τt = 1.0 ps.69 For theconstant pressure simulation, the pressure was kept constant bythe Berendsen pressure barostat with a pressure relaxation timeof 0.5 ps. The LINCS algorithm70 was used to constrain bondlengths involving hydrogen atoms. The van der Waalsinteraction was truncated at 1.0 nm. The long-range electro-static interaction was treated using the smooth particle-meshEwald approach71,72 with a real-space cutoff of 1.0 nm, a splineorder of 4, a reciprocal-space mesh size of 72 for each of the x,y, and z directions. Dynamics was propagated using a leapfrogstochastic dynamics integrator with a 2.0 fs time step.Trajectory files were saved every 0.5 ps and energy files weresaved every 0.04 ps. The decoupling free energies of the ligandfrom the complex solution, ΔGcomplex, were estimated usingBennett’s acceptance ratio method (BAR) for each adjacentwindow pair using the GROMACS tool.73

In another step of the thermodynamic cycle, the ligand wasdecoupled from pure solvent. The ligand was solvated in a

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

514

Page 4: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

cubic box with dimensions 3.8 nm × 3.8 nm × 3.8 nm, resultingin a total of 1702 water molecules. The same force fieldparameters and simulation methods as ligand decoupling fromthe protein−ligand solution system were applied except that norestraints were required and the ligand was allowed to movefreely during the simulation. Only Coulombic and van derWaals interactions were turned off with a total of 31 simulationwindows.Finally, the standard binding free energy between the protein

and ligand, ΔGbind can be obtained via the double decouplingmethod (DDM) thermodynamic cycle from

Δ = Δ + Δ + Δ

+ Δ

G G G G

G

bind solvation restraint on complex

restraint off (8)

Here, ΔGsolvation is the free energy to decouple the ligand frompure solvent without restraint. ΔGrestraint‑on is the free energy toturn on the restraint in the unbound state. ΔGcomplex is the freeenergy for turning on the Coulombic and van der Waalsinteractions between protein and ligand in the presence of therestraint. Finally, ΔGrestraint‑off is the free energy to turn off therestraint in the protein−ligand bound state. ΔGsolvation,ΔGcomplex, and ΔGrestraint‑off are obtained from the simulationsmentioned above; ΔGrestraint‑on is calculated analytically

67 as 6.90kcal/mol.To compute the total PMF to move a tagged water molecule

from the bulk solvent far from the protein to the interfacialregion, the tagged water molecule was labeled with a specialname different from the other solvent molecules in thetopology file so that this tagged water can be distinguishable.In the following treatments of the PMF and its direct andindirect parts, “protein” or “solute” refers to the protein withouteither the reference (initial) or the modified (final) ligand.Previously, we have shown the same total PMF values can beobtained from two approaches.50 One involves the integrationof the mean force exerted on the tagged water by the solute andother solvent molecules74,75 while another is based on theparticle insertion method. In this work, the latter approach wasadopted, the total PMF is equivalent to the difference betweenthe solvation free energy to insert the tagged water in thelocation x at the solute interface and in the bulk. In these twosets of simulations of tagged water insertion, the proteinstructure was frozen, which is different from the decouplingsimulation of the ligand from complex, when the protein andligand were allowed to move freely except for the relativeposition and orientation restraints on the ligand. The taggedwater was grown at a fix position and orientation in theinterfacial region or in the bulk by turning on the Coulombicand van der Waals interactions using a total of 31 simulationwindows. The interfacial water locations selected for theprotein−solvent potential of mean force analysis were chosenfrom the trajectories of the solvated protein using acombination of criteria: strength of the direct interaction withthe protein; solvent density estimated by clustering the wateroxygen locations in the trajectories; descriptions of thecorresponding locations in the IST WaterMap literature5,22,29

for the proteins chosen for study here.

■ RESULTSLigands Binding to Factor Xa. We begin by describing

our results for congeneric ligand pairs binding to FXa. Althoughthe structures of congeneric pairs are alike overall, differencesmay still arise in more than one region. Taking ligands XLC

and XLD as examples, ligand XLD has an extra methoxy groupcompared with XLC; in addition, a six-membered ring in XLCis replaced by a five-membered ring in XLD. The existence ofdissimilarities in these regions makes multiple contributions tothe effects of solvent displacement on the ligand binding, whichcomplicates our analysis of the thermodynamic signatures ofthe displacement of water molecules located at specificpositions at the interface. To simplify the problem initially,we substitute a functional group for one of the hydrogen atomsin the XLC ligand as a starting point. This minimal structuralperturbation leads to only one extra water molecule beingdisplaced during the binding between FXa and the modifiedligand; this perturbation serves as a first example.To make a comparative study, hydrogen atoms located in

two distinct positions in XLC were modified which involve thedisplacement of water molecules with significantly differentthermodynamic signatures as the added functional groupsreplaces the two hydrogen atoms at different positions on theligand. As shown in Figure 1, two hydrogen atoms labeled asH1 and H2 in different local environments were chosen. H1 isattached to the methyl group on the piperazine ring of theligand close to the protein surface in the bound complex(Figure 2A,C), while H2 is attached to the chloro-benzene ringof the ligand which is pointing away from the protein (Figure2B,D). The perturbations are polar substitutions, changing theselected hydrogen into fluorine as shown in Figure 2C,D, withthe modified ligand molecules labeled XLC-F1 and XLC-F2.The Van de Waals radius of fluorine is slightly larger than that

Figure 2. (A) Transforming the ligand XLC to XLC-F1 causes thedisplacement of one water molecule in region 1. (B) TransformingXLC to XLC-F2 causes the displacement of one water molecule inregion 2. (C) Chemical structures from XLC to XLC-F1. (D)Chemical structures from XLC to XLC-F2.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

515

Page 5: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

of the hydrogen it displaces, ensuring the displacement of justone water molecule in region 1 or region 2 shown in Figure2A,B. The standard binding free energy values between FXaand several pairs of ligands obtained using the doubledecoupling method are listed in Table 1. The statisticaluncertainties were estimated using the block average method.76

ΔΔG was calculated from the difference of the absolutestandard binding free energy between the initial (ΔGini) andfinal (ΔGfin) ligand binding to FXa. A considerable gain in therelative binding affinity, ΔΔG = −4.03 kcal/mol, was observedfor the modified XLC-F1 ligand relative to the binding of XLC;while there was significantly less gain in the relative bindingaffinity for XLC-F2 with ΔΔG = −1.50 kcal/mol. In an attemptto account for this difference, we identified the extra displacedsingle water molecules in these two cases and dissected thePMF contributions into direct and indirect parts, as the water isbrought from the bulk to the locations at the interface occupiedby F1 and F2. The extra displaced water can be identified bycomparing the excluded volume of the initial and modifiedligands, which was defined as the space within 0.2 nm of anyheavy atoms of the ligand when it formed the complex with theprotein.22 The total PMF was computed by the particleinsertion method for transferring the water from the bulk to thedesignated position at the protein interface where the water wasdisplaced when the ligand XLC is converted to XLC-F1. Thedirect term is straightforwardly obtained from the pairinteraction between the protein and displaced water; while

the indirect contribution can be derived from the differencebetween the total PMF and the direct contribution. As shownin Table 2 row 1 and row 2, the densities at the locations ofthese displaced water molecules, one from XLC-F1 and theother from XLC-F2, are almost equal to the bulk density withthe total PMF WT = 0.02 kcal/mol and WT = −0.04 kcal/mol,respectively. However, the direct and indirect components arequite distinct. For the displaced water in region 1, ω issubstantially larger than 0 with a value of +4.83 kcal/mol,indicating that the interaction between the displaced water andthe other surrounding water molecules is unfavorable comparedwith bulk. We verified this by examining the average number ofhydrogen bonds formed between the displaced water moleculewith other waters. Two water molecules are considered ashydrogen bonding if the oxygen−oxygen distance is less than0.35 nm and the O−O−H angle is less than 30°.77 On average,the displaced water can form ∼2.7 hydrogen bonds with otherwater molecules in region 1; in contrast, a water molecule canform ∼3.3 hydrogen bonds with surrounding solvent in thebulk on average. The decrease of water−water hydrogen bondsfor the water in region 1 is balanced by the water−solutehydrogen bond involving residue GLU 97 on the proteinsurface, leading to a favorable direct part of the PMF with avalue of −4.81 kcal/mol. According to the thermodynamicsignature of the water in region 1, we characterize it as “bulkdensity hydrophilic water” as shown in Table 3. Displacementof a “bulk density hydrophilic water” with this thermodynamic

Table 1. Standard Binding Free Energy Data between Protein (FXa for the First 9 Examples, Streptavidin for the 10th, 11thExample and MUP for the Last Example) and Several Pairs of Ligands

initial ligand final ligand ΔGini (kcal/mol) ΔGfin (kcal/mol) ΔΔG (kcal/mol)

XLC XLC-F1 −5.57 ± 0.28 −9.60 ± 0.60 −4.03 ± 0.66XLC XLC-F2 −5.57 ± 0.28 −7.07 ± 0.25 −1.50 ± 0.38XLC XLD −5.57 ± 0.28 −11.35 ± 0.34 −5.78 ± 0.44XLC XLD-P1 −5.57 ± 0.28 −11.23 ± 0.48 −5.66 ± 0.56XLC XLD-P2 −5.57 ± 0.28 −6.17 ± 0.79 −0.60 ± 0.84IIA IIB −7.69 ± 0.47 −12.47 ± 0.41 −4.78 ± 0.62IIA IIB-P1 −7.69 ± 0.47 −10.73 ± 0.69 −3.04 ± 0.83IIA IIB-P2 −7.69 ± 0.47 −8.46 ± 0.50 −0.77 ± 0.69RRP RTR −4.83 ± 0.56 −6.75 ± 0.06 −1.92 ± 0.56biotin-O biotin −7.40 ± 0.58 −9.94 ± 0.29 −2.54 ± 0.65biotin-deuri biotin −2.58 ± 0.32 −9.94 ± 0.29 −7.36 ± 0.43OC9 F09 −8.33 ± 0.35 −9.08 ± 0.22 −0.75 ± 0.41

Table 2. Total PMF (WT), Direct PMF (u) and Indirect PMF (ω) for the Displaced Water Molecule by the Modified LigandBinding to Protein Compared with the Reference Ligand

perturbation occupation no. type WT (kcal/mol) u (kcal/mol) ω (kcal/mol)

XLC to XLC-F1 region 1 1 BD phil 0.02 −4.81 4.83XLC to XLC-F2 region 2 1 BD wat −0.04 −0.22 0.18XLC to XLD-P1 region 3 1 HD phil −5.50 −13.05 7.55XLC to XLD-P2 region 4 1 BD wat −0.27 −1.68 1.41XLC to XLD-P2 region 4 2 BD wat −0.61 −0.14 −0.47XLC to XLD-P2 region 4 3 BD wat −0.08 0.64 −0.72IIA to IIB-P1 region 5 1 HD phil −1.96 −6.95 4.99IIA to IIB-P2 region 6 1 BD wat −1.71 −3.60 1.89RRP to RTR region 7 1 HD phil −6.51 −12.76 6.25RTR to RRP region 8 1 BD wat −0.45 −1.13 0.68biotin-O to biotin region 9 1 HD phob −5.10 −0.08 −5.02biotin-deuri to biotin region 10 1 HD phil −5.12 −12.39 7.27biotin-deuri to biotin region 10 2 HD phil −3.58 −10.02 6.44OC9 to F09 region 11 1 LD dry 1.97 −0.28 2.25

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

516

Page 6: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

signature can make a favorable contribution to the binding freeenergy, which we find empirically corresponds to the reward ofΔΔG = −4.03 kcal/mol for the binding free energy changewhen the ligand XLC is transformed to XLC-F1. On the otherhand, the indirect PMF ω is close to 0 with a value of 0.18 kcal/mol for the displaced water molecule in region 2. This waterhas only a small favorable direct interaction with the protein,but it can interact with the other solvent molecules as stronglyas if it is located in the bulk. Displacement of a water with thissignature makes a much smaller contribution to the enhance-ment of binding affinity compared with that of the “bulk densityhydrophilic water” in region 1, even though the density at bothpositions is close to the bulk. As distinguished from the “bulkdensity hydrophilic water” in region 1, which has a relativelystrong favorable interaction with the protein (which isapproximately canceled by the indirect term), we refer to thewater in region 2 as “bulk density water”, which has total PMF,direct PMF, and indirect PMF all close to 0 as summarized inthe last row in Table 3. The results of these two initial exampleswhich involve a minimal perturbation of the ligand structure,provide a first hint that the analysis of ω, the indirect part of thePMF, may be useful as part of the ligand design process.We now turn to an analysis of how the thermodynamic

signatures of interfacial water affect the affinity of realcongeneric ligand pairs with experimentally determinedstructures bound to FXa. Double decoupling calculationsrevealed that the relative standard binding free energy betweencongeneric ligands XLC and XLD was −5.78 kcal/mol shownin Table 1. A detailed comparison of the difference in theexcluded volume between XLC and XLD in Figure 3 revealsthat two regions can be depicted as extra excluded volumeregions where water is displaced when XLC is transformed intoXLD. The first one is located around the methyl groupconnected to the nitrogen atom near the five-membered ring inXLD, causing the displacement of one water molecule in region3 as shown in Figure 3A. Dissection of the PMF to move thisdisplaced water from the bulk indicates that it has a largerepulsive value of the indirect PMF ω = +7.55 kcal/mol and aquite favorable direct interaction u(x) = −13.05 kcal/mol asshown in Table 2 row 3. The favorable direct contributiondominates the unfavorable indirect contribution, so the totalPMF WT is much more favorable compared to the bulk with avalue of −5.50 kcal/mol. We describe the thermodynamicsignature of this water in region 3 as “high density hydrophilicwater”, with thermodynamic characteristics summarized in thesecond row in Table 3. On the basis of our previous discussion,if this region is occupied by growing a functional group from

the XLC ligand to liberate the water, which can also mimicsome of the favorable direct interaction between the displacedwater and the solute while eliminating the unfavorable indirectcomponents, the added functional group can make a significantcontribution to the binding affinity. To verify this, ligand XLD-P1 with structure shown in Figure 3D was designed with thesame structure as XLC except that the terminal six-memberedring is replaced by the corresponding part of the five-memberedring connected with the NCH3 group from XLD, ensuring thatregion 3 is the only extra excluded volume when comparing theinterfacial water structure of XLD-P1 with XLC. Impressively,ΔΔG is −5.66 kcal/mol for this transformation, which is almostall the gain in the affinity by changing XLC to XLD (−5.78kcal/mol). An additional excluded volume region of XLD ascompared with XLC arises from the presence of an additionalmethoxy group attached to the benzene ring, which enablesexclusion of three water molecules in region 4 as shown inFigure 3B. However, the indirect PMF ω for all these threedisplaced water molecules are small with values of 1.41 kcal/mol, −0.47 kcal/mol, and −0.72 kcal/mol, respectively, inTable 2 row 4, row 5, and row 6. From our analysis of thesignatures of interfacial waters, displacement of these watermolecules is expected to make small contributions in the gain ofbinding affinity. This is confirmed by the double decouplingresult ΔΔG = −0.60 kcal/mol between ligand XLC and XLD-P2. Here, XLD-P2 with structure shown in Figure 3E is

Table 3. Thermodynamic Signatures of Interfacial Water

type WT ρ/ρ0 u ω

1-bodyeffect on

Δμdesigntarget

bulk densityhydrophilic (BDphil)

≈0 ≈1 ≤−4 ≈−u stabilize yes

high densityhydrophilic (HDphil)

<0 >1 ≤−4 ≫0 stabilize yes

high densityhydrophobic (HDphob)

<0 >1 ≈0 <0 destabilize yes

low density drywater (LD dry)

>0 <1 ≈0 >0 stabilize(small)

?

bulk density water(BD wat)

≈0 ≈1 ≈0 ≈0 no

Figure 3. (A) Transforming the ligand XLC to XLD causes thedisplacement of one water molecule in region 3. (B) TransformingXLC to XLD causes the displacement of three water molecules inregion 4. (C) Chemical structures from XLC to XLD. (D) Chemicalstructure of XLD-P1. (E) Chemical structure of XLD-P2.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

517

Page 7: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

designed based on the addition of a methoxy group in thebenzene ring of XLC to displace only the waters in region 4.This example clearly illustrates the utility of evaluating theindirect part of the PMF at locations where interfacial waterswould be displaced by adding a functional group to a ligandcore.Another pair of congeneric ligands IIA/IIB, with chemical

structures shown in Figure 4C, binding to FXa was also

investigated. As shown in Figure 4, the perturbation of structurebetween IIA/IIB is smaller compared with XLC/XLD, withonly one hydrogen atom in the benzene ring changed to amethyl group in region 5 and another hydrogen atom in thefive-membered ring changed to a cyano group in region 6. Ineach of the two regions, one extra water molecule is displacedby the additional functional group of IIB, leading to a standardbinding free energy difference for the ligand pairs with a valueof −4.78 kcal/mol in Table 1. Further calculation of theindirect PMF contributions of the two displaced watermolecules discloses that the water in region 5 has a muchlarger indirect PMF (ω = 4.99 kcal/mol) than the water inregion 6 (ω = 1.89 kcal/mol) although their total PMFs andtherefore densities were similar as shown in Table 2 row 7 androw 8. To characterize the effect of displacement of these twowater molecules separately, we further designed two ligandmolecules IIB-P1 and IIB-P2 with structures shown in Figure4D,E. IIB-P1 is derived from the change of the hydrogen atomof IIA in region 5 to a methyl group with other parts of thestructure unperturbed so that only the water with the largervalue of ω was displaced. A significant gain in the bindingaffinity with ΔΔG = −3.04 kcal/mol was observed in thebinding of IIB-P1 with FXa. On the other hand, IIB-P2 isobtained by the replacement of the hydrogen atom in IIA with

a cyano group in region 6, resulting in the exclusion of a singlewater with the smaller value of ω. Correspondingly, only aslight gain in the binding affinity with ΔΔG = −0.77 kcal/molis observed for this transformation.In the above two pairs of congeneric ligands, the trans-

formation consists of the perturbation of the smaller functionalgroups in the initial ligand to larger functional groups in thefinal ligand which can displace several extra water moleculesdue to the enlargement of the excluded volume. In thisexample, we consider a pair of ligands RRP and RTR (chemicalstructure shown in Figure 5C) with almost equal excluded

volume, displacing the same number of water molecules but atdifferent positions. As shown in Figure 5, the ligands RRP andRTR are isomers with the substituted group − CN2H3 locatedat different positions on the aromatic ring. As the ligand istransformed from RRP to RTR in panel A, the para-substituentcauses the displacement of one water molecule in region 7 andat the same time another water molecule appears in region 8due to the annihilation of the meta-substituted group; while asthe ligand is changed from RTR to RRP in panel B, the meta-substituent causes the liberation of one water molecule inregion 8 and the occupation of region 7 by another watermolecule due to the disappearance of the para-substitutionalgroup. In this example, one water molecule disappears whileanother water molecule appears at the same time, which isdifferent from previous examples where water molecules aredisplaced as a smaller congeneric ligand is transformed to alarger one. In this case, a comparison of the thermodynamicsignatures of both the appearing and disappearing watermolecules is necessary in order to analyze the relative bindingaffinity change during the transformations. Region 7 is moreburied inside the binding pocket within which the water has a

Figure 4. (A) Transforming the ligand IIA to IIB causes thedisplacement of one water molecule in region 5. (B) Transforming IIAto IIB causes the displacement of one water molecule in region 6. (C)Chemical structures from IIA to IIB. (D) Chemical structure of IIB-P1. (E) Chemical structure of IIB-P2.

Figure 5. (A) Transforming the ligand RRP to RTR causes thedisplacement of one water molecule in region 7. (B) The backtransformation of RTR to RRP causes the displacement of one watermolecule in region 8. (C) Chemical structures from RRP to RTR. (D)Chemical structures from RTR to RRP.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

518

Page 8: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

stronger direct interaction with the protein compared withwater in region 8 as listed in Table 2 row 9 and row 10. Thefavorable water−solute interaction is partially counterbalancedby the unfavorable water−water interaction, with a large valueof ω = 6.25 kcal/mol for the displaced water in region 7. Similarto the water in region 3 shown in Table 2, the displaced waterin region 7 is also classified as a “high density hydrophilicwater”. Exclusion of a water molecule with this thermodynamicsignature will contribute more significantly to the enhancementof binding affinity compared with expulsion of water with smallω (0.68 kcal/mol) in region 8. Overall, the ligand RTR with thepara-substituent binds more tightly to FXa relative to RRP dueto the displacement of a “high density hydrophilic water”,which is in line with ΔΔG = −1.92 kcal/mol from Table 1.In the above discussion of FXa binding with pairs of

congeneric ligands, the transformed ligand with significantlyhigher binding affinity compared with the initial referenceligand, usually involves the displacement of water moleculeswith “hydrophilic” thermodynamic signatures (Table 3) thathave strong direct interactions with the protein. Owing to thecompetition between direct and indirect parts of the PMF forwaters with “hydrophilic” signatures, these waters also havelarge repulsive values of ω. If the indirect part approximatelycounterbalances the favorable direct part to keep the total PMFclose to zero, we classify these water molecules as “bulk densityhydrophilic water”; on the other hand, if the direct part issufficiently favorable that the indirect part is unable to offsetmost of the direct contribution, these water molecules areclassified as “high density hydrophilic water” with favorable WTcompared to the bulk. We have shown that displacement ofthese two types of interfacial water at the protein receptorinterface can make favorable contributions to the enhancementof the binding affinity; with the larger contribution associatedwith locations where the indirect PMF ω(x) is more repulsive.Ligands Binding to Streptavidin. In the next example, we

consider the displacement of another type of water molecule,one at a hydrophobic interface where the direct interactionbetween interfacial water and the protein is quite weak but thetotal PMF is very favorable due to the indirect PMFcontribution. It has been reported that the hydrophobicenclosure effect22 is important for the binding betweenstreptavidin and ligands in addition to the electrostaticpolarization effect.78 Following ref 22, we chose streptavidinas the target protein and identified a water molecule that islocated around a hydrophobic patch of streptavidin, which isclose to Gly-48 in region 9 as shown in Figure 6A. Furtheranalysis of the thermodynamic properties of the water in Table2 row 11 indicates that the direct interaction between the waterand protein is negligible with a value of −0.08 kcal/mol, but theindirect PMF is quite favorable ω = −5.02 kcal/mol. In Table 3,this type of water is defined as “high density hydrophobicwater”. To demonstrate the effect of the displacement of a“high density hydrophobic water” on protein−ligand binding,we designed a modified ligand based on biotin. Biotin, withstructure shown in Figure 6C, can bind to streptavidin with anextraordinarily strong binding affinity, ΔG = −18.3 kcal/mol,estimated from titration calorimetric measurements.51 Wereplaced the beta carbon with oxygen in biotin so that theexcluded volume of the modified ligand is smaller than that ofbiotin. The modified ligand is labeled as “biotin-O” withstructure shown in Figure 6C. As a result, when biotin-O ischanged to biotin as shown in Figure 6A, this requires thedisplacement of the “high density hydrophobic water” in region

9 that we identified. In Table 1, we compared the relativebinding affinity of ligand pairs biotin-O/biotin with streptavi-din. There is a gain in the binding affinity with a value of ΔΔG= −2.54 kcal/mol due to the displacement of the “high densityhydrophobic water”. We note that the calculated absolutestandard binding free energy between streptavidin and biotin is−9.94 kcal/mol, which deviates substantially from theexperimental result −18.3 kcal/mol. The reason for thisunderestimate suggests a force field issue since the potentialparameters we apply here are not polarizable and it has beenpointed out that electrostatic polarization is critical for thestrong binding between streptavidin and biotin.78 Thisunderestimate is not a key issue in this study since we areinterested in the relative binding affinity ΔΔG of the twoclosely related ligands biotin and biotin-O.We further notice that there is the formation of a five-

membered water ring in the binding pocket of streptavidin, inthe absence of bound ligand. It has been reported thatdisplacement of these entropy restricted water molecules alsoplays a significant role in the binding between streptavidin andbiotin from an analysis based on IST.22 We analyzed thethermodynamic properties of two water molecules belonging tothis five-membered water ring buried inside the binding pocketaround residues Ser-27, Ser-45, and Asp 128 in region 10 asshown in Figure 6B. Due to favorable hydrogen bonding withthe polar protein residues, these two water molecules displayquite strong direct interactions as shown in Table 2 row 12 androw 13. Consequently, the total PMFs are favorable (WT =−5.12 kcal/mol and WT = −3.58 kcal/mol respectively)although the positive indirect PMFs counterbalance some ofthe stabilizing effect from the direct interactions. In the ISTdiscussion,22 these two water molecules bear large entropicpenalties, which is consistent with the negative total PMFs inour discussion according to eq 6. On the basis of our previous

Figure 6. (A) Transforming the ligand biotin-O to biotin causes thedisplacement of one water molecule in region 9. (B) Transformingbiotin-deuri to biotin causes the displacement of two water moleculesin region 10. (C) Chemical structures from biotin-O to biotin. (D)Chemical structures from biotin-deuri to biotin.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

519

Page 9: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

definition, these two water molecules belong to the “highdensity hydrophilic water” category, and displacement of thesewaters is predicted to lead to a significant enhancement ofbinding affinity. To verify this, we designed another ligandbased on the biotin core as shown in Figure 6D. Comparedwith biotin, there is a lack of ureido group (−HNCONH−) inthe modified ligand, we labeled it as “biotin-deuri”. As the initialligand biotin-deuri is transformed to biotin, it involves thedisplacement of the two “high density hydrophilic waters” inregion 10 as shown in Figure 6B. There is a significant changein the relative binding affinity as the ligand is transformed frombiotin-deuri to biotin with ΔΔG = −7.36 kcal/mol as shown inTable 1, which is consistent with our previous discussion.Ligands Binding to MUP. Finally, we consider another

type of interfacial water that has been previously targeted byIST analysis29 for purposes of ligand design which correspondsto “low density dry water” as shown in Table 3. The water atthis position is much lower in density compared with bulk.Previously, it has been shown that the binding site of proteinMUP is buried inside the protein so it is a classic example tostudy the effect of “dry regions” in the ligand binding affinity.Here, we studied the binding between MUP with two ligands:OC9 and F09 (structure shown in Figure 7B). Double

decoupling calculations in Table 1 last row indicate the bindingfree energy difference to be small, ΔΔG = −0.75 kcal/mol. Asthe initial ligand OC9 changes to F09, it involves thedisplacement of one water molecule in region 11 as shown inFigure 7. Analysis of the thermodynamic signature of thisdisplaced water in the last row of Table 2 reveals that it is a“low density dry water” with WT > 0 (ρ(x)/ρ(∞) = 0.04), u ≃0 and ω > 0. In this example, the displacement of the “lowdensity dry water” at a location which is “dewet” contributes arelatively small amount to the enhancement of binding affinity(ΔΔG = −0.75 kcal/mol) compared with the first three typesof water signatures listed in Table 3.

■ DISCUSSIONIn the IST analysis, several criteria have been proposed asthermodynamic signatures of the interfacial water positions thatcan be regarded as targets for ligand design. In the initialapplication of “WaterMap”, it was found that displacement ofwater molecules with highly restricted 1-body entropy leads toenhanced binding affinity.22 In a further study of the relativebinding affinity between congeneric ligand pairs using “Water-Map”, a more quantitative scoring function was proposed basedon the following components: 1-body energy + 1-body entropy

+ 2-body energy.5,20 Later, scoring functions based on GISTwhich also involve 1-body energy, 1-body entropy, and 2-bodyenergy terms were presented, and it was further pointed outthat the energy terms play a dominant role over the entropyterm according to the fitted scoring function.26 In this work, wehave developed an alternative approach rooted in end pointDFT expressions which focuses on the effects of the direct andindirect PMF components of interfacial waters around thereceptor on the binding free energy differences between thecongeneric ligand pairs, binding to the same protein receptor.The indirect part of the PMF is equivalent to the sum of the 1-body energy + 1-body entropy terms in the IST expansionevaluated at fixed translational position and orientation of theinterfacial water molecule which is displaced by a functionalgroup of one of the congeneric ligands. Our analysis, which isbased on the evaluation of both the direct and indirectcomponents of the protein−solvent PMF, and therefore islimited to 1-body energy and 1-body entropy terms, appears tobe balanced and provides physical insights, as the change in thebinding free energies between congeneric ligands can be relatedto the thermodynamic signatures of individual water moleculesthat are displaced when a functional group is added to one ofthe congeneric ligand pairs.The above examples illustrate the empirical correlation

between ΔΔGbind of congeneric ligand pairs and the magnitudeof ω from the extra displaced water molecules. To visualize thisrelationship, we constructed a simple fitting function with twoparameters based upon ω from the displaced water moleculesto predict the standard binding free energy difference between apair of congeneric ligands.

∑ ωΔΔ = +=

G A Bi

n

ibind1 (9)

where ωi is the indirect PMF from water molecule number i.Here, A and B are two fitting parameters determined bymatching ΔΔGpredict with ΔΔGDDM, which is the free energydifference from the double decoupling calculation. Using resultsfrom double decoupling calculations as reference values insteadof experimental ones removes discrepancies from force fieldissue in the simulation. On the basis of the linear regressionfitting, parameter A was −0.551 and parameter B was −0.432.The final ΔΔGpredict versus ΔΔGDDM is plotted in Figure 8 withR2 = 0.84. We emphasize that we are not proposing eq 9 as ascoring function for ligand design; this subject is considered inthe Appendix.To motivate the physical basis for the correlation shown in

Figure 8, we refer to the Appendix which works through athermodynamic cycle for the difference between the free energyof binding for two congeneric ligands in terms of the excesschemical potentials of all the species in the cycle. The result ofthis analysis, eq A11, is a DFT based expression for ΔΔGbindwhich is exact if all the terms are evaluated. The terms are offive kinds: (1) local integrals involving the solvent density andindirect PMF ω over small volumes (Vdisp) at the protein−ligand receptor interface where one or a few waters aredisplaced during the process of an FEP type transformationfrom the smaller to the larger ligand of the congeneric pair; (2)the difference in the interaction of the congenic ligands with theprotein; (3) local integrals of the solvent density and ω overVdisp around the unbound ligand; (4) a term arising from thedifference in the DFT charging integrals between thecongeneric ligands in solution, and a corresponding change

Figure 7. (A) Transforming the ligand OC9 to F09 causes thedisplacement of one water molecule in region 11. (B) Chemicalstructures from OC9 to F09.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

520

Page 10: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

between the complexes with the congeneric ligands bound; and(5) integrals from more distant regions (beyond Vdisp) arisingfrom the changes in the solvent density and ω between the twoligands or two complexes. The empirical relation eq 9, suggeststhat a useful approximation to be applied to eq A11 is toneglect the contribution to ΔΔG from changes in the solventstructure beyond Vdisp and to neglect the DFT charging terms;therefore to assume that the first two terms involving end pointDFT expressions for the bound complex, along with the changein interaction term, make the most important contributions toΔΔGbind. If we make this assumption, it becomes apparent thatwhen a “high density hydrophilic water” is displaced by afunctional group which is added to a ligand core, the drivingforce for the more favorable binding of the larger ligandcompared with smaller ligand is mainly from the interactionbetween the protein and the added functional group of thelarger ligand, corresponding to the ΔUpro‑lig term in eq A11.The first term in eq A11 corresponds to the difference insolvent density, between the density at the location of thesolvent molecule to be displaced, and the bulk. When the waterto be displaced is “high density hydrophilic water”, the firstterm is positive and contributes unfavorably to ΔΔGbind.Furthermore, when water has a “high density hydrophilic”thermodynamic signature, ω is positive and the second term isalso unfavorable. Therefore, the favorable interactions betweenthe protein and added functional group from the larger ligandwill be opposed by the first and second terms in eq A11. Whenω is positive, this corresponds to an unfavorable interaction forthe tagged water with the surrounding waters and a veryfavorable interaction with the protein (see Table 2). Thissuggests a ligand design strategy which adds a hydrophilicfunctional group to the original ligand core to replace thehydrophilic water and recover the favorable direct interactionbetween the tagged water and protein. Although the favorableinteraction is weakened by the solvation effect from the firstand second terms in eq A11, ultimately, the added functionalgroup can still drive the increased affinity through a favorablevalue of the ΔUpro‑lig term of eq A11. This actually accentuatesthe importance of the competition between the direct andindirect terms in the DFT analysis approach for “high densityhydrophilic water”. In the case of “high density hydrophobicwater”, in contrast, the contribution from the change ofinteraction between the protein and larger ligand compared

with smaller ligand is small. In this case, the first term in eq A11is still unfavorable due to the high density nature. On the otherhand, the high density corresponding to large ρ(x) is stronglycorrelated with a negative ω, which makes the second term ineq A11 favorable for the binding. Therefore, the first andsecond terms compensate partially against each other. Forinterfacial waters with thermodynamic signature “high densityhydrophobic water”, the binding is driven by the second term;the displacement of water with negative ω contributes favorablyto ΔΔGbind. Finally, in the case of “low density dry water”, u ≈0 indicates that the contribution from ΔUpro‑lig term is alsonegligible in this case. However, the roles of the first andsecond terms in eq A11 in determining the ΔΔGbind are theinverse from the case of “high density hydrophobic water”. Thefirst term in eq A11 is favorable since ρ(x) is smaller than ρ0,while ω > 0 implies that the second term in eq A11 isunfavorable. There is still a competition between the first andsecond terms. But this time, the first term provides the drivingforce for more favorable binding when a “low density drywater” is displaced by a ligand functional group. On the basis ofthe analysis presented in the Appendix, it can be shown that thefirst term is larger in magnitude than the second, so that theeffect of placing a ligand functional group at dry cavity whichforms part of the protein binding site will be favorable, but theeffect may be small as is the case in our example.From an analysis of eq A11, it should be clear that the sign of

the indirect PMF cannot be used to determine the effect ofwater displacement on the strength of the ligand binding.Stated in another way, the sign of the first order contribution(1-body energy + 1-body entropy) of the IST solvation freeenergy of a solvent molecule displaced during the trans-formation of ligand to its congeneric pair, is not determinativeof the sign of ΔΔGbind. Instead, water locations with largeabsolute values of the indirect PMF ω, serve as targets forligand design to enhance the binding affinity.We note the fitting function, eq 9, is greatly oversimplified.

For one the density of the displaced solvent molecule does notappear explicitly. Furthermore, and most importantly, thedifference in the binding affinity for a given pair of congenericligands with a protein receptor is not only determined by thedisplaced solvent. Changes in the interactions between proteinand ligand pairs also make a contribution to the free energydifference as shown by ΔUpro‑lig term in eq A11; this term islargely responsible for the gain in affinity when “high densityhydrophilic water” is displaced by adding a functional group toa ligand core. Construction of more realistic scoring functions,such as eq A11, based on the analysis of the protein−solventPMF and its direct and indirect components to predict ΔΔGbindshould also consider the change in the interactions between theprotein and ligand in addition to the density-weighted indirectPMF contribution. To construct scoring functions based on thespatially resolved, end-point DFT framework, it will benecessary to develop a more automated procedure forcalculating the solute−solvent PMF and its indirect partω(x), as well as for calculating the charging integral (densityfunctional) systematically throughout the protein−ligandbinding interface. Work along these lines is underway in ourgroups.

■ CONCLUSIONBuilding upon the insights gained from the extensive prioranalysis of the binding free energy between protein receptors(e.g., FXa and Streptavidin) and pairs of congeneric

Figure 8. ΔΔGpredict based on eq 9 versus ΔΔGDDM using doubledecoupling method from 12 congeneric ligand pairs.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

521

Page 11: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

ligands,5,22,26 we describe an alternative approach based on thethermodynamic signatures of displaced water molecules at theinterface of a protein receptor with a view toward aiding thedesign of more potent binding ligands with higher bindingaffinity. The idea of designing tight binding ligands by analyzingthe thermodynamic signatures of solvent molecules around thetarget protein receptor has been described previ-ously.5,20−22,26,27 The prior work is based on the ISTframework which analyzes the thermodynamic signatures ofinterfacial water in terms of excess energies and entropies. Inthis work, we develop an alternative approach rooted in DFTwhich focuses on the effect of the indirect contribution to thesolute−solvent PMF of interfacial waters around the receptor ingoverning the binding free energy changes when functionalgroups are added to a ligand core. We identified three types ofinterfacial water positions that can be regarded as targets forligand design as summarized in Table 3: “bulk densityhydrophilic water”, “high density hydrophilic water” and“high density hydrophobic water”. Considering the fourthtype of interfacial water thermodynamic signature we studied,“low density dry” water, we found that for the one example weanalyzed, the contribution to the binding affinity was small.While this result is consistent with an analytic argument wepresent in the Appendix concerning the relative contributionsto ligand binding from the displacement of high densityhydrophobic versus low density dry interfacial waters, moresystematic study of this is required. Our major conclusion isthat the displacement of a water molecule with an indirectsolute−solvent PMF term which is large in magnitude,regardless of the sign, can make a significant contribution totighter binding.For the displacement of a “high density hydrophilic water”

molecule with large positive ω(x), one goal in further design ofligands involves modification of the original ligand structurearound a druggable site to mimic and enhance the favorableinteraction between the displaced water and the receptor, whichalso disrupts the neighboring water structure to a smallerextent. In this case, the dominant effect in determining therelative binding affinity arises from the change in the interactionterm (ΔUpro‑lig) in eq A11. Interfacial water locations whichcontribute favorably to the excess chemical potential of thesolute (ω(x) > 0 in eq 5) and locations which contributeunfavorably to the excess chemical potential of the solute (ω(x)< 0 in eq 5) can both be targeted for displacement by adding afunctional group to a reference ligand which will improve thebinding affinity. Furthermore, we have found that a druggablesite that can be targeted for further ligand design does notnecessarily reside at positions where the solvent density is muchhigher than the bulk, which is usually required from thehydration site approach (HSA) in IST.5,22,31 In Table 2 row 1,we identified a type of “bulk density hydrophilic water” inregion 1. From our end point DFT analysis approach, a watermolecule in this region bears both large favorable directinteractions with the solute and unfavorable indirect con-tributions from surrounding solvent which almost completelycancel each other, resulting in a total PMF close to 0. Ouranalysis indicates that the displacement of such a bulk densityhydrophilic water by adding a functional group to a ligandscaffold nearby can also result in a significant gain in thebinding affinity.

■ APPENDIX

In this Appendix we highlight the roles of the indirect (solvent-mediated) part of the potential of mean force (PMF) in thestatistical thermodynamics of protein−ligand binding. We baseour formalism on the DFT expression of eq 5 and discuss whatinformation is carried by ω(x) for the difference in the bindingfree energy ΔΔGbind between congeneric pairs of ligands. Anapproximate argument is presented, in particular, for the casesof “high density hydrophobic water” and “low density drywater”, for which the changes in the interactions between theprotein and ligand are weak when an added functional group ofa congeneric pair displaces interfacial water molecules, and thethermodynamic signature of the displaced waters is likely to bethe key factor in the determination of ΔΔGbind.Consider a ligand S with smaller excluded volume and a

ligand L with larger excluded volume binding to the sameprotein. The excess chemical potential of S in water is denotedas ΔμS and the excess chemical potential of L in water isdenoted as ΔμL. The free energy of changing S to L in water isdenoted as ΔGwat,lig and the free energy of changing S to L invacuum is denoted as ΔGvac,lig. According to the thermody-namic cycle shown in Scheme 1A:

μ μΔ − Δ = Δ − ΔG GL S wat,lig vac,lig (A1)

Label the complex between the protein and ligand S as PSand the complex between the protein and ligand L as PL. Theexcess chemical potentials of PS and PL in water are denoted asΔμPS and ΔμPL, respectively. The free energies of changing PSto PL in water and vacuum are denoted as ΔGwat,com andΔGvac,com, respectively. According to the thermodynamic cycleshown in Scheme 1B:

μ μΔ − Δ = Δ − ΔG GPL PS wat,com vac,com (A2)

Combining eqs A1 and A2 we obtain

μ μ μ μ

Δ − Δ

= Δ − Δ − Δ − Δ

+ Δ − Δ

G G

G G

( ) ( )

( )

wat,com wat,lig

PL PS L S

vac,com vac,lig (A3)

According to the thermodynamic cycle applied in FEPcalculation shown in Scheme 2, the left-hand side of theabove equation (ΔGwat,com − ΔGwat,lig) can be related to therelative binding free energy between the ligand pair S and Lwith the proteins in water:

ΔΔ = Δ − Δ = Δ − Δ‐ ‐G G G G Gbind bind L bind s wat,com wat,lig

(A4)

Scheme 1. Thermodynamic Cycle for ΔΔGbind by DFTAnalysis of Displaced Watera

a(A) S change to L in vacuum and water. (B) PS change to PL invacuum and water.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

522

Page 12: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

where ΔGbind‑L is the binding free energy between ligand L andprotein and ΔGbind‑s is the binding free energy between ligand Sand protein. Similarly, the last term of the right-hand side of eqA3 (ΔGvac,com − ΔGvac,lig) can be expressed as ΔΔGvac, which isthe binding free energy difference between ligand pair L and Swith protein in vacuum and it can be approximated by ΔUpro‑lig,the difference of the interaction between the protein and ligandas the binding ligand change from S to L, when the thermalmotion of the protein and ligands are not considered. (We notethat in the current work the thermal motion of the protein andligands are not considered in the calculations involving directand indirect PMF of displaced water, but they are included inthe double decoupling free energy simulations to estimateΔΔGbind for a congeneric ligand pair.) On the basis of eqs A3and A4:

μ μ μ μΔΔ = Δ − Δ − Δ − Δ + Δ ‐G U( ) ( )bind PL PS L S pro lig

(A5)

This is the expression of ΔΔGbind based on the difference ofexcess chemical potential, which can be evaluated based onDFT analysis. According to eq 5 in the main text, ΔμS can beexpressed as

∫∫

μ ρ ρ

ρ ω ω

Δ = − − ∞

− − ∞ +

k T x x

x x x

d [ ( ) ( )]

d ( )[ ( ) ( )] charging term

s B s s

s s s

(A6)

where ρS is the water density around S and ωS is the indirectPMF derived from it. Similarly, ΔμL can be expressed as

∫∫

μ ρ ρ

ρ ω ω

Δ = − − ∞

− − ∞ +

k T x x

x x x

d [ ( ) ( )]

d ( )[ ( ) ( )] charging term

L B L L

L L L

(A7)

where ρL is the water density around L and ωL is the indirectPMF derived from it. When the difference of ΔμL from ΔμS isdominated by those water molecules that are displaced whenthe ligand is grown from S to L, it can be expressed as

μ μ ρ ρ ρ

ρ ω ω

ρ ω

Δ − Δ = − ∞ +

+ − ∞

+ +

k T k T V

d

V

S L

x x

x x x

d [ ( ) ( )]

( )[ ( ) ( )]

change in charging term terms for the integration

outside arising from the changes in and

between and

V

s s

V

s s s

L s B B 0disp

disp

disp

disp

(A8)

where the domain of integration is restricted to the region inwhich water is found in system S but not in system L. It isfurther assumed that the changes in the solvent density and

indirect PMF are minor outside the domain of integration ofVdisp, where water is not displaced and we express this part ofcontribution to the change in excess chemical potential as anintegral over the volume V > Vdisp. A similar expression holdsfor the protein−ligand complexes PS and PL:

μ μ ρ ρ

ρ ρ ω ω

ρ ω

Δ − Δ = − ∞

+ + − ∞

+ +

k T

k T V

V

x x

x x x

d [ ( ) ( )]

d ( )[ ( ) ( )]

change in charging term terms for the integration

outside arising from the changes in and

between PS and PL

V

V

PL PS B PS PS

B 0disp

PS PS PS

disp

disp

disp

(A9)

According to eq A8 and A9:

μ μ μ μ

ρ ρ

ρ ω ω

ρ ρ

ρ ω ω

ρ ω

Δ − Δ − Δ − Δ

= − ∞

+ − ∞

− − ∞

− − ∞

+ +

k T

k T

V

x x

x x x

x x

x x x

( ) ( )

d [ ( ) ( )]

d ( )[ ( ) ( )]

d [ ( ) ( )]

d ( )[ ( ) ( )]

change in charging term terms for the integration

outside arising from the changes in and

V

V

V

V

PL PS L S

B PS PS

PS PS PS

B S S

S s s

disp

disp

disp

disp

disp

(A10)

Combining eq A5 and A10:

ρ ρ

ρ ω

ρ ρ

ρ ω

ρ ω

ΔΔ = −

+ + Δ

− −

+ +

G k T

x U

k T

V

x x

x x

x x

x x x

d [ ( ) ]

d ( ) ( )

{ d [ ( ) ]

d ( ) ( )

change in charging term terms for the

integration outside

arising from the changes in and }

V

V

V

V

s

bind B PS 0

PS PS pro lig

B s 0

s

disp

disp

disp

disp

disp

(A11)

where we have replaced the bulk values of water density andindirect PMF to the ones in the thermodynamic limit since theintegration is done over the finite volume. The first three termscapture the thermodynamic effects of displacing interfacialwaters closest to the protein, and changes in the protein−ligandinteractions when a functional group is added to a ligand coreof a congeneric pair; and in the following we focus on thoseterms. We further suppose that the functional group added tothe ligand is polar when the water molecule to be displaced is of“high density hydrophilic” type and is nonpolar for adisplacement of “high density hydrophobic water” or “lowdensity dry water”. In the numerical treatments in the maintext, ρPS and ωPS were approximated by using the density andindirect PMF around the protein without the ligand.

Scheme 2. Thermodynamic Cycle for ΔΔGbind by FEP

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

523

Page 13: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

For a “high density hydrophilic water”, ρPS(x) > ρ0 andωPS(x) > 0. This means that the first two terms of eq A11contribute unfavorably to ΔΔGbind. It is then suggested that thedriving force for obtaining a favorable ΔΔGbind is ΔUpro‑lig.ΔUpro‑lig needs to be a “strong” interaction to overturn theeffect of the first two terms of eq A11, and thus a functionalgroup with a polar interaction is added to the ligand.When the displaced water is “high density hydrophobic

water” or “low density dry water”, its interaction with theprotein is weak. This is rephrased as u ≈ 0, and with oursupposition, ΔUpro‑lig is further considered to be minor. Thesum of the first to third terms of eq A11 then reduces to

∫ ∫

ρ ρ ρ ω

ρ

βω

ρ βω

− +

= −

+

≈ − +

β βω

β βω

βω βω

− −

− −

− −

k T x x

k T

k T x

x x x

x

x

x

d [ ( ) ] d ( ) ( )

d [e 1

e ( )]

d [e 1 e ( )]

V V

Vu

u

V

x x

x x

x x

B PS 0 PS PS

B 0( ) ( )

( ) ( )PS

B 0( ) ( )

PS

disp disp

disp

PS

PS

disp

PS PS

(A12)

On the rightmost side, the integrand is always negative for anonzero ωPS, irrespective of its sign, and becomes larger inmagnitude with |ωPS| in both the positive and negative regionsof ωPS. To prove this, let f(z) = exp(−z) − 1 + z exp(−z),where z corresponds to βωPS in eq A12. f(0) = 0 and df(z)/dz= −z exp(−z). This derivative is negative when z > 0, and ispositive when z < 0. f(z) thus increases with z when z < 0 anddecreases with z when z > 0, implying that f(z) < f(0) = 0 if z isnot equal to 0. Therefore, in the case of “high densityhydrophobic water” and “low density dry water” with u ≈ 0, thesum of the first and second terms makes a favorablecontribution to ΔΔGbind according to eq A12. We expect thatthese two terms can account for most of ΔΔGbind of acongeneric ligand pair involving the addition of a hydrophobicfunctional group which displaces a “high density hydrophobic”or “low density dry water”.It is also possible to show that a “high density hydrophobic

water” has a stronger effect than a “low density dry water”. Tosee this point, we note that f(∞) = −1. Thus, since f(z) is adecreasing function of z when z > 0, the rightmost side of eqA12 satisfies

ρ ρ

ρ βω

− = −

< − + <βω βω− −

k T V k T

k T

x

x x

d

d [e 1 e ( )] 0

V

Vx x

B 0disp

B 0

B 0( ) ( )

PS

disp

disp

PS PS

(A13)

This shows that ωPS of “low density dry water” (with ωPS > 0)contributes to ΔΔGbind in a range between 0 and − kBTρ0V

disp.When Vdisp is such that ρ0V

disp is close to 1, the contribution isbetween 0 and −kBT. Accordingly, the rightmost side of eq A12is more favorable (more negative) for a “high densityhydrophobic water” (with ωPS < 0) than for a “low densitydry water” when

ω⟨ ⟩ < −k TPS B (A14)

holds, where ⟨ωPS⟩ is the average of ωPS in the Vdisp regionintroduced as

∫ω

ρ ω

ρ⟨ ⟩ =

x x x

x x

d ( ) ( )

d ( )

V

VPSPS PS

PS

disp

disp

(A15)

Equation A14 means that the average of ωPS in the Vdisp regionis more negative than −kBT, which is valid in Table 2.Therefore, a “high density hydrophobic water” can be a goodtarget for ligand design, while a “low density dry water” maynot be. We plan further analysis of this conjecture.Finally, we remark on the numerical treatment of the first

two terms of eq A11. Let

∫ ρ=N x xd ( )V

av PS

disp

(A16)

be the average number of water molecules in the Vdisp region, inwhich the integration is taken over the space and over theorientation. This situation is common to eq A15, where thespatial and orientational averaging is done. The first two termsof eq A11 can then be rewritten as

∫ ∫ρ ρ ρ ω

ρ ω

− +

= − + ⟨ ⟩

k T

k T N V N

x x x x xd [ ( ) ] d ( ) ( )

( )

V V

B PS 0 PS PS

B av 0 disp av PS

disp disp

(A17)

In MD, a finite sampling is possible for the water configurationx. When we sample many xi within the Vdisp region (i = 1, ..., n),⟨ωPS⟩ is calculated as

∑ ωn

x1

( )i

iPS(A18)

since xi appears with a probability proportional to ρPS(xi). Thecalculations performed in the present work correspond to n = 1,that this estimate is reasonable can be (semiquantitatively)validated as follows. A “high density hydrophilic water” will berestrained in terms of the position and orientation, meaningthat ρPS(x) has a sharp peak at the x sampled during the courseof MD. Thus, a single or a few snapshots (small n) can berepresentative and are expected to provide the average of eqA15 without doing harm for semiquantitative arguments suchas Figure 8. For a “high density hydrophobic water” or “lowdensity dry water”, on the other hand, water can move ratherfreely and ρPS(x) is a broad distribution. This implies that sincethe direct interaction u is small, ωPS does not depend stronglyon x in the Vdisp. Thus, ⟨ωPS⟩ is considered to be close toωPS(x) sampled in MD, leading to sufficiency of small n in eqA18. The dependence of a computed ⟨ωPS⟩ on the number ofsamplings n is a subject of a forthcoming work, in which apredictive scheme for ΔΔGbind is explored on the basis of eqA11 that is derived through DFT.

■ AUTHOR INFORMATIONCorresponding Authors*E-mail: [email protected].*E-mail: [email protected].

ORCIDBin W. Zhang: 0000-0003-3007-4900Nobuyuki Matubayasi: 0000-0001-7176-441XRonald M. Levy: 0000-0001-8696-5177NotesThe authors declare no competing financial interest.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

524

Page 14: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

■ ACKNOWLEDGMENTS

This work has been supported in part by NIH Grant GM30580and by an NIH computer equipment grant, Grant No.OD020095. This work is also supported by the Grants-in-Aidfor Scientific Research (Nos. JP15K13550 and JP26240045)from the Japan Society for the Promotion of Science, by theElements Strategy Initiative for Catalysts and Batteries and thePost-K Supercomputing Project from the Ministry ofEducation, Culture, Sports, Science, and Technology ofJapan, and by the HPCI System Research Project (ProjectIDs: hp170097 and hp170221).

■ REFERENCES(1) Lemieux, R. How Water Provides the Impetus for MolecularRecognition in Aqueous Solution. Acc. Chem. Res. 1996, 29, 373−380.(2) Ladbury, J. Just Add Water! The Effect of Water on theSpecificity of Protein-Ligand Binding Sites and Its PotentialApplication to Drug Design. Chem. Biol. 1996, 3, 973−980.(3) Mancera, R. Molecular Modeling of Hydration in Drug Design.Curr. Opin. Drug Discovery Dev. 2007, 10, 275−280.(4) Li, Z.; Lazaridis, T. Water at Biomolecular Binding Interface.Phys. Chem. Chem. Phys. 2007, 9, 573−581.(5) Abel, R.; Young, T.; Farid, R.; Berne, B.; Friesner, R. Role of theActive-Site Solvent in the Thermodynamics of Factor Xa LigandBinding. J. Am. Chem. Soc. 2008, 130, 2817−2831.(6) Abel, R.; Wang, L.; Friesner, R.; Berne, B. A Displaced-SolventFunctional Analysis of Model Hydrophobic Enclosures. J. Chem.Theory Comput. 2010, 6, 2924−2934.(7) Baron, R.; Setny, P.; McCammon, J. Water in Cavity-LigandRecognition. J. Am. Chem. Soc. 2010, 132, 12091−12097.(8) Hummer, G. Molecular Binding: Under Water’s Influence. Nat.Chem. 2010, 2, 906−907.(9) Beuming, T.; Che, Y.; Abel, R.; Kim, B.; Shanmugasundaram, V.;Sherman, W. Thermodynamics Analysis of Water Molecules at theSurface of Proteins and Applications to Binding Site Prediction andCharacterization. Proteins: Struct., Funct., Genet. 2012, 80, 871−883.(10) Graziano, G. Molecular Driving Forces of the Pocket-LigandHydrophobic Association. Chem. Phys. Lett. 2012, 533, 95−99.(11) Riniker, S.; Barandun, L.; Diederich, F.; Kramer, O.; Steffen, A.;van Gunsteren, W. Free Enthalpies of Replacing Water Molecules inProtein Binding Pockets. J. Comput.-Aided Mol. Des. 2012, 26, 1293−1309.(12) Mondal, J.; Friesner, R.; Berne, B. Role of Desolvation inThermodynamics and Kinetics of Ligand Binding to a Kinase. J. Chem.Theory Comput. 2014, 10, 5696−5705.(13) Cui, D.; Ou, S.; Patel, S. Protein-Spanning Water Networks andImplications for Prediction of Protein-Protein Interactions Mediatedthrough Hydrophobic Effects. Proteins: Struct., Funct., Genet. 2014, 82,3312−3326.(14) Vukovic, S.; Brennan, P.; Huggins, D. Exploring the Role ofWater in Molecular Recognition: Predicting Protein LigandabilityUsing a Combinatorial Search of Surface Hydration Sites. J. Phys.:Condens. Matter 2016, 28, 344007.(15) Raman, E.; MacKerell, A. Rapid Estimation of HydrationThermodynamics of Macromolecular Regions. J. Chem. Phys. 2013,139, 055105.(16) Widom, B. Structure of Interfaces from Uniformity of theChemical Potential. J. Stat. Phys. 1978, 19, 563−574.(17) Ben-Amotz, D. Interfacial Solvation Thermodynamics. J. Phys.:Condens. Matter 2016, 28, 414013.(18) Lazaridis, T. Inhomogeneous Fluid Approach to SolvationThermodynamics. 1. Theory. J. Phys. Chem. B 1998, 102, 3531−3541.(19) Lazaridis, T. Inhomogeneous Fluid Approach to SolvationThermodynamics. 2. Applications to Simple Fluids. J. Phys. Chem. B1998, 102, 3542−3550.

(20) Li, Z.; Lazaridis, T. The Effect of Water Displacement onBinding Thermodynamics: Concanavalin A. J. Phys. Chem. B 2005,109, 662−670.(21) Li, Z.; Lazaridis, T. Thermodynamics of Buried Water Clustersat a Protein-Ligand Binding Interface. J. Phys. Chem. B 2006, 110,1464−1475.(22) Young, T.; Abel, R.; Kim, B.; Berne, B.; Friesner, R. Motifs forMolecular Recognition Exploring Hydrophobic Enclosure in Protein-Ligand Binding. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 808−813.(23) Snyder, P.; Mecinovic, J.; Moustakas, D.; Thomas, S., III;Harder, M.; Mack, E.; Lockett, M.; Heroux, A.; Sherman, W.;Whitesides, G. Mechanism of the Hydrophobic Effect in theBiomolecular Recognition of Arglsulfonamides by Carbonic Anhy-drase. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 17889−17894.(24) Huggins, D. Quantifying the Entropy of Binding for WaterMolecules in Protein Cavities by Computing Correlations. Biophys. J.2015, 108, 928−936.(25) Li, Z.; Lazaridis, T. Thermodynamic Contributions of theOrdered Water Molecule in HIV-1 Protease. J. Am. Chem. Soc. 2003,125, 6636−6637.(26) Nguyen, C.; Cruz, A.; Gilson, M.; Kurtzman, T. Thermody-namics of Water in an Enzyme Active Site: Grid-Based HydrationAnalysis of Coagulation Factor Xa. J. Chem. Theory Comput. 2014, 10,2769−2780.(27) Nguyen, C.; Young, T.; Gilson, M. Grid InhomogeneousSolvation Theory: Hydration Structure and Thermodynamics of theMiniature Receptor Cucurbit[7]uril. J. Chem. Phys. 2012, 137, 044101.(28) Li, Z.; Lazaridis, T. Computing the ThermodynamicContributions of Interfacial Water. Methods Mol. Biol. 2012, 819,393−404.(29) Wang, L.; Berne, B.; Friesner, R. Ligand Binding to Protein-Binding Pockets with Wet and Dry Regions. Proc. Natl. Acad. Sci. U. S.A. 2011, 108, 1326−1330.(30) Abel, R.; Salam, N.; Shelley, J.; Farid, R.; Friesner, R.; Sherman,W. Contribution of Explicit Solvent Effects to the Binding Affinity ofSmall-Molecule Inhibitors in Blood Coagulation Factor SerineProteases. ChemMedChem 2011, 6, 1049−1066.(31) Haider, K.; Huggins, D. Combining Solvent ThermodynamicProfiles with Functionality Maps of the Hsp90 Binding Site to Predictthe Displacement of Water Molecules. J. Chem. Inf. Model. 2013, 53,2571−2586.(32) Pearlstein, R.; Hu, Q.; Zhou, J.; Yowe, D.; Levell, J.; Dale, B.;Kaushik, V.; Daniels, D.; Hanrahan, S.; Sherman, W.; Abel, R. NewHypotheses about the Structure-Function of Proprotein ConvertaseSubtilisin/Kexin Type 9: Analysis of the Epidermal Growth Factor-likeRepeat a Docking Site Using WaterMap. Proteins: Struct., Funct., Genet.2010, 78, 2571−2586.(33) Beglov, D.; Roux, B. An Integral Equation to Describe theSolvation of Polar Molecules in Liquid Water. J. Phys. Chem. B 1997,101, 7821−7826.(34) Kovalenko, A.; Hirata, F. Three-Dimensional Density Profiles ofWater in Contact with a Solute of Arbitrary Shape: A RISM Approach.Chem. Phys. Lett. 1998, 290, 237−244.(35) Truchon, J.; Pettitt, M.; Labute, P. A Cavity Corrected 3D-RISM Functional for Accurate Solvation Free Energies. J. Chem.Theory Comput. 2014, 10, 934−941.(36) Bayden, A.; Moustakas, D.; Joseph-McCarthy, D.; Lamb, M.Evaluating Free Energies of Binding and Conservation of Crystallo-graphic Waters Using SZMAP. J. Chem. Inf. Model. 2015, 55, 1552−1565.(37) Goodford, P. A Computational Procedure for DeterminingEnergetically Favorable Binding Sites on Biologically ImportantMacromolecules. J. Med. Chem. 1985, 28, 849−857.(38) Evans, R. The Nature of the Liquid-Vapour Interface and OtherTopics in the Statistical Mechanics of Non-uniform, Classical Fluids.Adv. Phys. 1979, 28, 143−200.(39) Henderson, D. Fundamentals of Inhomogeneous Fluids; MarcelDekker: New York, 1992.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

525

Page 15: The Role of Interfacial Water in Protein Ligand Binding ... · The Role of Interfacial Water in Protein−Ligand Binding: Insights from the Indirect Solvent Mediated Potential of

(40) Hansen, J.; McDonald, I. Theory of Simple Liquids; AcademicPress: London, 1986.(41) Jeanmairet, G.; Levesque, M.; Vuilleumier, R.; Borgis, D.Molecular Density Functional Theory of Water. J. Phys. Chem. Lett.2013, 4, 619−624.(42) Gendre, L.; Ramirez, R.; Borgis, D. Classical Density FunctionalTheory of Solvation in Molecular Solvents: Angular Grid Implemen-tation. Chem. Phys. Lett. 2009, 474, 366−370.(43) Zhao, S.; Ramirez, R.; Vuilleumier, R.; Borgis, D. MolecularDensity Functional Theory of Solvation: From Polar Solvents toWater. J. Chem. Phys. 2011, 134, 194102.(44) Borgis, D.; Gendre, L.; Ramirez, R. Molecular DensityFunctional Theory: Application to Solvation and Electron-TransferThermodynamics in Polar Solvents. J. Phys. Chem. B 2012, 116, 2504−2512.(45) Matubayasi, N.; Nakahara, M. Theory of Solutions in theEnergetic Representation. I. Formulation. J. Chem. Phys. 2000, 113,6070−6081.(46) Matubayasi, N.; Nakahara, M. Theory of Solutions in the EnergyRepresentation. II. Functional for the Chemical Potential. J. Chem.Phys. 2002, 117, 3605−3616; J. Chem. Phys. 2003, 118, 2446(erratum)..(47) Matubayasi, N.; Nakahara, M. Theory of Solutions in the EnergyRepresentation. III. Treatment of the Molecular Flexibility. J. Chem.Phys. 2003, 119, 9686−9702.(48) Harris, R.; Deng, N.; Levy, R.; Ishizuka, R.; Matubayasi, N.Computing Conformational Free Energy Differences in ExplicitSolvent: An Efficient Thermodynamic Cycle Using an Auxiliary. J.Comput. Chem. 2017, 38, 1198−1208.(49) Matubayasi, N. Free-Energy Analysis of Protein Solvation withAll-Atom Molecular Dynamics Simulation Combined with a Theory ofSolutions. Curr. Opin. Struct. Biol. 2017, 43, 45−54.(50) Levy, R.; Cui, D.; Zhang, B.; Matubayasi, N. RelationshipBetween Solvation Thermodynamics from IST and DFT Perspectives.J. Phys. Chem. B 2017, 121, 3825−3841.(51) Weber, P.; Wendoloski, J.; Pantoliano, M.; Salemme, F.Crystallographic and Thermodynamic Comparison of Natural andSynthetic Ligands Bound to Streptavidin. J. Am. Chem. Soc. 1992, 114,3197−3200.(52) Beck, T.; Paulaitis, M.; Pratt, L. The Potential DistributionTheorem and Models of Molecular Solutions; Cambridge UniversityPress: Cambridge, 2012.(53) Kirkwood, J. Statistical Mechanics of Fluid Mixtures. J. Chem.Phys. 1935, 3, 300−313.(54) Huggins, D.; Marsh, M.; Payne, M. Thermodynamic Propertiesof Water Molecules at Protein-Protein Interaction Surface. J. Chem.Theory Comput. 2011, 7, 3514−3522.(55) Matubayasi, N.; Gallicchio, E.; Levy, R. On the Local andNonlocal Components of Solvation Thermodynamics and TheirRelation to Solvation Shell Models. J. Chem. Phys. 1998, 109, 4864−4872.(56) Huggins, D.; Payne, M. Assessing the Accuracy ofInhomogeneous Fluid Solvation Theory in Predicting HydrationFree Energies of Simple Solutes. J. Phys. Chem. B 2013, 117, 8232−8244.(57) Nguyen, C.; Kurtzman, T.; Gilson, M. Spatial Decomposition ofTranslational Water-Water Correlation Entropy in Binding Pockets. J.Chem. Theory Comput. 2016, 12, 414−429.(58) Berendsen, H.; van der Spoel, D.; van Drunen, R. ROMACS: AMessage-Passing Parallel Molecular Dynamics Implementation.Comput. Phys. Commun. 1995, 91, 43−56.(59) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS4: Algorithms for Highly Efficient, Load-Balanced, and ScalableMolecular Simulation. J. Chem. Theory Comput. 2008, 4, 435−447.(60) Pronk, S.; Pall, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.;Apostolov, R.; Shirts, M. R.; Smith, J. C.; Kasson, P. M.; van der Spoel,D.; Hess, B.; Lindahl, E. GROMACS 4.5: A High-Throughput andHighly Parallel Open Source Molecular Simulation Toolkit. Bio-informatics 2013, 29, 845−854.

(61) Gilson, M.; Given, J.; Bush, B.; McCammon, J. The Statistical-Thermodynamic Basis for Computation of Binding Affinities: ACritical Review. Biophys. J. 1997, 72, 1047−1069.(62) Adler, M.; Kochanny, M.; Ye, B.; Rumennik, G.; Light, D.;Biancalana, S.; Whitlow, M. Crystal Structures of Two PotentNonamidine Inhibitors Bound to Factor Xa. Biochemistry 2002, 41,15514−15523.(63) Jorgensen, W.; Jenson, C. Temperature Dependence of TIP3P,SPC, and TIP4P Water from NPT Monte Carlo Simulations: SeekingTemperatures of Maximum Density. J. Comput. Chem. 1998, 19,1179−1186.(64) Jorgensen, W.; Chandrasekhar, J.; Madura, J.; Impey, R.; Klein,M. Comparison of Simple Potential Functions for Simulating LiquidWater. J. Chem. Phys. 1983, 79, 926−935.(65) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis,J.; Dror, R.; Shaw, D. Improved Side-Chain Torsion Potentials for theAmber ff99SB Protein Force Field. Proteins: Struct., Funct., Genet. 2010,78, 1950−1958.(66) Case, D.; Betz, R.; Botello-Smith, W.; Cerutti, D.; Cheatham, T.;Darden, T.; Duke, R.; Giese, T.; Gohlke, H.; Goetz, A.; Homeyer, N.;Izadi, S.; Janowski, P.; Kaus, J.; Kovalenko, A. et al. Amber 14;University of California: San Francisco, CA, 2016.(67) Boresch, S.; Tettinger, F.; Leitgeb, M.; Karplus, M. AbsoluteBinding Free Energies: A Quantitative Approach for Their Calculation.J. Phys. Chem. B 2003, 107, 9535−9551.(68) Mobley, D.; Chodera, J.; Dill, K. On the Use of OrientationalRestraints and Symmetry Corrections in Alchemical Free EnergyCalculations. J. Chem. Phys. 2006, 125, 084902.(69) Berendsen, H.; Postma, J.; van Gunsteren, W.; DiNola, A.;Haak, J. Molecular Dynamics with Coupling to an External Bath. J.Chem. Phys. 1984, 81, 3684−3690.(70) Hess, B.; Bekker, H.; Berendsen, H.; Fraaije, J. LINCS: A LinearConstraint Solvent for Molecular Simulations. J. Comput. Chem. 1997,18, 1463−1472.(71) Darden, T.; York, D.; Pedersen, L. Particle mesh Ewald: AnN.log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys.1993, 98, 10089−10092.(72) Essmann, U.; Perera, L.; Berkowitz, M.; Darden, T.; Lee, H.;Pedersen, L. A Smooth Particle mesh Ewald Method. J. Chem. Phys.1995, 103, 8577−8593.(73) Bennett, C. Efficient Estimation of Free Energy Differencesfrom Monte Carlo Data. J. Comput. Phys. 1976, 22, 245−268.(74) Cui, D.; Ou, S.; Peters, E.; Patel, S. Ion-Specific InducedFluctuations and Free Energetics of Aqueous Protein HydrophobicInterfaces: Toward Connecting to Specific-Ion Behaviors at AqueousLiquid-Vapor Interfaces. J. Phys. Chem. B 2014, 118, 4490−4504.(75) Cui, D.; Ou, S.; Patel, S. Protein Denaturants at Aqueous-Hydrophobic Interfaces: Self-Consistent Correlation between InducedInterfacial Fluctuations and Denaturant Stability at the Interface. J.Phys. Chem. B 2015, 119, 164−178.(76) Flyvbjerg, H.; Petersen, H. G. Error Estimates on Averages ofCorrelated Data. J. Chem. Phys. 1989, 91, 461−466.(77) Liu, P.; Harder, E.; Berne, B. Hydrogen-Bond Dynamics in theAir-Water Interface. J. Phys. Chem. B 2005, 109, 2949−2955.(78) Mei, Y.; Li, Y.; Zeng, J.; Zhang, J. Electrostatic Polarization IsCritical for the Strong Binding in Streptavidin-Biotin System. J.Comput. Chem. 2012, 33, 1374−1382.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b01076J. Chem. Theory Comput. 2018, 14, 512−526

526