Approximate procedure for estimating van der Waals and solvent-accessible surface areas


  • 8/13/2019 Approximate procedure for estimating van der Waals and solvent-accessible surface areas


J. Chim. Phys. (1999) 96, 566-590
© EDP Sciences, Les Ulis

An approximate procedure for the calculation of van der Waals and solvent-accessible surface areas; computing Gibbs free energies of hydration

M. Ulmschneider and E. Pénigault*
Laboratoire de Photochimie Générale, UMR 75 5 du CNRS, ENSCMu, rue Alfred Werner, 68093 Mulhouse cedex, France

(Received 5 February 1998; accepted 4 February 1999)
* Correspondence and reprints.

RÉSUMÉ
The vector (geometrical) and analytical steps of the new ASC procedure for the approximate calculation of molecular surfaces (van der Waals or solvent-accessible) are calibrated. The objective is to reproduce as closely as possible the mean surface values calculated for the molecular structures of a reference set. Molecular surfaces, partial atomic surfaces and the gradients describing surface changes as a function of nuclear displacements are evaluated analytically. After its validation, the ASC model is used to study the correlations between molecular surfaces and experimental Gibbs free energies of hydration. A satisfactory correlation is established for a wide range of organic compounds, provided that σ and π surface elements are introduced in addition to the traditional distinction between polar and non-polar partial atomic surfaces.

Mots-clés: analytical model, molecular surface, van der Waals surface, solvent-accessible surface, partial atomic surface, atomic sigma surface, atomic pi surface, Gibbs free energy of hydration.

ABSTRACT
Both the geometrical and the analytical steps of the new ASC procedure for the approximate computation of molecular van der Waals and solvent-accessible surface areas were calibrated. The objective was to obtain the best fit to accurate surface values for a representative set of molecular structures. Molecular surfaces, partial atomic surfaces and the gradients describing surface changes as a function of atomic displacements are described analytically.

After validation of the ASC model, an attempt was made to correlate experimental Gibbs free energies of hydration with partial atomic surfaces. A satisfactory correlation was obtained for a large set of organic compounds, provided that σ and π surface elements are introduced in addition to the traditional differentiation into polar and non-polar partial atomic surfaces.

Key words: analytical model, molecular surface, van der Waals surface, solvent-accessible surface, partial atomic surface, atomic sigma surface, atomic pi surface, Gibbs free energy of hydration.

INTRODUCTION
Although not strictly defined in a quantum mechanical sense, the notion of molecular surface plays an important role in the interpretation and prediction of molecular properties and recognition [1-4]. The present paper is concerned with an analytical procedure for the calculation of van der Waals and solvent-accessible molecular surfaces as well as partial atomic surface areas. The van der Waals surface of a molecule is the exposed surface of the fused van der Waals spheres of its constituent atoms. The solvent-accessible surface is the locus of the centre of a probe sphere rolling over the van der Waals surface [4]. Both van der Waals and solvent-accessible surfaces are continuous, with discontinuous slopes at the boundaries between atoms. The accurate analytical calculation of surface areas is a complex geometrical problem of multiply overlapping spheres.

A variety of numerical procedures have been developed for the approximate calculation of molecular surfaces [3-13]. Given sufficient computational effort, numerical procedures can be applied to any desired degree of accuracy. However, they do not provide the gradient describing the changes in surface area for small atomic displacements. This quantity, which proves useful in molecular mechanics calculations, would have to be estimated by finite differences, requiring excessive computational effort. Numerical approaches rely on geometrical constructions, surface point distributions or three-dimensional grid algorithms, and are time-consuming. For this reason several authors have proposed analytical solutions to the molecular surface problem. Analytical approximations based on statistics, or exact solutions to more rigorous definitions of a solvent-accessible surface, have been worked out [14-19].

The present paper deals with a fast and effective analytical procedure. It consists of a geometrical step, in which a few probe points with assigned surface equivalents are placed around each atom, followed by an analytical step, in which partial surface inclusions in neighbouring atoms are estimated by using a simple point-to-atom distance function. Since the geometrical and the analytical steps both depend on the atomic coordinates in a straightforward manner, the surface areas and their derivatives with respect to atomic displacements are analytically available. Throughout this work the procedure will be referred to as ASC (Approximated Surface Calculation) for convenience.

Most biochemical processes in living systems occur in aqueous media. Accordingly, many attempts have been made at describing the properties of molecular systems in water [2,11,15,20-30]. To assess conformational properties and molecular recognition phenomena in aqueous solution, it is necessary to estimate the Gibbs free energies of hydration in a conformation-dependent manner [31-34]. Hence, simple empirical models that estimate this hydration energy directly from the structural information of a given solute molecule are of particular interest. By using the novel analytical ASC method to compute partial atomic van der Waals surface areas, a linear additivity scheme of atom-specific and differentiated solvation energy contributions was developed. These contributions were calibrated by multiple linear regression analysis of calculated and experimental Gibbs free energies of hydration from a large set of organic compounds.

1. GENERAL ASPECTS OF THE ASC PROCEDURE
A key element of the ASC method [35, 36] is the arrangement of the probe points around individual atoms, which is based on atomic hybridization and structural context. The centred regular tetrahedron, trigonal bipyramid or regular octahedron assigned to each bonded sp3-, sp2- or sp1-hybridized atom, respectively, is oriented in such a way that the corresponding vertices lie optimally on the bond axes of the heavy-atom skeleton (all distances from the centre of the polyhedron to the vertices are equal to unity). Each vertex not identified by a bond axis is used to define the location of an atom-specific probe point, placed along the radial vector to the vertex at a distance r from the atomic nucleus and centre of the polyhedron. The radial distance equals the atomic van der Waals radius r_vdw, optionally incremented by a solvent radius r_sol, for van der Waals and solvent-accessible surfaces respectively. A radial increment r_inc is added to optimize the surface calculations. An example of construction is shown in Figure 1.

To estimate the free surface area of a bonded atom A, a surface equivalent S_A is assigned to each associated probe point. Since probe points are arranged on the vertices of symmetrical polyhedra, the surface equivalents are given by

$$ S_A = \frac{4 \pi r^2}{n_A} $$

where r is the radius of the atomic sphere and n_A the number of vertices of the selected polyhedron.
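The surface equivalent defined above can be illustrated with a minimal Python sketch. The vertex counts follow from the polyhedra named in the text (tetrahedron 4, trigonal bipyramid 5, octahedron 6). The function name, the dictionary and the 1.7 Å radius are illustrative assumptions, and taking r = r_vdw + r_sol as the sphere radius is one reading of the text, not a confirmed detail of the authors' program.

```python
import math

# Vertex counts of the reference polyhedra named in the text.
POLYHEDRON_VERTICES = {"sp3": 4, "sp2": 5, "sp1": 6}


def surface_equivalent(r_vdw, hybridization, r_sol=0.0):
    """Return S_A = 4*pi*r^2 / n_A for one probe point (area in A^2)."""
    # Assumption: the sphere radius is the van der Waals radius,
    # enlarged by the solvent radius for solvent-accessible surfaces.
    r = r_vdw + r_sol
    n_a = POLYHEDRON_VERTICES[hybridization]
    return 4.0 * math.pi * r * r / n_a


# Hypothetical sp3 atom with a 1.7 A radius (illustrative value only).
s_a = surface_equivalent(1.7, "sp3")
```

With r_sol left at zero the function gives the van der Waals case; passing a solvent radius gives the solvent-accessible case.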

Figure 1: probe vector constructions for a model lactam, reference polyhedra (above) and van der Waals probe points (below) for selected atoms (the probe points are differentiated as σ- and π-type points)


To estimate surface inclusions by mutual overlap of neighbouring atomic spheres, inclusion factors are defined by a universal function f of the distance between probe points and atomic nuclei. This function takes values between 0 (total inclusion) and 1 (no inclusion) within a limited distance range given by the cutoff distance d_off, beyond which there is no inclusion. The cutoff distance is obtained from the van der Waals radius r_B^vdw of the overlapping atom B, optionally augmented by the radius r_sol of the sphere simulating a solvent molecule:

$$ d_{\mathrm{off}} = c_{\mathrm{ext}} \left( r_B^{\mathrm{vdw}} + r_{\mathrm{sol}} \right) $$

The extension factor c_ext is introduced as an adjustable parameter to optimize the surface estimation.

The inclusion function f and its derivative with respect to the distance must be well behaved as the distance x approaches the boundaries of the definition interval. A good way to achieve this is to take a cubic spline over the definition range:

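$$ f(x) = \begin{cases} \text{cubic polynomial in } x \text{ with coefficients } a \text{ and } b & \text{if } x \in [0, d_{\mathrm{off}}] \\ 1 & \text{if } x \in [d_{\mathrm{off}}, +\infty[ \end{cases} $$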

Explicit values for a and b can be found by applying the expressions of the function and its derivative at the cutoff boundary:

Solving this system of two equations leads to the complete definition of the inclusion function:
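The explicit expressions did not survive in this copy. As a hedged illustration only, the following worked derivation assumes that the cubic has no constant or linear term, so that f(0) = 0 and f'(0) = 0 hold automatically; this form is an assumption, not a statement of the authors' exact spline. Writing d for d_off, the two conditions at the cutoff boundary determine a and b:

```latex
% Assumed form (not confirmed by the source): f(x) = a x^3 + b x^2 on [0, d]
\begin{aligned}
f(d)  &= a d^{3} + b d^{2} = 1, \\
f'(d) &= 3 a d^{2} + 2 b d = 0
\quad\Longrightarrow\quad
a = -\frac{2}{d^{3}}, \qquad b = \frac{3}{d^{2}}, \\
f(x)  &= 3\left(\frac{x}{d}\right)^{2} - 2\left(\frac{x}{d}\right)^{3}
\qquad \text{for } 0 \le x \le d .
\end{aligned}
```

Under this assumption f rises smoothly from 0 (total inclusion) at x = 0 to 1 (no inclusion) at x = d, with zero slope at both ends, which matches the requirement that the function and its derivative be well behaved at the boundaries.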

For an atom A with n_A probe points and an intersecting atom B, the residual exposed surface area of the j-th probe point of A is given by

$$ S_{A,j} = S_A \, f(x_{j,B}) $$

where x_{j,B} is the distance between the j-th probe point of A and atom B.

For a set of N overlapping atoms a probabilistic approach [15 7] is adopted. The residual exposed surface equivalent of the j-th probe point of A is given by the product of the respective inclusion factors:

$$ S_{A,j} = S_A \prod_{k=1,\, k \neq A}^{N} f(x_{j,k}) $$

The exposed partial atomic surface S_A of atom A and the exposed surface S of the molecule are given by the following equations:
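The product of inclusion factors and the sums over probe points and atoms can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the inclusion function uses an assumed smoothstep cubic (0 at zero distance, 1 at the cutoff), and the exposed atomic surface is taken as the sum of the residual surface equivalents of the atom's probe points. The default c_ext = 1.89 is the van der Waals value from Table 1.

```python
import math


def inclusion(x, d_off):
    """Assumed cubic: 0 at x = 0 (total inclusion), 1 for x >= d_off."""
    if x >= d_off:
        return 1.0
    t = x / d_off
    return 3.0 * t * t - 2.0 * t ** 3


def exposed_atomic_surface(probe_points, s_equiv, others, c_ext=1.89, r_sol=0.0):
    """Sum over probe points j of S_A * prod_k f(x_jk).

    probe_points: (x, y, z) positions of the probe points of atom A
    s_equiv:      surface equivalent S_A carried by each probe point
    others:       ((x, y, z), r_vdw) for every other atom k
    """
    total = 0.0
    for point in probe_points:
        factor = 1.0
        for centre, r_vdw in others:
            d_off = c_ext * (r_vdw + r_sol)  # cutoff distance for atom k
            factor *= inclusion(math.dist(point, centre), d_off)
        total += s_equiv * factor
    return total


def molecular_surface(atomic_surfaces):
    """Exposed molecular surface as the sum of exposed atomic surfaces."""
    return sum(atomic_surfaces)
```

Because every term is a smooth function of the coordinates, the same expressions can be differentiated analytically, which is the property the text emphasizes.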

2. CALIBRATION OF THE ASC PROCEDURE
The structures of 430 representative small monofunctional and polyfunctional organic molecules were built by using the modelling package MOLOC [37], which is the property of the F. Hoffmann-La Roche Company. The structure library contains hydrocarbons, alcohols, phenols, ethers, amines, ketones, aldehydes and esters. In order to get representative samples for proper calibration, various conformations were included for both cyclic and acyclic compounds and most major functional groups.

Table 1: equations of best fit and parameters used for the ASC procedure (S_MOLSV and S_ASC are the accurate and the approximate surface areas; the solvent radius is a fixed parameter; probe shift and extension factor are optimized during calibration)

parameter                     van der Waals surfaces    solvent-accessible surfaces
solvent radius r_sol (Å)      0.00                      1.45
probe shift r_inc (Å)         0.00                      1.00
extension factor c_ext        1.89                      1.56

linear regression analysis:
van der Waals surfaces: S_MOLSV = (0.006 ± 0.800) + (1.123 ± 0.005) S_ASC; n = 430, s = 5.60, r² = 0.993, F = 59177
solvent-accessible surfaces: S_MOLSV = (0.004 ± 4.879) + (1.190 ± 0.016) S_ASC; n = 430, s = 25.22, r² = 0.927, F = 5430


For all molecules in the structure library, the approximate molecular surface areas S_ASC were determined for a given parameter set {r_inc, c_ext}. The accurate surface areas S_MOLSV were computed by the numerical procedure MOLSV, available through QCPE [6]. The optimum values for the parameters r_inc and c_ext were calculated by optimizing the linear relationship between S_ASC and S_MOLSV. The results for both van der Waals and solvent-accessible surfaces are summarized in Table 1.
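The calibration loop described above can be sketched as a small pure-Python example. The source does not state which criterion was optimized; maximizing the squared correlation coefficient of the linear fit is assumed here. The function names and the `asc_areas_for` callback (which would return ASC areas for one candidate parameter set) are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares y = a + b*x; returns (a, b, r2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2


def calibrate(candidates, asc_areas_for, molsv_areas):
    """Return (params, r2, a, b) for the candidate with the highest r2.

    Assumption: "optimizing the linear relationship" is read as
    maximizing r2 between ASC and MOLSV areas over the candidates.
    """
    best = None
    for params in candidates:
        a, b, r2 = linear_fit(asc_areas_for(params), molsv_areas)
        if best is None or r2 > best[1]:
            best = (params, r2, a, b)
    return best
```

In practice the candidates would be a grid of (r_inc, c_ext) pairs and the callback would rerun the ASC calculation over the whole structure library for each pair.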

Gratifyingly, r_inc remained close to zero for optimum estimates of the van der Waals surfaces, so that, for conceptual simplicity, the value of this parameter was set to 0 for this type of surface calculation. A similar reasoning was applied in the case of solvent-accessible surfaces, where r_inc was set to 1 Å.

From these data as well as the plots of Figures 2 and 3, it is evident that the procedure yields reasonably accurate estimates of molecular van der Waals surfaces, but is somewhat less satisfactory for solvent-accessible surfaces: the degree of complexity in multiply overlapping atomic spheres is considerably higher than with van der Waals surfaces.

For comparison, Table II gives the CPU times in seconds, measured with a Micro Vax 3800, for the van der Waals and solvent-accessible surface areas of four selected molecular structures, calculated by using the MOLSV and the ASC models. These molecules are benzene, the helical conformation of N-acetyl-deca-alanyl-N'-methylamide (Ac-Ala10-NHMe), bovine B1 insulin (the second chain of a polypeptide hormone, including 30 amino acids) and sperm whale deoxymyoglobin (a polypeptide chain 153 residues long). The number n of heavy atoms (i.e. C, O and N) in the selected molecules is also shown in Table II.

Figure 2: correlation of exact MOLSV values vs approximate ASC values for van der Waals molecular surface areas (axes: MOLSV in Å², ASC in Å²); the statistical data of the linear correlation are given in Table 1

Figure 3: correlation of exact MOLSV values vs approximate ASC values for solvent-accessible molecular surface areas (axes: MOLSV in Å², ASC in Å²); the statistical data of the linear correlation are given in Table 1

Inspection of Table II shows that the algorithm used with the ASC model is much more convenient than that of the MOLSV model. The computation times with the ASC model are shorter because the number of points required for one atom is on average 600 times lower than with MOLSV. However, ASC calculations are not 600 times faster, because additional arithmetic steps are needed for each point. For large values of n, the dependence of the CPU time on n is roughly linear. The core of both algorithms consists of two nested loops over n, so the CPU time should in principle be a quadratic function of n. The efficiency is improved by drawing up, for each atom, the list of the neighbouring atoms located within a specific cutoff distance, thus reducing the number of unnecessary calculations. For a given molecule, the CPU time then depends on both n and the average number of atoms selected in those lists.
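The neighbour lists mentioned above can be built in roughly linear time with a cell grid. The source does not say how the authors construct their lists, so the following Python sketch shows one common technique rather than their implementation; the function name and the choice of a cubic grid with cell edge equal to the cutoff are assumptions.

```python
import math
from collections import defaultdict
from itertools import product


def neighbour_lists(coords, cutoff):
    """For each atom index, list the other atoms within `cutoff`."""
    # Bin atoms into cubic cells whose edge equals the cutoff, so that
    # all neighbours of an atom lie in its own or the 26 adjacent cells.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        cells[key].append(idx)

    result = [[] for _ in coords]
    for idx, point in enumerate(coords):
        cx, cy, cz = (math.floor(c / cutoff) for c in point)
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                if j != idx and math.dist(point, coords[j]) <= cutoff:
                    result[idx].append(j)
    return result
```

With such lists the inner loop of the surface calculation runs only over nearby atoms, which is consistent with the roughly linear dependence of CPU time on n reported for large molecules.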

For a given atom, the number of neighbouring atoms selected in the list may vary according to the size and the topology of the molecule and also the type of surface that has to be computed. For instance, in the intersecting sphere approach more atoms overlap when computing the solvent-accessible surface area: the CPU times are comparatively longer than those for van der Waals surfaces.

Table II: CPU times in seconds measured with a Micro Vax 3800 for van der Waals and solvent-accessible surface areas calculated by using the MOLSV and the ASC models (? marks a character that is illegible in the source)

molecule               n       van der Waals surfaces    solvent-accessible surfaces
                               MOLSV     ASC             MOLSV     ASC
benzene                6       1.6       0.03            1.4       0.03
Ac-Ala10-NHMe helix    ?       17.9      0.7             39.9      ?
bovine B1 insulin      ?       53.1      2.?             128.8     4.7
deoxymyoglobin         ?       443.7     58.4            174?4     83.7

n column as extracted, column breaks lost: 6 1611217

3. CALCULATING PARTIAL ATOMIC SURFACE AREAS

To assess the ability to estimate partial atomic surfaces, the above results were compared with accurate partial surfaces for all atoms contained in the structure library. The comparison is best performed for individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands (index h is the exponent of hybridization type sp^h). Tables III and IV list the number of occurrences in each atomic subset (Ehn) and summarize the results in terms of mean surface areas and standard deviations for van der Waals and solvent-accessible partial surfaces, respectively. The data are displayed in Figures 4a and 4b. While the mean values scatter reasonably well about the main diagonal, the method is obviously more successful in estimating partial van der Waals surfaces than partial solvent-accessible surfaces.

Table III: van der Waals surface areas in Å² for the different individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands (for each subset, the number of occurrences in the data set is given, as well as the mean values and the standard deviations for partial atomic surfaces calculated by MOLSV and ASC)

row   mean MOLSV   mean ASC   s.d. MOLSV   s.d. ASC
1     31.79        32.79      0.52         0.16
2     15.04        14.04      1.89         1.54
3     31.57        32.84      1.84         2.20
4     17.00        14.96      1.30         1.49
5     5.05         5.41       1.78         1.51
6     27.48        28.00      3.66         4.57
7     17.70        17.75      2.10         2.11
8     4.70         4.48       1.35         1.41
9     24.54        25.28      0.07         0.04
10    23.34        21.77      2.76         4.00
11    12.64        12.17      1.22         1.60
12    2.46         3.91       0.37         0.58
13    22.27        20.49      2.59         4.20
14    18.55        14.25      2.08         3.05
15    8.70         8.08       1.16         1.49
16    18.46        14.16      2.18         3.64
17    9.84         9.16       1.63         1.47

Rows follow the order of the atomic subsets Ehn (carbon, then nitrogen, then oxygen subsets). The subset labels and the column of occurrences are not legible in the source; the digits of the occurrence column as extracted, with column breaks lost, are: 4 2426749560 1160105145 3 1459124 322053 84366 l

Table IV: solvent-accessible surface areas in Å² for the different individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands. For each subset, the number of occurrences in the data set is given, as well as the mean values and the standard deviations for partial atomic surfaces calculated by MOLSV and ASC

row   mean MOLSV   mean ASC   s.d. MOLSV   s.d. ASC
1     84.31        91.60      1.44         11.63
2     19.99        28.03      8.31         7.32
3     74.9         78.53      11.06        14.24
4     32.93        32.42      7.79         8.16
5     3.49         4.01       2.39         3.41
6     55.25        54.67      14.82        18.62
7     33.18        34.01      9.29         9.89
8     8.24         10.68      5.48         5.50
9     71.17        64.52      0.06         0.66
10    47.33        37.02      11.90        14.10
11    22.70        22.18      5.71         4.02
12    0.70         4.59       0.58         1.31
13    44.59        39.45      13.18        18.51
14    40.24        31.24      11.23        12.30
15    12.39        15.08      4.76         4.41
16    38.71        33.77      12.80        17.39
17    17.84        20.68      8.67         8.08

Rows and occurrences are the same as in Table III; the digits of the occurrence column as extracted, with column breaks lost, are: 42426 7495601160105 1453 1459124 32205384366

Figure 4: correlation of approximate ASC values vs exact MOLSV values for partial atomic van der Waals (a) and solvent-accessible (b) surface areas; mean values and standard deviations are plotted for individual atomic subsets Ehn, classified according to element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands; all surfaces in Å²

Figure 5: plot of approximate ASC values vs exact MOLSV values of partial van der Waals surface areas of sp3-hybridized carbon atoms with one, two and three covalently bonded non-hydrogen ligands, for all molecules in the structure library (axes: MOLSV in Å², ASC in Å²)

Figure 6: plot of approximate ASC values vs exact MOLSV values of partial solvent-accessible areas of sp2-hybridized carbon atoms with one, two and three covalently bonded non-hydrogen ligands, for all molecules in the structure library (axes: MOLSV in Å², ASC in Å²)


The considerable variations in the magnitudes of standard deviations reflect the diversity of structural contexts and concomitant differences in surface inclusions. The procedure reproduces both mean values and standard deviations reasonably well. Further analysis shows that such correlations hold for each individual atomic subset, where data sets are clustered along the main diagonal. This is illustrated in Figures 5 and 6 for the structurally important subsets of sp3- and sp2-hybridized carbon atoms with one to three non-hydrogen ligands. These subsets of ubiquitous atomic units are highly populated in the structure library. Hence, the parameter calibration is somewhat biased towards these structurally important elements. This seems justified in view of the overall satisfactory correlations of the results.

4. EXPERIMENTAL GIBBS FREE ENERGIES OF HYDRATION
The affinity of a compound for an aqueous environment can be evaluated by experimental determination of its vapour pressure over dilute aqueous solutions. The method, studied extensively by Wolfenden et al. [38], involves measurement of the dimensionless equilibrium constant corresponding to the transfer of a substance from the vapour phase, in which each molecule exists in virtual isolation, to a dilute aqueous solution for which solute-solute interactions can be disregarded. The dilute solution is pH-adjusted, if necessary, to maintain the solute in the uncharged state. Measurements are limited to relatively volatile solutes that exhibit substantial vapour pressures above the aqueous phase. Experimental error is generally within a few kJ mol⁻¹. This is a direct method for the determination of Gibbs free energies of hydration. Indirect methods rely on additional organic solvents and the determination of partition coefficients. However, the direct method is restricted to relatively volatile compounds and is not applicable to biological systems. The possibility of extrapolating the data for smaller molecules to larger ones remains an open issue.

An extensive list of 350 small organic compounds, with various experimental thermodynamic properties of the corresponding neutral species derived from different sources, has been critically reviewed and tabulated by Cabani et al. [39]. From this compilation, 268 entries were selected, excluding halogen-containing compounds. The corresponding three-dimensional structures were generated with the united-atom molecular modelling program MOLOC [37] and organized into a structure library. This library contains representative sets of hydrocarbons, alcohols, phenols, ethers, amines, pyridines, ketones, aldehydes, esters and nitro compounds. There are also a few carboxylic acids, nitriles, thioethers and thiols, but only one amide. In terms of elemental distribution, carbon atoms are clearly the most abundant, followed by oxygen and nitrogen atoms. Sulfur atoms are rare. Apart from the lack of sp2-hybridized sulfur atoms, sp1- or sp2-hybridized carbon and nitrogen atoms, as well as sp2-hybridized oxygen atoms, are well represented. Only a few charged or polyfunctional compounds are contained in this library. On the other hand, branched and unbranched, as well as cyclic and acyclic, structures are approximately equally abundant.

5. HYDRATION AND MOLECULAR SURFACE
Solvation is generally considered to be predominantly a molecular surface phenomenon [2, 20]. Accordingly, a correlation has been established between the surface area of apolar solutes and their Gibbs free energies of transfer from hydrocarbon solvents to water [20, 21, 25]. Several additivity schemes have been proposed to estimate Gibbs free energies of hydration, based on the notions of fragments, functional groups or atomic classes, and their independent contributions to solvation. An early model [1] has been improved by considering the solvent-accessible surface area of individual fragments and assuming proportionality between surface area and contribution to solvation energies [31-34]. For a molecular structure with N fragments, the total Gibbs free energy of hydration is then given by the sum below, where, for the i-th fragment, S_i is the accessible surface area and g_i its specific contribution to the Gibbs free energy of hydration:

$$ \Delta G_h = \sum_{i=1}^{N} g_i S_i $$

Changes in Gibbs free energies of hydration resulting from changes in conformation are usually rationalized through the conformational dependence of exposed fragment atoms. A similar approximation was adopted in the present work, together with the new ASC method for the analytical calculation of partial atomic surfaces [35, 36]. This method provides a sufficiently detailed description of molecular surface topologies, while keeping close to chemical intuition.
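The additivity scheme amounts to a weighted sum of surface areas, which the following Python sketch makes concrete. The function name, the category keys and all numerical values in the usage example are hypothetical placeholders, not fitted parameters from the paper; the optional constant term g0 is included only because the later regression models carry one.

```python
def hydration_free_energy(areas, coefficients, g0=0.0):
    """Return dG_h = g0 + sum_i g_i * S_i.

    areas:        mapping from surface category to exposed area S_i (A^2)
    coefficients: mapping from surface category to g_i (kJ mol^-1 A^-2)
    g0:           optional constant term (kJ mol^-1), zero by default
    """
    return g0 + sum(coefficients[key] * s for key, s in areas.items())


# Hypothetical two-category example (illustrative numbers only).
example = hydration_free_energy(
    {"nonpolar": 60.0, "polar": 15.0},
    {"nonpolar": 0.05, "polar": -1.0},
)
```

Because the areas depend on conformation while the coefficients do not, the same function evaluated on different conformers gives the conformational dependence discussed in the text.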

Since the ASC method is an approximate method for the calculation of surfaces, the accuracy with which it estimates the solvent-accessible and the van der Waals surface areas of the 268 molecular structures contained in the library was assessed first.

Table V: correlation equations of estimated and accurate molecular surface areas in Å² (? marks a character that is illegible in the source)

solvent-accessible areas: S_sas,MOLSV = (26.26 ± 5.417) + (0.834 ± 0.017) S_sas,ASC; n = 268, s = 16.27, r² = 0.898, F = 235
van der Waals areas: S_vdw,MOLSV = (?.124 ± 0.676) + (0.969 ± 0.005) S_vdw,ASC; n = 268, s = 2.61, r² = 0.993, F = 40492
solvent-accessible areas vs. van der Waals areas: S_sas,MOLSV = (74.607 ± 1.269) + (1.573 ± 0.009) S_vdw,MOLSV; n = 268, s = 4.87, r² = 0.991, F = 28927

For this purpose, the approximate surface areas estimated by ASC were compared with the accurate areas obtained with MOLSV [6]. The correlations of estimated and accurate molecular surface areas are summarized in Table V. The ASC method works successfully for van der Waals surfaces but is somewhat less satisfactory for solvent-accessible surfaces. However, since numerically accurate van der Waals and solvent-accessible surfaces show a very close correlation (Table V), the use of ASC-estimated van der Waals surfaces in the solvation parameterization scheme is justified.

6. σ/π DIFFERENTIATION IN THE SURFACE DEFINITION

In the geometrical step of the ASC model a small number of probe points were placed around each atom at the vertices of atom-centred reference polyhedra, the

Figure 7: experimental Gibbs free energies of hydration ΔG_h vs total van der Waals areas for aromatic hydrocarbons (axis: experimental ΔG_h in kJ mol⁻¹)
ΔG_h,calc = (37.07 ± 6.08) + (-0.15 ± 0.04)·S_vdw; n = 27, s = 4.39, r² = 0.41

Figure 8: experimental Gibbs free energies of hydration ΔG_h vs total van der Waals areas for aliphatic ketones and aldehydes (axis: experimental ΔG_h in kJ mol⁻¹)
ΔG_h,calc = (-20.34 ± 1.71) + (0.05 ± 0.01)·S_vdw; n = 28, s = 2.19, r² = 0.39

Table VI: aromatic hydrocarbons, ketones and aldehydes, with experimental Gibbs free energies of hydration, ΔG_h (in kJ mol⁻¹)

benzene                        -3.62
methylbenzene                  -3.71
ethylbenzene                   -3.33
1,2-dimethylbenzene            -3.77
1,3-dimethylbenzene            -3.50
1,4-dimethylbenzene            -3.37
propylbenzene                  -2.23
isopropylbenzene               -1.26
1,2,4-trimethylbenzene         -3.60
butylbenzene                   -1.66
sec-butylbenzene               -1.88
t-butylbenzene                 -1.83
t-pentylbenzene                -0.74
1,1'-biphenyl                  -11.06
biphenylmethane                -11.78
9H-fluorene                    -14.41
naphthalene                    -10.01
1-methylnaphthalene            -9.91
1-ethylnaphthalene             -10.02
1,3-dimethylnaphthalene        -10.35
1,4-dimethylnaphthalene        -11.79
2,3-dimethylnaphthalene        -11.64
2,6-dimethylnaphthalene        -11.00
acenaphthene                   -13.77
anthracene                     -17.70
phenanthrene                   -16.53
pyrene                         -16.68
2-propanone                    -16.12
2-butanone                     -15.22
2-pentanone                    -14.76
3-pentanone                    -14.28
3-methyl-2-butanone            -13.56
2-hexanone                     -13.76
4-methyl-2-pentanone           -12.81
2-heptanone                    -12.72
4-heptanone                    -12.24
2,4-dimethyl-3-pentanone       -11.46
2-octanone                     -12.06
2-nonanone                     -10.41
5-nonanone                     -11.18
2-undecanone                   -9.05
acetophenone                   -19.18
acetaldehyde                   -14.66
propanal                       -14.40
butanal                        -13.29
pentanal                       -12.68
hexanal                        -11.76
heptanal                       -11.18
octanal                        -9.58
nonanal                        -8.69
trans-2-butenal                -17.68
trans-2-hexenal                -15.40
trans-2-octenal                -14.40
trans,trans-2,4-hexadienal     -19.39
benzaldehyde                   -16.84

Figure 9: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aliphatic ketones and aldehydes with the total van der Waals surface areas of non-polar carbon and polar oxygen (axes: experimental ΔG_h in kJ mol⁻¹, calculated ΔG_h in kJ mol⁻¹)
ΔG_h,calc = (-29.16 ± 4.39) + (0.04 ± 0.01)·S_C + (-0.08 ± 0.02)·S_C + (0.00 ± 0.32)·S

Figure 10: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aromatic hydrocarbons with the σ/π-type surface areas (axes: experimental ΔG_h in kJ mol⁻¹, calculated ΔG_h in kJ mol⁻¹)
ΔG_h,calc = (7.64 ± 2.49) + (-0.01 ± 0.02)·S_σ + (-0.2 ± 0.01)·S_π; n = 27, s = 1.73, r² = 0.91, F = 124

Figure 11: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aliphatic ketones and aldehydes with the σ/π-type surface areas
ΔG_h,calc = (-29.16 ± 4.39) + (0.04 ± 0.01)·S_Cσ + (-0.08 ± 0.02)·S_Cπ + (0.00 ± 0.32)·S_Oσ + (1.55 ± 0.89)·S_Oπ; n = 28, s = 1.09, r² = 0.868, F = 3

7. PARAMETERIZATION RELYING ON σ/π SURFACE AREAS

Further exploratory data analysis [35] using this minimal parameter set, complemented by selected additional parameters, revealed the importance of differentiating between atomic hybridization states and between protonated and unprotonated heteroatoms, and of recognizing polarization effects by heteroatoms.

Further differentiation of the σ- and π-type van der Waals surfaces is proposed as an attempt to include such aspects. To be as general as possible, atomic species, i.e. carbon, nitrogen, oxygen and sulfur atoms, were considered separately. Heteroatoms were distinguished according to whether they are protonated or not. Differently hybridized atoms were also separated. To incorporate polarity effects, two types of carbon atoms were used. A carbon atom directly bonded to a heteroatom was assumed to be strongly influenced by its polar neighbour. The effect of induced polarization at a carbon atom in a more remote position was disregarded. Carbon atoms directly connected (alpha) to a heteroatom were therefore differentiated from carbon atoms in more remote positions, which were treated as carbon atoms in hydrocarbons. The fully extended parameterization scheme relying on surface area contributions was denoted ASC-ΔG and included σ- and π-van der Waals surface areas of the following atomic categories:

1) Carbon atoms in the apolar skeleton: sp1, sp2 or sp3-hybridized.
2) Carbon atoms in alpha position to a heteroatom: sp1, sp2 or sp3-hybridized.
3) Hydrogenated nitrogen atoms: sp2 or sp3-hybridized.
4) Non-hydrogenated nitrogen atoms: sp1, sp2 or sp3-hybridized.
5) Hydrogenated oxygen atoms: sp2 or sp3-hybridized.
6) Non-hydrogenated oxygen atoms: sp2 or sp3-hybridized.
7) Sulfur atoms: sp3-hybridized.
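The seven atomic categories listed above can be expressed as a small lookup, sketched here in Python. The function name, the string labels and the argument names are illustrative assumptions; only the allowed combinations of atom type and hybridization are taken from the list.

```python
# Allowed hybridizations per category, transcribed from the list above.
ALLOWED = {
    "C": ("sp1", "sp2", "sp3"),       # apolar skeleton carbon
    "alphaC": ("sp1", "sp2", "sp3"),  # carbon alpha to a heteroatom
    "NH": ("sp2", "sp3"),             # hydrogenated nitrogen
    "N": ("sp1", "sp2", "sp3"),       # non-hydrogenated nitrogen
    "OH": ("sp2", "sp3"),             # hydrogenated oxygen
    "O": ("sp2", "sp3"),              # non-hydrogenated oxygen
    "S": ("sp3",),                    # sulfur
}


def surface_category(element, hybridization, n_hydrogens=0, alpha_to_heteroatom=False):
    """Map an atom description to a category label such as 'alphaC,sp3'."""
    if element == "C":
        prefix = "alphaC" if alpha_to_heteroatom else "C"
    elif element in ("N", "O"):
        prefix = element + ("H" if n_hydrogens > 0 else "")
    elif element == "S":
        prefix = "S"
    else:
        raise ValueError(f"element {element!r} is not covered by the scheme")
    if hybridization not in ALLOWED[prefix]:
        raise ValueError(f"{prefix} with {hybridization} is not a listed category")
    return f"{prefix},{hybridization}"
```

Each label would then be split further into σ- and π-type surface contributions, which this sketch does not attempt.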

    The values of the g-coefficients obtained by multiple regression analysis incorporating the complete data set are given in Table VII. The number of occurrences of each surface category in the data library is also indicated. Figure 12 shows the resulting correlation. Experimental and estimated Gibbs free energies of hydration for the model structures are given in the Appendix. The explained variance r² is only 87 %. However, when compared to the earlier simplified σ/π-model with


    Table VII: g-coefficients and statistical data for the best fit of the ASC-ΔG model with Gibbs free energies of hydration for the complete library data set, with the number of occurrences for each class.

    n = 268, s = 4.38, r² = 0.872, F = 69

    $\Delta G_{h,\mathrm{calc}} = g_0 + \sum_i g_i S_i$

    Figure 12: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated with the ASC-ΔG model, for the complete reference set.

    parameter        occurrences    value
    g_0                             -0.06 ± 1.48
    g_C,sp1,σ                       0.21 ± 0.77
    g_C,sp1,π        9              -0.08 ± 0.19
    g_C,sp2,σ        88             0.50 ± 0.08
    g_C,sp2,π        88             -0.46 ± 0.05
    g_C,sp3,σ        221            0.03 ± 0.01
    g_αC,sp1,π       5              -0.39 ± 3.90
    g_αC,sp2,σ       43             0.82 ± 0.19
    g_αC,sp2,π       100            -1.33 ± 0.24
    g_αC,sp3,σ       112            -0.06 ± 0.03
    g_NH,sp2,σ                      -0.81 ± 1.16
    g_NH,sp2,π                      -1.05 ± 0.20
    g_NH,sp3,σ       22             -0.79 ± 0.06
    g_N,sp1,σ        5              42.57 ± 85.21
    g_N,sp1,π        5              16.7? ± 29.02
    g_N,sp2,σ        19             -0.96 ± 0.89
    g_N,sp2,π        27             0.88 ± 0.43
    g_N,sp3,σ        7              -2.67 ± 0.56
    g_OH,sp2,σ                      -3.7? ± 3.35
    g_OH,sp2,π                      3.30 ± 4.56
    g_OH,sp3,σ       32             -0.91 ± 0.05
    g_O,sp2,σ        77             -2.1? ± 0.43
    g_O,sp2,π        77             2.08 ± 0.53
    g_O,sp3,σ        27             -0.73 ± 0.34
    g_S,σ            6              -0.07 ± 1.06

    g_0 in kJ mol⁻¹, g_i in kJ mol⁻¹ Å⁻²


    seven types of surfaces, the refined surface differentiation increases the predictive ability of the model by nearly 5 percentage points. It is noteworthy that the constant term can be dropped (g_0 = -0.06 ± 1.48 kJ mol⁻¹ is not significantly different from zero). This is a comforting aspect and consistent with a complete surface area approach. The statistical definition of some parameters is not adequate in the case of atoms that are poorly documented in the reference library, or whose experimental data are highly scattered. For a given series of heteroatoms (e.g. sp2-hybridized oxygen atoms), the σ-surface contributes a negative (favourable) hydration energy value, and the π-surface a positive one. The parameters are of the same order of magnitude. Carbon atoms follow the opposite pattern: σ-type surfaces are more hydrophobic than π-type surfaces. This trend has already been observed: carbon atoms in C-C double bonds contribute an overall hydrophilic increment to the free enthalpy of hydration due to the hydrophilic nature and generally better accessibility of π-type surfaces. The slight hydrophilic nature of the σ-surface of sp3-hybridized carbon atoms alpha to heteroatoms is noteworthy. It may reflect the anticipated polarization by the heteroatom.
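    The multiple regression behind the g-coefficients and the statistics n, s, r² and F can be illustrated with a small ordinary least-squares sketch. This is not the authors' code. The data below are synthetic, the function names are hypothetical, and s is assumed to be the standard error of estimate with n - p - 1 degrees of freedom. The normal equations are solved in plain Python to keep the example self-contained; in practice a library routine such as `numpy.linalg.lstsq` would be preferable.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def f_statistic(r2, n, p):
    """Overall F ratio for a fit with p surface terms plus a constant."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


def fit_surface_model(S, dG):
    """Fit dG = g0 + sum_i g_i * S_i; return ([g0, g1, ...], s, r2, F)."""
    n, p = len(S), len(S[0])
    X = [[1.0] + list(row) for row in S]          # leading 1 for g0
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p + 1)]
           for i in range(p + 1)]
    Xty = [sum(X[k][i] * dG[k] for k in range(n)) for i in range(p + 1)]
    g = solve_linear(XtX, Xty)
    predicted = [sum(gi * xi for gi, xi in zip(g, row)) for row in X]
    ss_res = sum((y - yh) ** 2 for y, yh in zip(dG, predicted))
    mean = sum(dG) / n
    ss_tot = sum((y - mean) ** 2 for y in dG)
    r2 = 1.0 - ss_res / ss_tot
    s = math.sqrt(ss_res / (n - p - 1))
    return g, s, r2, f_statistic(r2, n, p)


# Synthetic demonstration: 40 "molecules", 3 surface categories (areas in Å²).
rng = random.Random(0)
true_g = [-1.0, 0.5, -0.4, -2.0]                   # made-up g0, g1, g2, g3
S_demo = [[rng.uniform(0.0, 50.0) for _ in range(3)] for _ in range(40)]
dG_demo = [true_g[0] + sum(g * a for g, a in zip(true_g[1:], row))
           + rng.gauss(0.0, 0.5) for row in S_demo]
g_fit, s_fit, r2_fit, F_fit = fit_surface_model(S_demo, dG_demo)
```

    As a consistency check, `f_statistic(0.872, 268, 24)` evaluates to about 69, which agrees with the F value reported for the complete data set if 24 surface coefficients plus the constant term are assumed.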

    The design of ASC-ΔG is simple and straightforward. It can be extended to any class of molecular compounds. When new atomic species are concerned, the parameters for their σ and π surfaces are simply added to the parameter list for the various hybridizations. Insofar as these new species are also hydrogen-bond donor sites, it is necessary to include the corresponding parameters. Finally, a new calibration has to be performed to ensure an overall consistent set of parameters.
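    The additive form of the model, a constant plus one coefficient times one surface area per category, makes the evaluation and the extension to new atomic species a matter of a dictionary lookup. The sketch below assumes string labels for the surface categories. The coefficient values and the chlorine entry are made-up illustrative numbers, not calibrated parameters.

```python
def hydration_free_energy(surfaces, g, g0=0.0):
    """Evaluate g0 + sum_i g_i * S_i.

    surfaces: mapping category label -> surface area in Å²
    g:        mapping category label -> coefficient in kJ mol⁻¹ Å⁻²
    g0:       constant term in kJ mol⁻¹
    """
    unknown = sorted(set(surfaces) - set(g))
    if unknown:
        # A missing coefficient means the category was never calibrated.
        raise KeyError(f"no coefficient for surface categories: {unknown}")
    return g0 + sum(g[label] * area for label, area in surfaces.items())


# Illustrative (made-up) coefficients and partial surface areas.
g_demo = {"C,sp3,sigma": 0.05, "OH,sp3,sigma": -1.0}
surfaces_demo = {"C,sp3,sigma": 80.0, "OH,sp3,sigma": 25.0}
dG_calc = hydration_free_energy(surfaces_demo, g_demo)  # 0.05*80 - 1.0*25

# Extending the scheme to a new atomic species only adds entries to the
# parameter list; the value below is a placeholder until recalibration.
g_demo["Cl,sp3,sigma"] = 0.1
```

    Raising an error for an uncalibrated category, rather than silently treating its coefficient as zero, reflects the requirement that a new calibration be performed whenever the parameter list is extended.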

    CONCLUSION

    Given the conceptual simplicity of the ASC method, the success achieved in estimating both molecular and partial atomic surfaces is remarkable. This method exhibits several potential advantages. Surfaces can be estimated orders of magnitude faster than with most numerical procedures. The surfaces, as well as their variations as a function of atomic displacements, are given analytically. They can be easily incorporated into structure minimization or molecular dynamics calculations. In addition, the concept of surface probe points, spatially arranged by reference to the


    valence states of bonded atoms, with the inherent possibility to differentiate between hydrophobic and hydrophilic, polar and non-polar, as well as σ- and π-type surface elements, lends itself to chemically intuitive interpretations of surface topologies and associated molecular properties. Its use to estimate solvation effects and partition coefficients is among the most obvious applications.

    To calculate Gibbs free energies of hydration, the ASC-ΔG model has one major advantage over more fundamental methods: it yields essentially the same quality of answers in a transparent way and with a minimum of computational effort. It rests on the additivity of hydrophobic and hydrophilic contributions of the various atomic surface elements and thus, in principle, can easily be extended to a larger variety of organic compounds, given the availability of experimental data. Further refinements are conceivable. Carbon atoms alpha to a heteroatom can be differentiated according to the nature of the latter. For example, carbon atoms in alpha position to an oxygen atom may contribute to hydration in another way than carbon atoms in alpha position to a nitrogen atom. It may be adequate to differentiate nitrogen and oxygen atoms according to their electronic charges. Aromatic carbon atoms and heteroatoms may be considered as specific atomic classes. However, these multiple classifications increase the degrees of freedom of the model and should be considered only with a substantially larger data set. Refinements could also account for the two different types of H-bond sites at the oxygen atom of a hydroxyl group. Indeed, one σ-direction corresponds to the O-H bond, which acts as an H-bond donor, whereas the other two σ-directions correspond to the lone pairs, which can only act as H-bond acceptors. In the present version of the model no distinction was made between these probes, and the results obtained represent an average effect.

    Hydrophilic or hydrophobic surfaces of atoms are currently treated as if there were no cooperativity effects. These effects involve many-body favourable or unfavourable interactions among different donor/acceptor sites. It would be possible to model such effects by an additional analytical inter-probe function of the distance, involving only accessible probes. In this manner, special hydration effects like those observed for small rings, or those due to favourable or unfavourable interactions between neighbouring functional groups, could be accounted for.


    Appendix

    List of the 47 molecular structures of the Scheraga paper [32], with experimental and calculated Gibbs free energies of hydration ΔG_h in kJ mol⁻¹ (experimental data source [39], except for (w): molecular values from [38]).

    molecular structure
    acetamide; acetic acid; acetic acid ethyl ester; acetic acid methyl ester; anthracene; benzene; benzenethiol; butane; butanoic acid; 1-butanol; 2-butanol; 1-butylamine; 1,3-dimethylbenzene; 1,4-dimethylbenzene; 2,2-dimethylpropane; ethane; ethanethiol; ethanol; ethylamine; ethylbenzene; heptane; hexane; 1-hexanol; 1-hexylamine; methanethiol; methanol; methyl ethyl sulfide; methylamine; methylbenzene; 4-methylimidazole; 3-methylindole; 4-methylphenol; 2-methylpropane; naphthalene; octane; pentane; 1-pentanol; 1-pentylamine; phenol; propane; propanoic acid; 1-propanol; 2-propanol; propionamide; 1-propylamine; n-propylbenzene; propylguanidine

    exp. val.-40.63-28.0512.9513.8717.703.6210.678.70-26.59-1 9.73-19.10-17.973.503.3710.467.665.42-20.9818.843.3310.9610.40-18.26-16.875.19-2 1.406.20 w)-19.093.71-42.92 w)-24.75 w)-25.679.70-10.0112.109.761 8.72-17.14-27.688.1 8-27.09-20.19-1 9.90-39.41 w)18.372.23-45.73 w)

    Scheraga-28.04-29.154.217.40-1 1.697.40-13.588.29-27.79-19.01-16.38-1 8.341.901.898.716.2 12.81-21 O9-20.421.24112 410.3716.9616.294.72-24.04

    -23.802.76-25.6417.1O-32.918.259.5512.259.3317.97-17.30-37.567.25-28.8 1-20.05-1 9.85-21.7819.380.21-59.91

    a lrmode1-39.77-3 1 O4-14.1816.0018.468.03-16.816.74-27.22-20.64-17.30-1 8.593.253.268.034.525.38-22.67-20.694.8510.22

    9.0616.93-16.286.55-25.26O. 13-22.885.65-26.16-20.92-21.786.80-1 1.1811.397.90-18.1217.44-24.155.58-28.32-2 1.79-19.77-37.3219.753.56-59.58

    ASC AG mode1-40.65-32.17-12.30-14.13-14.911.46-12.792.75-28.26-18.51-15.03-20.415.245.263.271.873.3419.28-21.193.974.15

    3.68-17.5919.485.24-22.374.24-24.083.37-20.73-3 1.93-28.162.788.244.63.22-18.04-1 9.95-26.212.29-28.5518.94-16.48-37.48-20.401.76-69.89


    REFERENCES

    1 Butler JAV, Harrower P (1937) Trans Faraday Soc 33, 229-236.
    2 Langmuir IM (1925) Colloid Symposium Monograph, Chemical Catalog Co, New York, Vol 3.
    3 Bondi A (1964) J Phys Chem 68, 441-451.
    4 Lee B, Richards FM (1971) J Mol Biol 55, 379-400.
    5 Stouch TR, Jurs PC (1986) J Chem Inf Comput Sci 26, 4-12.
    6 Smith GS (1985) QCPE 509.
    7 Pearlmann RS (1986) in Partition Coefficient Determination and Estimation, Dunn III WJ, Block JH, Pearlmann RS, Eds, Pergamon Press, New York, 3-20.
    8 Meyer AY (1988) J Comp Chem 9, 18-24.
    9 Akahane K, Nagano Y, Umeyama H (1989) Chem Pharm Bull 37, 86-92.
    10 Shrake A, Rupley JA (1973) J Mol Biol 79, 351-371.
    11 Cramer CJ, Truhlar DG (1992) J Comp Chem 13, 1089-1097.
    12 Grand SML, Merz Jr KM (1993) J Comp Chem 14, 349-352.
    13 Duncan BS, Olson AJ (1993) Biopolymers 33, 219-229.
    14 Connolly ML (1983) J Appl Cryst 16, 548-558.
    15 Still WC, Tempczyk A, Hawley RC, Hendrickson T (1990) J Am Chem Soc 112, 6127-6129.
    16 Richmond TJ (1984) J Mol Biol 178, 63-89.
    17 Wodak S, Janin J (1980) Proc Natl Acad Sci USA 77, 1736-1740.
    18 Sanner M (1992) PhD Thesis, Université de Haute-Alsace, Mulhouse, France.
    19 Agishtein ME (1992) J Biomol Struc Dyn 9, 759-768.
    20 Chothia C (1974) Nature 248, 338-339.
    21 Reynolds JA, Gilbert DB, Tanford C (1974) Proc Natl Acad Sci USA 71, 2925-2927.
    22 Rupley JA, Gratton E, Careri G (1983) TIBS 8, 18-23.
    23 Ben-Naim A, Marcus Y (1984) J Chem Phys 81, 2016-2027.
    24 Frömmel C (1984) J Theor Biol 111, 247-260.
    25 Sharp KA, Nicholls A, Fine RE, Honig B (1991) Science 252, 106-109.
    26 Sharp KA (1991) Cur Biol 1, 171-174.
    27 Gao J, Xia X (1992) Science 258, 63 -634.
    28 Soda K, Hirashima H (1990) J Phys Soc Jap 59, 4177-4185.
    29 Soda K, Hirashima H (1992) J Phys Soc Jap 61, 2992-3006.
    30 Sharp K, Jean-Charles A, Honig B (1992) J Phys Chem 96, 3822-3828.
    31 Hasel W, Hendrickson TF, Still WC (1988) Tet Comp Meth 1, Vol 2, 103-106.
    32 Ooi T, Oobatake M, Némethy G, Scheraga HA (1987) Proc Natl Acad Sci USA 84, 3086-3090.
    33 Eisenberg D, Wesson M, Yamashita M (1989) Chemica Scripta 29A, 217-221.
    34 Eisenberg D, McLachlan AD (1986) Nature 319, 199-203.
    35 Ulmschneider M (1993) PhD Thesis, Université de Haute-Alsace, Mulhouse, France.
    36 Ulmschneider M, Pénigault E (1999) J Chim Phys, in press.
    37 Müller K, Ammann HJ, Doran DM, Gerber PR, Gubernator K, Schrepfer G (1989) in Trends in Medicinal Chemistry 88, van der Goot GDH, Pallos L, Timmermann H, Eds, Elsevier Science Publishers, Amsterdam.
    38 Wolfenden R, Anderson L, Cullis PM, Southgate CCB (1981) Biochemistry 20, 849-855.
    39 Cabani S, Gianni P, Mollica V, Lepori L (1981) J Sol Chem 10, 563-595.
