
A halogen-bonding correction for the semiempirical PM6 method


Chemical Physics Letters 506 (2011) 286–289


Jan Řezáč a,⇑, Pavel Hobza a,b

a Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Molecular Systems, 166 10 Prague, Czech Republic

b Department of Physical Chemistry, Palacky University, 771 46 Olomouc, Czech Republic

Article info

Article history: Received 15 February 2011; in final form 6 March 2011; available online 9 March 2011

Abstract

We analyse the failure of the semiempirical QM method PM6 to describe halogen bonds and suggest an empirical correction that remedies this problem. Owing to underestimated repulsion in the PM6 method, the halogen-bond interaction energies are dramatically exaggerated and the equilibrium distances are very short. This is addressed by a correction parametrised for all halogens capable of halogen bonding (Cl, Br, and I). The correction is applied on top of the dispersion correction, forming the PM6-D2X method. A comparison with ab initio calculations shows that the method is able to predict the interaction energy of halogen bonds with an error of 10%.

© 2011 Elsevier B.V. All rights reserved.

⇑ Corresponding author. Fax: +420 220 410 320. E-mail address: [email protected] (J. Řezáč).

0009-2614/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.cplett.2011.03.009

1. Introduction

The halogen bond [1–5] is a specific non-covalent interaction of a halogen atom (Cl, Br, or I) in a halide with an electronegative site (here we have considered the O and N atoms). The nature of this interaction is somewhat counterintuitive, because electrostatics represents an important stabilising contribution here. This is possible because the electron density on the halogen atom is anisotropic; a region of positive charge (the so-called σ-hole) is present on the axis of the bond in covalently bonded halogens, and this region then interacts with the negatively charged acceptor. This electrostatic interaction accounts for approximately half of the total interaction energy; the rest is London dispersion energy, as indicated by a decomposition of the interaction energy by means of symmetry-adapted perturbation theory based on density functional theory calculations [6]. Halogen bonds are now recognised as an important tool for rational molecular design. Halogen substitution is a promising way to improve the activity of drugs; it can improve the energetics of the binding with only a slight impact on the other properties of the molecule. Moreover, the halogenated derivatives can usually be synthesised easily. Computationally, it is not difficult to describe the halogen bond. Even Hartree–Fock (HF) or density functional theory (DFT) can describe the electronic structure of a halogen bond and therefore also the electrostatic contribution to the interaction energy. The dispersion contribution can be calculated by means of post-HF methods or by an empirical correction in DFT. These methods can yield very accurate interaction energies in small complexes [6].

In order to study larger systems, more efficient methods are needed. Molecular mechanics can be used for very extensive calculations, but the approximations involved in the construction of the force field limit its accuracy and transferability. Current force fields are certainly not able to describe halogen bonding, because the halogen atom is modelled by a single point charge that cannot account for the anisotropic charge distribution on its surface; force-field modifications addressing this issue are being developed but are not available yet. Semiempirical quantum-mechanical (SQM) methods can be applied to systems with thousands of atoms while retaining the advantages of QM calculations: they are based on a proper physical description of the molecular structure and do not depend on system-specific parameters. Linear scaling algorithms routinely allow the application of these methods to whole proteins [7,8]. Recently, we [9,10] and others [11–14] have shown that semiempirical methods can be successfully applied in drug design for the calculation of protein–ligand binding. However, the SQM methods themselves are not accurate enough in their description of non-covalent interactions, which are crucial in such systems. This has been successfully addressed by empirical corrections. We have developed corrections for dispersion (which is missing in the SQM methods) and hydrogen bonding (which is underestimated by most of the SQM methods) for selected SQM methods [15,16]. In our drug-design studies, we use the PM6 method [17] with the second generation of corrections [16] (PM6-DH2), because it yields the best results among the SQM methods we tested [16]. The protein–ligand binding free energies calculated using this method correlate well with the experimental data, while molecular mechanics fails completely [9,10].

When we applied this method to halogen bonds, we encountered new problems. In principle, the SQM methods are not able to describe a halogen bond, because the minimal basis used cannot describe the anisotropic electron density on the halogen atoms (the σ-hole), which is responsible for the attractive electrostatic contribution to halogen bonding. We do not have direct evidence in the form of electrostatic potential surfaces from the semiempirical methods, but the following observations support this hypothesis. First, no halogen bonding (favourable interaction) is observed at the HF or DFT level in calculations in the minimal basis, while in the large-basis calculations a shallow minimum is found. Since these methods do not describe dispersion, this stabilisation has to be of an electrostatic nature. Second, the absence of the σ-hole in the small-basis-set HF calculation has been confirmed by a visualisation of the electrostatic potential on the molecular surface. Therefore, we were surprised that PM6 yields interaction energies in various halogen bonds not far from the benchmark results. Later we found that the potential energy surface is completely wrong, with the interaction energy calculated at the correct geometry being reasonable only by chance. The dissociation curve of a prototypical halogen bond between bromobenzene and acetone (taken from [18]) calculated with PM6 is shown in Figure 1 along with the reference MP2/aug-cc-pVDZ results. PM6 yields too short a halogen-bond distance, and the interaction energy at the equilibrium geometry is dramatically overestimated. The potential also shows some spurious behaviour around the minimum. We attribute the observed behaviour to the repulsion being underestimated. We have observed a similar problem, but with a much smaller magnitude, in other types of complexes as well. There, PM6 usually yields intermolecular distances slightly shorter than the reference values and, when the potential is examined more closely, it is obvious that it is not steep enough at distances below the van der Waals contact. The PM6 method uses a core–core repulsion term that depends on the orbital overlap and decays rapidly with distance. Therefore, it is probably not able to describe the repulsion arising from a more diffuse wave function in a real molecule.

Although the PM6 method cannot describe the halogen bond properly through the electronic structure and suffers from a lack of repulsion, it is still an indispensable tool for us when modelling large molecular systems, because it describes other non-covalent interactions better than any alternative applicable to large systems. Therefore, it is desirable to introduce a halogen-bond-specific correction that will extend its applicability to halogen-bonded complexes. Analogously to our development of an H-bonding correction, we have developed an empirical correction parametrised on small model complexes for which accurate QM data are available. We have started with PM6 augmented with empirical dispersion (PM6-D2) to make the halogen-bonding correction compatible with other types of systems. The dispersion also improves the description of the halogen bonding at longer distances (see Figure 1). The correction has been fitted over the whole dissociation curve, so that the resulting method, named PM6-D2X, yields not only accurate interaction energies but also correct geometries. Finally, the new method has been tested on halogen-bonded complexes not used in the parametrisation in order to critically assess its performance.

Figure 1. The dissociation curve of a bromobenzene···acetone halogen bond. The MP2 results (black) serve as a reference. The Hartree–Fock results (grey) demonstrate the effect of the basis set on the presence of a minimum. The PM6 (red, dashed) and PM6 with dispersion correction (red, solid) dissociation curves overestimate the halogen-bonding energy and yield too short an intermolecular distance. All of the MP2 and HF interaction energies shown here have been counterpoise-corrected.

2. Methods

2.1. Reference data

The correction has been parametrised on the dissociation curves of halobenzene···acetone (oxygen acceptor) and halobenzene···trimethylamine (nitrogen acceptor). In a geometry with an optimal arrangement of the halogen bond (Figure 2), the distance along the halogen-bond axis is varied to obtain the curve, while the other intra- and intermolecular coordinates are kept fixed. The data for the oxygen acceptor are taken from our previous work [19]; the nitrogen acceptor complexes have been prepared for this study. We have used MP2/aug-cc-pVDZ (with the counterpoise correction) as a reference. For iodine and bromine, pseudopotentials along with the aug-cc-pVDZ-PP basis set have been used to accelerate the calculations and to account for relativistic effects, which cannot be neglected in iodine. This method has been shown to underestimate the interaction energies of halogen bonds by about 10% as compared to the benchmark CCSD(T) calculations [6]. This accuracy is sufficient in this application; the other errors arising from the approximations involved in the semiempirical methods are much larger.

In order to validate the method, we have tested it on a set of bromobenzene derivatives, where the substitutions on the benzene ring affect the strength of the halogen bonds [18]. Again, we have compared our results to the MP2 calculations with the setup described above.

2.2. Halogen-bonding correction

Because of the nature of the PM6 error, the correction has the form of an additional exponential repulsion

$$E_{\mathrm{corr}} = a\,e^{-b r} \qquad (1)$$

where r is the halogen···acceptor distance and a and b are the parameters. We take advantage of the overbinding observed in PM6 and fit the repulsion so that, together with the PM6-D2 potential, both the position and the depth of the minima match the MP2 reference data. The dispersion correction is needed to obtain a good fit at longer distances. This correction is applied to all of the halogen···acceptor pairs in the system. The parameters are derived separately for each combination of the halogen (Cl, Br, and I) and the acceptor (O and N); we have not found a way to introduce global parameters while retaining sufficient accuracy.

This correction does not affect the electronic structure of the system; therefore it can be applied a posteriori to the SCF calculation. We used a separate program to accomplish this, but it would be straightforward to include it in the semiempirical code by adding it as an additional core–core term.
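As an illustration of how Eq. (1) can be evaluated a posteriori, the following minimal Python sketch sums the pairwise repulsion over a list of atoms using the parameters of Table 1. It is not the program used in this work: the function names and the input format are hypothetical, and the sketch assumes that every halogen···acceptor pair present in the list contributes, with no cutoff and no exclusion of pairs within one molecule.

```python
import math

# (a in kcal/mol, b in 1/Angstrom) for each (halogen, acceptor) pair,
# copied from Table 1.
PARAMS = {
    ("Cl", "O"): (4.6783e8, 6.867),
    ("Br", "O"): (9.6021e3, 2.900),
    ("I", "O"): (6.0912e5, 4.154),
    ("Cl", "N"): (1.0489e12, 9.946),
    ("Br", "N"): (1.0226e5, 3.236),
    ("I", "N"): (1.2751e12, 9.534),
}


def pair_correction(halogen, acceptor, r):
    """Eq. (1) for one pair; r in Angstrom, result in kcal/mol."""
    a, b = PARAMS[(halogen, acceptor)]
    return a * math.exp(-b * r)


def halogen_bond_correction(atoms):
    """Sum Eq. (1) over all halogen-acceptor pairs.

    atoms: list of (element, (x, y, z)) with coordinates in Angstrom.
    Assumption of this sketch: all pairs are included, no cutoff.
    """
    total = 0.0
    for el_x, xyz_x in atoms:
        for el_a, xyz_a in atoms:
            # Only ordered (halogen, acceptor) pairs are keys of PARAMS,
            # so each physical pair is counted exactly once.
            if (el_x, el_a) in PARAMS:
                total += pair_correction(el_x, el_a, math.dist(xyz_x, xyz_a))
    return total


# Toy two-atom input (not a real complex): Br and O separated by 3.0 Angstrom.
toy = [("Br", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0))]
e_corr = halogen_bond_correction(toy)
```

The returned energy would simply be added to the PM6-D2 total energy; because the term depends only on nuclear positions, it leaves the SCF solution unchanged.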

2.3. Computational details

The MP2 calculations were carried out using the Molpro 2010 package [20]. MOPAC 2009 [21] was used for the PM6 calculations.

Figure 2. The geometry of bromobenzene···acetone (left) and bromobenzene···trimethylamine (right) complexes in an optimal halogen-bonding arrangement. The transparent spheres represent the van der Waals radii of the atoms.

Table 1. The parameters in the correction for all of the halogen-bond types.

Halogen   Acceptor   a (kcal/mol)       b (Å⁻¹)
Cl        O          4.6783 × 10^8      6.867
Br        O          9.6021 × 10^3      2.900
I         O          6.0912 × 10^5      4.154
Cl        N          1.0489 × 10^12     9.946
Br        N          1.0226 × 10^5      3.236
I         N          1.2751 × 10^12     9.534

Figure 3. The dissociation curves of the six halogen-bond types considered, calculated using the MP2 (dotted black line), PM6-D2 (red) and the newly developed PM6-D2X (blue). All of the plots use the same scale to allow a comparison of the strength of the interaction.

3. Results and discussion

3.1. Parametrisation

The two parameters in the correction were optimised for each halogen-bond type using the least squares algorithm. In some cases, the fit over the whole dissociation curve resulted in an incorrect description of the position and depth of the minima, which are the most important features we want to reproduce. This is because the exponential repulsion is not flexible enough to correct the potential both at the equilibrium distance and at very short distances. In these cases, we removed the first few points of the dissociation curve until the fit reproduced the minimum well. This led to a steeper repulsion at very short distances, but this region is not important in practical calculations. The parameters obtained from this procedure are summarised in Table 1, along with the results obtained in the minimum. In some cases, the exponent b is large and the repulsion is very steep at shorter distances; the correction would overestimate the repulsion at distances shorter than approximately 80% of the equilibrium distance. However, such intermolecular distances are unlikely to be encountered in a relaxed geometry. The corrected and uncorrected dissociation curves are plotted in Figure 3.

The average error (compared to the MP2 reference calculations) is 0.09 kcal/mol (3% of the average interaction energy); the largest one in the set is 0.25 kcal/mol in the iodobenzene···methylamine complex. This is the most extreme halogen bond in our set, with an interaction energy of −5.8 kcal/mol. Another problem was encountered in the chlorobenzene···acetone complex, where the repulsive correction cannot correct the underestimated binding at longer distances. PM6-D2X therefore yields the correct interaction energy, but the equilibrium distance is about 0.2 Å too short. Overall, the fit is very good, and the errors due to the simple form of the correction are much smaller than the error in the reference data.
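The fitting step can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used for Table 1: the target is taken to be the difference between the reference and PM6-D2 curves, the fit is a log-linear least squares (which weights the points differently from a least-squares fit on the energies themselves and requires positive differences), and the `skip` argument mimics the removal of the first few short-distance points. The function name is hypothetical and the data are synthetic, generated from known parameters rather than taken from the calculations.

```python
import math


def fit_exponential(r, delta, skip=0):
    """Fit delta ~ a * exp(-b * r) by linear least squares on ln(delta).

    r: distances (Angstrom); delta: E_ref - E_PM6-D2 (kcal/mol), all > 0.
    skip: number of shortest-distance points left out of the fit.
    Returns (a, b).
    """
    pts = sorted(zip(r, delta))[skip:]
    xs = [x for x, _ in pts]
    ys = [math.log(d) for _, d in pts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of ln(delta) vs r is -b
    a = math.exp(mean_y - slope * mean_x)  # intercept is ln(a)
    return a, -slope


# Synthetic, illustrative differences generated from known parameters
# (not data from the paper).
r_grid = [2.6 + 0.2 * i for i in range(8)]
delta = [9.6021e3 * math.exp(-2.900 * x) for x in r_grid]

a_fit, b_fit = fit_exponential(r_grid, delta, skip=2)
```

On exactly exponential input the fit recovers the generating parameters; with real curves one would increase `skip` until the position and depth of the corrected minimum agree with the reference.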

Table 2. The interaction energies in substituted bromobenzenes (in kcal/mol). A comparison of the PM6-D2X with the MP2 reference calculations.

Complex                     MP2/aug-cc-pVDZ   PM6-D2X
Bromobenzene                −2.23             −2.32
2,6-dicyanobromobenzene     −3.71             −3.07
Pentafluorobromobenzene     −4.08             −3.27
Meta-C6O2H3Br               −4.30             −4.17
3,5-diaminobromobenzene     −1.79             −1.77

3.2. Validation

We applied the PM6-D2X method to a set of substituted bromobenzenes (2,6-dicyanobromobenzene, pentafluorobromobenzene, meta-C6O2H3Br and 3,5-diaminobromobenzene) interacting with acetone. The substitutions on the aromatic ring affect the strength of the halogen bond by several mechanisms. Firstly, they change the electron density on the halogen atom, which is directly related to the halogen bonding. Secondly, they change the overall electrostatic interaction in the complex; some of the substitutions reverse the dipole moment of the molecule. Finally, the substituents interact with the other molecule by means other than halogen bonding. In the complexes studied, all but the 3,5-diaminobromobenzene strengthen the interaction; the changes in the binding energy can be as large as 100%.

The question is whether these effects can be described by a semiempirical method with a correction that is not based on electronic properties. From the results (listed in Table 2), it is clear that the PM6-D2X method reliably describes the changes in halogen bonding upon substitution. The average error is 0.4 kcal/mol (12% of the interaction energy), and the ordering of the complexes is the same as that in the reference MP2 calculations. The largest error (0.8 kcal/mol) is observed in pentafluorobromobenzene, where the changes in electron density are very large.
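The error statistics quoted above can be re-derived from the entries of Table 2 with a few lines of Python. The sketch assumes that the average is taken over the four substituted complexes only (unsubstituted bromobenzene excluded); this is an assumption of the example, chosen because it reproduces the quoted values, and the variable names are illustrative.

```python
# Interaction energies in kcal/mol, copied from Table 2:
# (MP2/aug-cc-pVDZ reference, PM6-D2X).
SUBSTITUTED = {
    "2,6-dicyanobromobenzene": (-3.71, -3.07),
    "pentafluorobromobenzene": (-4.08, -3.27),
    "meta-C6O2H3Br": (-4.30, -4.17),
    "3,5-diaminobromobenzene": (-1.79, -1.77),
}

# Absolute error of PM6-D2X with respect to MP2 for each complex.
abs_err = {name: abs(pm6 - ref) for name, (ref, pm6) in SUBSTITUTED.items()}

mean_abs_err = sum(abs_err.values()) / len(abs_err)
mean_ref = sum(abs(ref) for ref, _ in SUBSTITUTED.values()) / len(SUBSTITUTED)
rel_err = mean_abs_err / mean_ref          # error relative to mean |E_MP2|
worst = max(abs_err, key=abs_err.get)      # complex with the largest error

# Ordering of the complexes from strongest to weakest binding in each method.
order_mp2 = sorted(SUBSTITUTED, key=lambda n: SUBSTITUTED[n][0])
order_pm6 = sorted(SUBSTITUTED, key=lambda n: SUBSTITUTED[n][1])
same_order = order_mp2 == order_pm6
```

Under this assumption the mean absolute error is 0.40 kcal/mol, about 12% of the mean reference interaction energy, the largest deviation is found for pentafluorobromobenzene, and both methods rank the complexes identically.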

4. Conclusion

The correction for halogen bonding presented here is a purely empirical fix of a serious fault found in the PM6 method. Nevertheless, we have found this approach to be reliable and to yield results very close to the correlated QM calculations. We believe that such an empiricism is acceptable in order to correct a method that cannot describe halogen bonding by definition. The correction was fitted so that it would reproduce the MP2 interaction energies with an error of 3% and yield equilibrium halogen-bond distances with an average accuracy better than 0.1 Å. In uncorrected PM6 (in optimised geometries), the average error in the interaction energy is 300% and the distances are shorter by 0.7 Å on average.

The method has been tested on a set of complexes where the substitutions on the aromatic ring modulate the strength of the halogen bond. The PM6-D2X results reproduce the MP2 reference with an average error of 12% and yield a correct ordering of the interaction energies. These results indicate that the method can be safely applied to a wide range of systems where the halogen bonds are strongly affected by the environment.

The correction extends the applicability of the semiempirical methods to a novel class of problems with many potential applications. We applied the PM6-DH2X method (PM6 with dispersion, hydrogen-bond [16] and halogen-bond corrections) to the interaction of casein kinase 2 with inhibitors based on tetrabromobenzotriazole [22]. In this system, the halogen bonds are a key part of the mechanism of the inhibition. Our calculations there yield geometries close to the X-ray structures and binding free energies that correlate perfectly with the experimentally measured inhibitor activities. Such results cannot be achieved with any other method at comparable expense.

Acknowledgement

This work was a part of Research Project No. Z40550506 of the Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, and was supported by Grants No. LC512 and MSM6198959216 from the Ministry of Education, Youth and Sports of the Czech Republic. The support of Praemium Academiae, Academy of Sciences of the Czech Republic, awarded to P.H. in 2007 is also acknowledged.

References

[1] P. Metrangolo, G. Resnati (Eds.), Halogen Bonding, Springer, Berlin Heidelberg, 2008.
[2] P. Auffinger, F.A. Hays, E. Westhof, P.S. Ho, P. Natl. Acad. Sci. USA 101 (2004) 16789.
[3] P. Politzer, J.S. Murray, T. Clark, Phys. Chem. Chem. Phys. 12 (2010) 7748.
[4] P. Politzer, P. Lane, M.C. Concha, Y. Ma, J.S. Murray, J. Mol. Model. 13 (2006) 305.
[5] P. Metrangolo, G. Resnati, Chem. Eur. J. 7 (2001) 2511.
[6] K. Riley, P. Hobza, J. Chem. Theory Comput. 4 (2008) 232.
[7] J.J.P. Stewart, J. Mol. Model. 15 (2008) 765.
[8] A.M. Wollacott, K.M. Merz, J. Chem. Theory Comput. 4 (2007) 1609.
[9] J. Fanfrlík, A. Bronowska, J. Řezáč, O. Přenosil, J. Konvalinka, P. Hobza, J. Phys. Chem. B 114 (2010) 12666.
[10] P. Dobeš, J. Fanfrlík, J. Řezáč, M. Otyepka, P. Hobza, J. Comput. Aid. Mol. Des. (2011) 000.
[11] K. Raha, K.M. Merz, J. Am. Chem. Soc. 126 (2004) 1020.
[12] K. Raha, K.M. Merz, J. Med. Chem. 48 (2005) 4558.
[13] K. Raha, M.B. Peters, B. Wang, N. Yu, A.M. Wollacott, L.M. Westerhoff, K.M. Merz, Drug Discov. Today 12 (2007) 725.
[14] T. Zhou, A. Caflisch, Chem. Med. Chem. 5 (2010) 1007.
[15] J. Řezáč, J. Fanfrlík, D. Salahub, P. Hobza, J. Chem. Theory Comput. 5 (2009) 1749.
[16] M. Korth, M. Pitoňák, J. Řezáč, P. Hobza, J. Chem. Theory Comput. 6 (2010) 344.
[17] J.J.P. Stewart, J. Mol. Model. 13 (2007) 1173.
[18] K.E. Riley, J.S. Murray, P. Politzer, M.C. Concha, P. Hobza, J. Chem. Theory Comput. 5 (2009) 155.
[19] K.E. Riley et al., J. Mol. Model. (2011).
[20] H. Werner, P.J. Knowles, F.R. Manby, M. Schütz, and others, MOLPRO version 2010.1, a package of ab initio programs, 2010.
[21] J.J.P. Stewart, MOPAC 2009, Stewart Computational Chemistry, Colorado Springs, CO, USA, 2009.
[22] P. Dobeš, J. Řezáč, J. Fanfrlík, M. Otyepka, and P. Hobza, manuscript in preparation.