Scoring optimisation of unbound protein–protein docking including protein binding site predictions

Sebastian Schneider and Martin Zacharias*

The prediction of the structure of protein–protein complexes is of great importance to better understand molecular recognition processes. During systematic protein–protein docking, the surface of a protein molecule is scanned for putative binding sites of a partner protein. The possibility to include external data based on either experiments or bioinformatic predictions on putative binding sites during docking has been systematically explored. The external data were included during docking with a coarse-grained protein model in the form of force field weights to bias the docking search towards a predicted or known binding region. The approach was tested on a large set of protein partners in unbound conformations. A significant improvement of the docking performance was found if reliable data on the native binding sites were available. This was possible even if only data for single key amino acids at a binding interface were included. In case of binding site predictions with limited accuracy, only modest improvement compared with unbiased docking was found. Optimisation of the protocol to bias the search towards predicted binding sites was found to further improve the docking performance, resulting in approximately 40% acceptable solutions within the top 10 docking predictions compared with 22% in case of unbiased docking of unbound protein structures. Copyright © 2011 John Wiley & Sons, Ltd.

Keywords: protein–protein interaction; binding site prediction; biased force field; docking by energy minimisation; protein–protein complex formation

INTRODUCTION

The realistic prediction of protein–protein complex structures (protein–protein docking) is of major importance because only a small fraction of real and putative protein–protein interactions in a cell can be determined experimentally. Some interactions are only transient or weak such that the experimental determination of a complex structure is difficult or even impossible. Computational protein–protein docking methods are of increasing importance to generate at least model structures of the possible protein–protein complex. Protein–protein docking approaches can also be helpful to evaluate newly designed protein–protein interactions that are of increasing interest in the area of biotechnology and medicinal chemistry. Several classes of docking methods have been developed (reviewed in Pons et al., 2009; Moreira et al., 2009; Zacharias, 2010a). These methods systematically search for putative protein–protein binding geometries using various surface matching approaches or use force field–based energy minimisation or related optimisation procedures. The initial stage typically consists of a systematic docking search keeping partner structures rigid. Subsequently, one or more refinement and scoring steps of a set of preselected rigid docking solutions are added to achieve closer agreement with the native geometry and to recognise near-native docking solutions preferentially as the best or among the best scoring complexes (Pierce & Weng, 2007, 2008; Pons et al., 2009; Zacharias, 2010b). It is also possible to re-evaluate the docked complexes according to available experimental data on a putative binding surface or based on predictions from bioinformatics binding site prediction approaches (Ben-Zeev & Eisenstein, 2003; Gottschalk et al., 2004; Zhang et al., 2005; de Vries & Bonvin, 2008; Huang & Schröder, 2008; Kowalsman & Eisenstein, 2009; Liang et al., 2006, 2009).

Alternatively, experimental or prediction data can also be included directly during the docking search as restraints in force field–based docking approaches (de Vries et al., 2006; Melquiond & Bonvin, 2010). An example of this type of approach is the HADDOCK programme (Dominguez et al., 2003), which uses data-derived restraints during molecular dynamics and energy minimisation to drive the docking towards a target region. A drawback of including experimental data or binding site predictions during docking is the restriction of the search such that at least some of the data-derived restraints must be fulfilled. If the data are incorrect or too inaccurate, they may interfere with the success of the docking procedure (de Vries et al., 2010).

Another option is to include external data as weights on putative interacting (or noninteracting) atoms of the partner molecules. In case of force field–based docking methods, a weight larger than 1 on an atom implies a stronger interaction with atoms of the partner, resulting in an improved score of solutions including the labelled (weighted) atoms, and in addition it also enhances sampling of the labelled region during the search. Compared with restraint-driven approaches, the weighting of interactions does not exclude the sampling of regions not covered by the external data, and an additional advantage is the possibility to only use available data on one partner structure. The possibility to include external data as weights has already been used in

* Correspondence to: Martin Zacharias, Physik-Department T38, Technische Universität München, James Franck Str. 1, 85748 Garching, Germany. E-mail: [email protected]

S. Schneider, M. Zacharias
Physik-Department T38, Technische Universität München, James Franck Str. 1, 85748 Garching, Germany

Research Article

Received: 16 June 2011, Revised: 19 September 2011, Accepted: 23 September 2011, Published online in Wiley Online Library: 2011

(wileyonlinelibrary.com) DOI: 10.1002/jmr.1165

J. Mol. Recognit. (2011): 15–23 Copyright © 2011 John Wiley & Sons, Ltd.


combination with fast Fourier transform correlation-based docking methods (Ben-Zeev & Eisenstein, 2003) by selecting the highest scoring translations per orientation of the ligand protein relative to the receptor. In the present study, the possibility to include either experimental data or bioinformatics predictions in the form of force field weights during docking was explored using the ATTRACT docking programme. The ATTRACT protein–protein docking approach (Zacharias, 2003; May & Zacharias, 2008; Fiorucci & Zacharias, 2010) is based on a coarse-grained protein model intermediate between a residue-level and a full atomistic description. During docking, the protein partners are represented by several (up to four) pseudo-atoms per amino acid residue. Docking calculations take into account not only surface complementarity but also the physicochemical character of interacting amino acids. Systematic docking is performed by energy minimisation starting from thousands of start configurations. Recently, the force field parameters have been optimised to identify near-native docking minima among a large set of incorrect decoy complexes (Fiorucci & Zacharias, 2010).

To further evaluate the performance of the parameters in this study, we performed systematic docking on a set of complexes with available partner structures in the unbound conformation (Mintseris et al., 2005). The effect on docking performance of including knowledge on the binding site, or binding site predictions, in the form of force field weights was systematically tested. Experimental data on even single interface residues can be beneficial for docking. Although the inclusion of predicted interface data overall also increased the docking performance, the type of weighting as well as the accuracy of binding site predictions has a significant influence on docking performance. Possibilities to optimise the inclusion of binding site predictions have also been explored.

MATERIALS AND METHODS

Protein–protein docking

The ATTRACT docking programme (Zacharias, 2003; Fiorucci & Zacharias, 2010) uses a coarse-grained protein representation with two pseudo-atoms per residue representing the main chain (located at the backbone nitrogen and backbone oxygen atoms, respectively). Small amino acid side chains (Ala, Asp, Asn, Cys, Ile, Leu, Pro, Ser, Thr and Val) are represented by one pseudo-atom (geometric mean of side chain heavy atoms). Larger and more flexible side chains are represented by two pseudo-atoms to account for the shape and dual chemical character of some side chains. Effective interactions between pseudo-atoms are described by soft distance ($r_{ij}$)-dependent Lennard–Jones (LJ)-type potentials of the following form:

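Attractive pair:

$$V = \varepsilon_{AB}\left[\left(\frac{R_{AB}}{r_{ij}}\right)^{8} - \left(\frac{R_{AB}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}}$$

Repulsive pair:

$$V = -\varepsilon_{AB}\left[\left(\frac{R_{AB}}{r_{ij}}\right)^{8} - \left(\frac{R_{AB}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}} \quad \text{if } r_{ij} > r_{\min}$$

$$V = 2e_{\min} + \varepsilon_{AB}\left[\left(\frac{R_{AB}}{r_{ij}}\right)^{8} - \left(\frac{R_{AB}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}} \quad \text{if } r_{ij} \le r_{\min}$$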

where $R_{AB}$ and $\varepsilon_{AB}$ are effective pairwise radii and attractive or repulsive LJ parameters. At the distance $r_{\min}$ between two pseudo-atoms, the standard LJ potential has the energy $e_{\min}$. A Coulomb-type term accounts for electrostatic interactions between real charges (Lys, Arg, Glu and Asp) damped by a distance-dependent dielectric constant ($\varepsilon = 15r$). This form allows for purely repulsive interacting pseudo-atom pairs. The attractive and repulsive parameters for each pseudo-atom pair were iteratively optimised by minimising the root mean square deviation (RMSD) of near-native docking minima and by comparing the scoring of near-native minima with many high-scoring decoy complexes (published in Fiorucci & Zacharias, 2010). Using bound partner structures, the parameter set ranks near-native solutions within the top 10 scoring complexes in 90% of the test cases (Fiorucci & Zacharias, 2010).

Systematic docking was performed starting from many starting points (spaced by ~8 Å) of the smaller protein (ligand) on the surface of the larger partner (receptor) and using approximately 250 different starting orientations. Each docking run consisted of a set of energy minimisations in translational and orientational variables following published protocols (Zacharias, 2003). Exactly the same starting placements and docking minimisation conditions were used for unbiased docking runs and for docking including weights on the force field contributions from putative binding site residues, allowing direct comparison. The unbound docking test set consisted of 82 complexes in the unbound form from a published benchmark set (Mintseris et al., 2005). The benchmark consists of 63 rigid complexes (interface RMSD between bound and unbound on average 0.82 Å), 13 medium difficult (interface RMSD 1.63 Å) and 8 difficult cases (interface RMSD 3.67 Å); 23 of those complexes are enzyme inhibitor or enzyme substrate, 10 antibody antigen, 12 antigen-bound antibody and 39 other complexes (Mintseris et al., 2005). The structures with the pdb (protein data bank) codes 1N2C (medium difficulty, other type) and 2VIS (rigid case, antigen–antibody complex) were left out because of difficulties with binding site predictions using the meta-PPISP binding site prediction server (Qin & Zhou, 2007). Docked complexes were clustered, and acceptable solutions according to the CAPRI criteria (Lensink et al., 2003) were collected and ranked only according to the ATTRACT score. The clustering of solutions was performed starting from the lowest energy (best scoring) complexes using a cutoff of 5 Å between any two solutions for the RMSD of the ligand protein after superposition of the receptor proteins (RMSDlig). A solution is considered acceptable if RMSDlig is <10 Å and the fraction of native contacts relative to the native complex is ≥0.1, or if the fraction of native contacts alone is ≥0.3.

Various types of weighting of the interaction potentials were evaluated. These included the increase of all the interactions of a selected pseudo-atom by a factor of 1.5 or 2.0 relative to the unbiased force field as well as weighting (scaling) only the attractive interactions. The details on scaling only selected residues or patches around a putative binding site are given in the Results and Discussion section as well as in the figure legends. Protein binding site predictions were obtained using the meta-PPISP server. The meta-PPISP server collects predictions from three different methods (cons-PPISP (Chen & Zhou, 2005), PINUP (Liang et al., 2006) and Promate (Neuwirth et al., 2004)) and forms a consensus (meta) prediction. It is considered one of the most successful binding site prediction methods (Zhou & Qin, 2007).
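As an illustration only, the piecewise pair potential of the Methods section and the two weighting variants (all interactions versus attractive interactions only) might be sketched as follows. This is not the ATTRACT implementation. The function names are hypothetical, and several points are assumptions not stated in the text: the position of the 8-6 minimum is derived from the functional form, e_min is taken as the magnitude of the 8-6 energy at r_min so that the two repulsive branches join continuously, the weight is applied to the LJ-type term only, and the Coulomb term is written without a unit conversion factor.

```python
import math


def lj86(r, eps, radius):
    """Soft 8-6 LJ-type term: eps * [(R/r)^8 - (R/r)^6]."""
    x = radius / r
    return eps * (x ** 8 - x ** 6)


def pair_energy(r, eps, radius, attractive, qi=0.0, qj=0.0,
                weight=1.0, attractive_only=True):
    """Energy of one pseudo-atom pair at distance r.

    weight          -- bias factor of a labelled pseudo-atom (1.0 = unbiased)
    attractive_only -- if True, only attractive pairs are scaled (assumption)
    """
    # Minimum of the 8-6 form: dV/dr = 0 gives r_min = R * sqrt(4/3).
    r_min = radius * math.sqrt(4.0 / 3.0)
    # Assumption: e_min is used as a positive magnitude.
    e_min = abs(lj86(r_min, eps, radius))

    if attractive:
        v_lj = lj86(r, eps, radius)
    elif r > r_min:
        v_lj = -lj86(r, eps, radius)
    else:
        v_lj = 2.0 * e_min + lj86(r, eps, radius)

    # Assumption: the weight scales the LJ-type term only.
    if attractive or not attractive_only:
        v_lj *= weight

    # Coulomb-type term with distance-dependent dielectric eps(r) = 15 r
    # (no unit conversion factor; purely illustrative).
    v_coul = qi * qj / (15.0 * r * r)
    return v_lj + v_coul
```

With `weight=2.0` and `attractive_only=True`, the energy of an attractive pair doubles while a repulsive pair is left unchanged, which corresponds to the "attractive" weighting variant under the assumptions above.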

RESULTS AND DISCUSSION

Systematic docking including force field weights on known binding regions

The ATTRACT scoring function for evaluating docked protein–protein complexes is based on a coarse-grained representation


of the protein partners intermediate between atomic resolution and residue-based models. A recently designed force field based on optimising the scoring of native interfaces relative to a large set of decoy surfaces (Fiorucci & Zacharias, 2010) gave very good performance on bound partner structures with near-native solutions in the top 10 ranked complexes in approximately 90% of the test cases. To test the performance on unbound partner structures, we performed systematic docking searches on a benchmark set of unbound partner structures (Mintseris et al., 2005). After clustering of the results (see Methods section), in 55 of 82 docking cases (~65%), at least one acceptable solution

(according to the CAPRI criteria; Lensink et al., 2007; see also legend of Figure 1) was found in the top 100 of all docking solutions (Figure 1a). This is quite remarkable because other methods typically achieve this only in approximately 50% of the benchmark cases and typically only after rescoring including a variety of physicochemical and/or bioinformatics parameters (Cheng et al., 2007; Pierce & Weng, 2008; Liang et al., 2009; de Vries & Bonvin, 2011). Also, no knowledge on putative binding regions of the partner structures was included. However, for many docking cases of biological relevance, experimental data on a putative protein binding interface in the form of

Figure 1. (A) Comparison of various types of weighting (scaling) of interactions that involve coarse-grained atoms belonging to a known interface. The term attractive refers to scaling only the attractive contribution (otherwise both repulsive and attractive) by a factor of 2.0 or 1.5, respectively, compared with the unbiased docking run. The term 'rescore' indicates that the docking solutions obtained from an unbiased search were rescored using an attractive weight of 2.0 on interface atoms. The ranking (x-axis) indicates the rank of the best acceptable solution (according to the CAPRI criteria; see Methods section). The y-axis indicates the percentage of solutions (out of 82) in each category. (B) Effect of modifying the part of the surface for which attractive interactions were increased by 1.5. Interface centre indicates weighting of only half of the known interface residues, whereas doubled interface means weighting twice as many residues but including the known interface residues. Boxes for approximately 35% overlap indicate weights on an artificial random protein surface area that includes on average approximately 35% of the known interface, whereas 50% of protein surface indicates weights on 50% of both proteins' total surfaces including the known binding region. Random indicates weighted random artificial binding sites that do not overlap with the known binding site (see also legend of Table 1).


mutagenesis data or evolutionary conservation are available. Such data can be included during docking in the form of data-driven restraints (e.g. used in the docking programme HADDOCK) or for re-evaluation after an unrestrained docking search (Huang & Schröder, 2008; Pons et al., 2009).

The possibility to include data on putative binding sites in the form of weights on the force field contributions is ideal for the ATTRACT docking method, which is based on docking by energy minimisation in a force field that drives the interaction between two proteins. Different combinations of weights for binding and nonbinding regions were first tested for the case of known protein binding regions on both partner structures. An upper limit of 2.0 was used for the weights on those atoms that are known to be involved in binding. Although this seems like a small weight in case of a known binding region, it was found that higher weights were not beneficial in case of less reliable data on a binding site (see following paragraphs), and a weight of 2.0 was therefore chosen as upper limit. As expected, the inclusion of data on the known protein binding sites significantly enhances the docking performance and increases the number of successful docking cases with acceptable solutions in the top 1 or top 2–10 category. Best results were achieved with an upscaling of only attractive interactions involving pseudo-atoms in the known interface by factors of 1.5 or 2.0.
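The acceptance criterion stated in the Methods section and the rank categories used to summarise the results can be sketched as below. This is a minimal illustration, assuming that each clustered solution is described by its RMSDlig (in Å) and its fraction of native contacts and that the list is already sorted by score. The function names and the category labels are hypothetical.

```python
def is_acceptable(rmsd_lig, fnat):
    """Acceptable if RMSDlig < 10 A and fnat >= 0.1, or if fnat >= 0.3."""
    return (rmsd_lig < 10.0 and fnat >= 0.1) or fnat >= 0.3


def best_acceptable_rank(solutions):
    """Rank (1-based) of the first acceptable solution, or None.

    solutions -- list of (rmsd_lig, fnat) tuples, best score first.
    """
    for rank, (rmsd_lig, fnat) in enumerate(solutions, start=1):
        if is_acceptable(rmsd_lig, fnat):
            return rank
    return None


def rank_category(rank):
    """Map a rank to a reporting category (labels are illustrative)."""
    if rank is None:
        return "none"
    if rank == 1:
        return "top 1"
    if rank <= 10:
        return "top 2-10"
    if rank <= 100:
        return "top 11-100"
    return ">100"


# Hypothetical ranked solutions for one complex.
ranked = [(25.0, 0.00), (12.0, 0.35), (8.0, 0.20)]
category = rank_category(best_acceptable_rank(ranked))
```

In this made-up example the second solution is accepted through the native-contact criterion alone, so the complex would be counted in the "top 2-10" category.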

The fact that there are still cases in which acceptable solutions were only found in the top 10–100 category (or not at all) is due to the large conformational differences between unbound and bound partner structures in these cases (which belong to the difficult category of the benchmark set). In addition, increasing the attraction of a known interface region does not restrict the relative orientation of protein partners upon binding. Including weights not only affected the scoring but also produced more acceptable solutions overall. Whereas 952 acceptable clusters of solutions were found in the reference (unbiased) search (for all complexes together), this number more than tripled when weights on known interface residues were included (e.g. 2886 in case of weighting attractive interactions by 2.0). In addition, the number of native contacts for the best acceptable solution increased from on average 51% (SD = 23%) in the reference search to 58% (SD = 23%) in docking searches including weights. The top 10/top 100 results included structures with on average significantly more native contacts (25% ± 11%/12% ± 5%) compared with the reference docking search (4% ± 5%/2% ± 2%).

Instead of including weights during docking, it is also possible to rescore solutions from the reference docking search using the force field weights. Rescoring can change the number of acceptable solutions within a given range of ranking but cannot improve the quality or number of acceptable solutions. For docking including weights (2.0) on known binding site residues, acceptable solutions were found on rank 1 in 56 of the 82 cases, whereas only 49 top 1 solutions were found using rescoring of unbiased docking runs (Figure 1). For three complexes, no acceptable solution could be found in the reference run (and therefore also not upon rescoring). The weighted docking runs found acceptable solutions for two of those complexes (pdb1H1V, rank 108 and ligand RMSD 8.9 Å, and pdb1IBR, rank 22 and RMSD 8.6 Å, both classified as difficult in the benchmark). Overall, using weights during docking resulted in an average rank of acceptable structures of 5.4, compared with 10.3 for rescoring. Especially for difficult and medium difficult cases, the use of weights during docking improves results (compared with rescoring unbiased searches). Docking with weights performed better in nine medium and difficult cases, and for two structures we found acceptable solutions that had not been found during unbiased docking.

In realistic docking cases, the protein binding site is often only approximately known. To mimic such conditions, we generated artificial binding sites (representing possible predictions or known regions), which represented different scenarios in terms of sensitivity and specificity of the approximately known interaction regions (Table 1, Figure 1b). A random placement of putative binding regions (as a continuous patch) on the protein surface excluding the known binding site (prediction with zero sensitivity and specificity) resulted in a significant drop of the docking performance, even below the performance of the reference docking search (without any weights). However, in the top 1000 (top 5000), acceptable predictions could be found for 44% (89%) of the test cases. For four complexes, results within the top 100

Table 1. Force field weighting of artificial protein surface binding sites

| Name | Receptor fraction (%) | Ligand fraction (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| Interface centre | 4 | 8 | 40 | 100 |
| Doubled interface | 18 | 36 | 100 | 55 |
| Overlapping site | 10 | 20 | 35 | 35 |
| 50% of total | 48 | 47 | 100 | 33 |
| Random assignment | 8 | 16 | <1 | <1 |
| Known binding sites | 10 ± 4 | 20 ± 9 | – | – |

Receptor and ligand fractions correspond to the average fraction of protein surface residues that carried weights compared with the total number of surface residues taken from the meta-PPISP predictions. For the definition of sensitivity and specificity, see legend of Table 2. 'Interface centre' indicates the inclusion of force field weights on only the central 50% of the known binding site residues on both partners. 'Doubled interface' refers to artificial binding sites with twice the size of the corresponding known binding site and including the known binding site completely. 'Overlapping site' indicates weights on continuous random surface patches with partial overlap with the known binding sites on both partners, and '50% of total' means weights on 50% of the proteins' surface residues including the known binding region. For comparison, 'random assignment' (always excluding the known binding region, if possible) and the average fraction of the known binding regions with respect to the total surface for all protein complexes are also included. For all patches, the approximate number of binding site residues of the individual cases was used to create artificial continuous binding site patches.




could be found despite a weighting as wrong as possible. This indicates a distinct advantage of the present technique of including putative binding site data as force field weights instead of restricting the search to predicted interaction regions (e.g. as restraints). For example, the inclusion of incorrect predictions (zero sensitivity and zero specificity) as distance or contact restraints would have resulted in complete failure of the docking searches for all complexes (no solutions in the top 100 category).

A binding patch that was twice as large as the known binding site (but included the binding site completely, representing 100% sensitivity and on average 55% specificity, and therefore called doubled interface) gave similar docking performance compared with a small predicted site including only the centre of the interface (representing 100% specificity and ~40% sensitivity; Table 1 and Figure 1b). In both cases, approximately 80% of acceptable solutions scored on the top rank or on ranks 2–10. Extending the binding region to 50% of all surface residues (including the known binding region completely) still resulted in acceptable solutions within the top 10 in 63% of the cases. However, reducing the precision of the artificial binding site down to approximately 35% sensitivity and specificity, respectively, significantly decreased the performance towards results not much better than the reference run (Figure 1b, Table 1, overlapping site).

Because predictions or experimental knowledge are often of limited accuracy, the results of the test runs indicated that an overprediction of the binding site (too large but including the complete native binding site) gave the best improvement of the docking performance.

Inclusion of knowledge on single residues

Frequently, mutagenesis experiments on proteins can identify residues that are likely to be part of a protein–protein binding interface. Docking searches including weights only on one partner (receptor) and on one central core residue were performed. This already shifts the score distribution of acceptable solutions significantly towards better ranking solutions (Figure 2a). Knowing one single residue of one of the partners increased the number of acceptable solutions in the top 10 from approximately 22% in the reference run to approximately 36%. Increasing the weights also for residues in the neighbourhood of the selected residue further improved the docking results. Similarly, picking three random positions in the known interface (instead of a central interface residue) and increasing the interaction weight around them according to the procedure given in the legend of Figure 2 resulted in acceptable docking solutions in the top 10 category in 51% of the cases (80% in the top 100 category).
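The patch weighting around selected residues, as described in the legend of Figure 2 for the three-residue case (w = 1.5 − 0.0625 × distance within 8 Å, with the higher weight used where patches overlap), might look as follows. This is a minimal sketch: the function name, the residue identifiers and the use of one representative coordinate per residue are assumptions made for illustration.

```python
import math


def patch_weights(coords, centres, w_max=1.5, slope=0.0625, radius=8.0):
    """Linearly decreasing attractive weights around selected residues.

    coords  -- dict: residue id -> (x, y, z), one point per residue (assumed)
    centres -- residue ids of the selected (e.g. mutagenesis-derived) residues
    Residues farther than `radius` from every centre keep weight 1.0.
    """
    weights = {}
    for res, xyz in coords.items():
        w = 1.0
        for centre in centres:
            d = math.dist(xyz, coords[centre])
            if d <= radius:
                # Where patches overlap, the higher weight is used.
                w = max(w, w_max - slope * d)
        weights[res] = w
    return weights


# Hypothetical residues on a line, two selected centres (A and D).
coords = {"A": (0.0, 0.0, 0.0), "B": (4.0, 0.0, 0.0),
          "C": (20.0, 0.0, 0.0), "D": (6.0, 0.0, 0.0)}
weights = patch_weights(coords, centres=["A", "D"])
```

Residue B lies 4 Å from A and 2 Å from D and therefore receives the larger of the two candidate weights, while C is outside both patches and stays unbiased.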

Inclusion of predictions on protein–protein interaction regions

Several methods for predicting putative protein binding sites on proteins have been developed (reviewed by de Vries & Bonvin, 2008). The meta-PPISP server combines the results of three separate approaches and forms a consensus prediction. It is considered one of the best prediction methods (Zhou & Qin, 2007). The statistical accuracy of the predictions was evaluated for the benchmark set of partner structures (using the default threshold for predictions; Table 2 and Figure 3). On average, a sensitivity of approximately 37% was observed, indicating that on average approximately 37% of the native binding site residues were correctly predicted. This is only slightly better than the test case of generating artificial predicted binding sites with, on average, approximately 35% sensitivity and approximately 35% specificity, as investigated earlier (Figure 1b).
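The per-residue statistics can be computed from sets of residue identifiers, using the definitions given in the legend of Table 2. Note that specificity as defined there, TP / (TP + FP), is the quantity often called precision elsewhere. The sketch below is illustrative only: the function name and the toy residue numbers are hypothetical, and all sets are restricted to surface residues.

```python
def prediction_stats(predicted, native, surface):
    """Sensitivity, specificity and accuracy as defined in Table 2.

    predicted -- residues predicted as binding site (score above threshold)
    native    -- residues in the native interface
    surface   -- all surface residues considered for the statistics
    """
    surface = set(surface)
    predicted = set(predicted) & surface
    native = set(native) & surface

    tp = len(predicted & native)
    fp = len(predicted - native)
    fn = len(native - predicted)
    tn = len(surface - predicted - native)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0  # TP / (TP + FP)
    accuracy = (tp + tn) / len(surface) if surface else 0.0
    return sensitivity, specificity, accuracy


# Toy example: 20 surface residues, 5 native, 4 predicted, 2 in common.
sens, spec, acc = prediction_stats(
    predicted={4, 5, 6, 7}, native={1, 2, 3, 4, 5}, surface=range(1, 21))
```

For this toy case the sensitivity is 2/5, the specificity 2/4 and the accuracy 15/20, which also shows how accuracy can stay high while sensitivity and specificity are low when the interface covers only a small part of the surface.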

To account for the limited precision of predictions, we tested two variants of interaction weighting. In the first method, all residue atoms above a prediction threshold of 0.34 were assigned a constant weight (Figure 4a). In this case, a weight of 1.5 or 2.0 (only for attractive interactions) assigned to predicted residues resulted in improved docking results (Figure 4a). However, the improvement is modest and concerns mostly the docking of complexes in the top categories. In a second set of

Figure 2. Docking performance in case of scaling the interactions of single residues located approximately at the centre of the receptor's protein binding region (only attractive interactions by 1.5). For comparison, the results are also shown for scaling not only a single residue but also residues within 14 Å of the selected residue, in a linearly decreasing manner according to w = 1.5 − (0.0035 × distance) from the selected residue (termed patch), and for three randomly selected residues in the known binding region with surrounding residues within 8 Å of the selected residues, using w = 1.5 − (0.0625 × distance) (3 res, patch). If a residue was near more than one randomly selected residue, the higher weight was used. On average, the surface residue fraction (out of the total surface) for the 14-Å patch was approximately 20%, and for the three-residue patches of 8 Å each, the fraction was approximately 13%.

Table 2. Prediction accuracy of the meta-PPISP server on the test set of unbound protein structures

| | Receptor | Ligand | Average |
|---|---|---|---|
| Sensitivity | 0.34 | 0.40 | 0.37 |
| Specificity | 0.36 | 0.45 | 0.40 |
| Accuracy | 0.87 | 0.76 | 0.81 |

Definition of sensitivity: TP / (TP + FN); specificity: TP / (TP + FP); accuracy: (TP + TN) / (TP + TN + FP + FN). TP, true-positive; TN, true-negative; FP, false-positive; and FN, false-negative. Prediction values ≥0.34 are counted for the statistics; lower values are neglected. The statistics include only surface residues as indicated by the meta-PPISP predictions. Because meta-PPISP provides predictions per residue, all statistics are per residue as well.




systematic docking searches, the prediction values for each residue (from the meta-PPISP server) were directly (linearly) translated into attractive weights (weight range of 1–1.5 or 1–2.0). This gave a further improvement of docking results with approximately 40% acceptable solutions in the top 10 and 77% in the top 100 categories (weight 1.5) compared with 22% and 67% in the reference case without bias (Figure 4b).
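The two ways of turning per-residue prediction values into attractive weights, a constant weight above the 0.34 threshold and the linear mapping given in the legend of Figure 4 (w = 0.5 × score + 1.0 with a maximum of 1.5, or w = 1.0 × score + 1.0 with a maximum of 2.0), can be sketched as follows. The function names and the residue labels are hypothetical, and prediction scores are assumed to lie between 0 and 1.

```python
def constant_weights(scores, weight=1.5, threshold=0.34):
    """Uniform weight for residues at or above the prediction threshold."""
    return {res: (weight if s >= threshold else 1.0)
            for res, s in scores.items()}


def linear_weights(scores, w_max=1.5):
    """Linear mapping w = (w_max - 1) * score + 1, capped at w_max."""
    slope = w_max - 1.0  # 0.5 for w_max = 1.5, 1.0 for w_max = 2.0
    return {res: min(w_max, slope * s + 1.0) for res, s in scores.items()}


# Hypothetical per-residue prediction scores.
scores = {"R10": 0.0, "K25": 0.34, "D40": 0.8}
w_const = constant_weights(scores)
w_lin = linear_weights(scores)
```

In the linear variant a weakly predicted residue such as K25 receives only a small bias, whereas the constant variant treats it like the strongly predicted D40. This is one plausible reading of why the linear scheme copes better with predictions of limited accuracy.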

The inclusion of binding site predictions resulted not only in an overall improved ranking of acceptable solutions but also in an improvement of the quality of the docking solutions in terms of the number of native contacts (Figure 5; the results in terms of interface RMSD or RMSDlig relative to the native placement are similar; data not shown). Such improvement was observed for 16 complexes for the best ranked solutions. Of those structures, 11 could not be sampled at all in the reference run, whereas 5 were ranked worse. For 57 of 82 complexes, the weighted run produced more acceptable solutions and more clusters than the reference run. Neglecting the ranking, the searches including binding site predictions found improved docking quality (in terms of native contacts and RMSDlig) for 36 complexes. However, for 14 cases, the inclusion of weights based on binding site predictions decreased the quality of the docking runs compared with the reference searches. Overall, a clear but modest improvement compared with the unbiased docking runs was found in case of including weights on predicted binding regions. Figure 6 illustrates successful predictions and docking results for one example (pdb2SIC). In this case, with an accurate binding site prediction (Figure 6a), a dramatic improvement of the docking performance can be observed (Figures 6b–6e). However, because of the limited accuracy of binding site predictions, the data set also contained many examples where the prediction was completely wrong (illustrated in Figure 6f for pdb1QA9), which resulted in weighting predicted regions with no overlap with the native interface at all. As a result, the scoring of incorrect complexes was enhanced compared with solutions close to the native complex. For the example pdb1QA9 (rigid case), the reference run predicted an acceptable solution with an RMSDlig of 4.2 Å at rank 70 and 14 clusters of acceptable structures in total. The weighted run led to a dramatic deterioration of the scoring. The best ranked acceptable structure was found on rank 660 with an RMSDlig of 4.1 Å (more clusters of acceptable structures were sampled, but all scored on ranks worse than 700).

For the successful example, pdb2SIC (rigid case), the reference run predicted an acceptable solution with an RMSDlig of 3.4 Å at rank 178. The inclusion of weights for predicted binding regions during docking resulted in a solution with an RMSDlig of 2.7 Å at rank 4. In the search with weights, 14 acceptable clusters (38 independent structures) were found, whereas there had been only 5 (29 structures) in the reference run. In this case, several structures sampled in the searches with weights were not sampled in the reference run. For the complete benchmark, in 45 cases, the inclusion of weights based on meta-PPISP predictions led to a better ranking of the best ranked acceptable solution. It stayed the same in five (already rank 1) cases and caused

Figure 3. Specificity of single meta-PPISP (Qin & Zhou, 2007) prediction values within a bin size of 0.05. The red line indicates a specificity of 50%. The green line represents the percentage of binding site residues among all surface residues. Values above the green line are better than random values because the fraction of binding site residues among all surface residues is 15% [10% for receptor (SD = 4%) and 20% for ligand (SD = 9%)] for the complexes in the test set.

Figure 4. Effect of including weights on residues predicted to be part of protein–protein interfaces using the meta-PPISP approach (Qin & Zhou, 2007). (A) All interactions of pseudo-atoms with a meta-PPISP prediction value above the threshold of 0.34 (recommended significance threshold of the meta-PPISP server) were uniformly assigned a weight of 1.5 or 2.0, respectively. (B) The weight on predicted interface atoms was scaled linearly according to the prediction of the meta-PPISP server using w = 0.5 × meta-PPISP score + 1.0 with a maximum weight of 1.5 (or 2.0 with w = 1.0 × meta-PPISP score + 1.0), respectively.



20

a loss of ranking in 26 cases (rest had no acceptable solution inone or both of the searches), leading to an average rank in thereference run of 647 for the best acceptable solution and rank90 for the searches with weights. In addition, 16 complexeshad a lower RMSDlig and/or larger fraction of native contacts inthe searches with weights compared with the reference run,

whereas the opposite was observed in 13 cases. The searcheswith weights also provided for 57 complexes more acceptablesolutions and for 20 less than the reference search.
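The two weighting schemes described in the caption of Figure 4 can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of ATTRACT. Two details are assumptions of this sketch: a residue counts as predicted when its score is at or above the threshold of 0.34, and in the linear scheme residues below the threshold keep the neutral weight of 1.0.

```python
def uniform_weight(score, threshold=0.34, weight=1.5):
    """Scheme (A): one fixed weight for every predicted interface residue."""
    # Assumption: "predicted" means score >= threshold.
    return weight if score >= threshold else 1.0


def linear_weight(score, slope=0.5, threshold=0.34):
    """Scheme (B): w = slope * score + 1.0, capped at 1.0 + slope."""
    # Assumption: residues below the threshold keep the neutral weight 1.0.
    if score < threshold:
        return 1.0
    return min(slope * score + 1.0, 1.0 + slope)


if __name__ == "__main__":
    example_scores = [0.10, 0.34, 0.60, 1.00]  # made-up prediction scores
    for s in example_scores:
        print(s, uniform_weight(s), linear_weight(s), linear_weight(s, slope=1.0))
```

With a slope of 0.5 the largest possible weight is 1.5, and with a slope of 1.0 it is 2.0, which matches the maximum weights given in the caption for prediction scores between 0 and 1.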

The ATTRACT programme allows the minimisation of docking partners not only in translational and orientational degrees of freedom but also in a set of normal modes of the binding partners (or other collective degrees of freedom; May & Zacharias, 2005, 2008) on the basis of an elastic network model of proteins. The minimisation in normal modes can account approximately for induced-fit effects during the protein–protein binding process. The effect of including conformational relaxation of both partners in the five softest normal modes during systematic docking was also tested (Figure 7). For some docking cases, a significant improvement of the results could be obtained (e.g. see Figure 8). However, in several other cases, the inclusion of normal mode minimisation during docking also benefited incorrect docking solutions. Hence, it also improved the induced fit and scoring of nonnative solutions, such that overall the docking performance in terms of acceptable solutions improved only modestly for unbiased docking or docking including binding site predictions (on average, 25.7 acceptable structures per complex for rigid unbiased docking and 28.2 for unbiased docking including minimisation in the five softest modes of both partners during docking).
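The idea of treating normal modes as additional degrees of freedom can be sketched as follows. This is only an illustration with hypothetical function names, not the ATTRACT implementation: each partner is deformed by adding precalculated mode displacement vectors scaled by one amplitude per mode. The harmonic penalty is the deformation energy within a generic elastic network model, and its use here is an assumption.

```python
def deform(coords, modes, amplitudes):
    """Displace pseudo-atom coordinates along precalculated mode vectors.

    coords:     list of (x, y, z) tuples for N pseudo-atoms
    modes:      list of modes, each a list of N (dx, dy, dz) tuples
    amplitudes: one amplitude per mode (the extra minimisation variables)
    """
    new_coords = []
    for i, (x, y, z) in enumerate(coords):
        for mode, a in zip(modes, amplitudes):
            dx, dy, dz = mode[i]
            x, y, z = x + a * dx, y + a * dy, z + a * dz
        new_coords.append((x, y, z))
    return new_coords


def deformation_energy(eigenvalues, amplitudes):
    """Harmonic penalty 0.5 * sum(lambda_k * a_k**2) (assumed form)."""
    return 0.5 * sum(lam * a * a for lam, a in zip(eigenvalues, amplitudes))


if __name__ == "__main__":
    # Toy example: two pseudo-atoms and a single "breathing" mode.
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    modes = [[(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]]
    print(deform(coords, modes, [0.5]))
    print(deformation_energy([2.0], [0.5]))
```

In a docking minimisation of this kind, the mode amplitudes would be optimised together with the six rigid-body variables, so with five modes per partner the search adds ten variables per complex.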

In case of known binding sites, the percentage of acceptable solutions in the top 10 category increased from approximately 84% (reference) to 91% when normal mode relaxation was included, with an increased number of top 1 ranks and a slightly better ranking of all best ranked acceptable solutions (average rank, 4.7 in the rigid and 3.2 in the flexible case). Also, several solutions with an RMSDlig of >10 Å in the rigid case changed to solutions with a lower RMSDlig, so that the number of acceptable structures increased and the average RMSDlig decreased (the average number of sampled acceptable structures rose from 71.3 to 77.5 per complex). For example, using normal mode minimisation, it was possible to generate an acceptable structure for complex pdb1BGX (not possible in searches with rigid structures; illustrated in Figure 8).

Figure 5. Comparison of the quality of acceptable docking solutions (given as number of clusters representing acceptable docking solutions) after systematic docking on all test cases (in terms of native interface contacts). Native contact calculations included all interprotein coarse-grained atom contacts within a distance of <7 Å. This corresponds closely to the <5 Å criterion used at atomic resolution. Only data for the inclusion of meta-PPISP-based predicted interface residues and for the reference searches have been included.

Figure 6. (A) Meta-PPISP-based predicted interface residues (red) on the surface of the protein partners forming the complex pdb2SIC (noninterface residues in blue). The receptor protein is shown as molecular surface (ligand as cartoon model). (B) Docking sampling (mapping contacting residues in docked complexes) of the top 1000 solutions including weights on meta-PPISP-predicted interface residues (shown in panel A). Grey indicates no sampling at all, and red indicates dense sampling. (C) Top 100 ligand protein placements (cartoon) from the same docking run (receptor as blue molecular surface; native ligand placement in cyan). (D) Score of docked complexes versus deviation (only ligand) from the known placement of the ligand protein relative to the fixed receptor protein during an unbiased systematic docking run. (E) Same as in panel D but for a systematic docking run with weights of 1.5 on all meta-PPISP-predicted interface residues. (F) Meta-PPISP-predicted interface residues mapped on the surface of the binding partners of the complex pdb1QA9 (same colour coding as in panel B, receptor shown as cartoon and ligand in van der Waals sphere representation).
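The fraction of native contacts used to judge solution quality can be illustrated with a short Python sketch based on the contact definition in the caption of Figure 5 (coarse-grained atom pairs closer than 7 Å). The function names are hypothetical, and counting contacts per atom pair rather than per residue pair is an assumption of this sketch.

```python
import math


def contacts(receptor, ligand, cutoff=7.0):
    """Return the set of (i, j) index pairs closer than cutoff (in Å)."""
    pairs = set()
    for i, r in enumerate(receptor):
        for j, l in enumerate(ligand):
            if math.dist(r, l) < cutoff:
                pairs.add((i, j))
    return pairs


def fraction_native_contacts(receptor, native_ligand, docked_ligand, cutoff=7.0):
    """Fraction of native receptor-ligand contacts recovered in a docked pose."""
    native = contacts(receptor, native_ligand, cutoff)
    if not native:
        return 0.0
    docked = contacts(receptor, docked_ligand, cutoff)
    return len(native & docked) / len(native)


if __name__ == "__main__":
    # Toy coordinates (Å); the receptor is kept fixed, only the ligand moves.
    receptor = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    native_ligand = [(5.0, 0.0, 0.0), (25.0, 0.0, 0.0)]
    docked_ligand = [(5.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
    print(fraction_native_contacts(receptor, native_ligand, docked_ligand))
```

The double loop scales with the product of the two atom counts, which is affordable for coarse-grained models; a cell list or k-d tree would be the usual choice at atomic resolution.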

Figure 7. Systematic docking search including minimisation in five precalculated soft normal modes of both binding partners (indicated as + flex). Performance was evaluated including elastic network-derived normal modes (May & Zacharias, 2008) and for rigid docking on unbound partner proteins using otherwise identical docking search conditions. Added force field weights are indicated as known 2.0 (weights of 2.0 on known interface atoms) and prediction 2.0 (weights of 2.0 on residues predicted to be at the interface using the meta-PPISP server).

Figure 8. Docking of the unbound partner structures of complex pdb1BGX (light blue, receptor; dark blue, ligand). On the left in darker red, the docking solution closest to the native complex (12.6 Å, not acceptable) from a systematic ATTRACT docking. On the right, the same docking search approach but including minimisation in the five softest normal modes of both partners during docking. Darker red shows the solution with the lowest RMSDlig from the native complex (RMSDlig = 9.6 Å, acceptable solution), and the light red cartoon shows the deformed receptor structure.

CONCLUSIONS

One possibility to improve computational methods for the prediction of protein–protein binding geometries is to include experimental data. A simple restriction of the search to a predicted binding region or inclusion as restraints during the search can have the drawback that in case of incorrect data, the docking run is limited or guided to the incorrect binding region. The inclusion of experimental or prediction data in the form of force field weights on predicted interface residues creates a bias towards the predicted binding region without excluding other possible binding regions and can be used even if data are available for only one partner protein. Inclusion during the docking itself may also enhance the number of sampled docking solutions near a predicted binding region. Indeed, it was found that the inclusion of weights during the docking search gives an overall better docking performance than rescoring of the solutions found in unbiased searches. Because the inclusion during docking comes at no additional computational cost in our approach, we recommend using this option during the initial search step. Systematic evaluation of the approach in combination with the ATTRACT docking method indicates that it can indeed significantly improve docking performance if reliable data on putative binding regions are available. Even single identified interface residues can significantly improve docking results. However, including prediction data on putative protein binding sites gave only modest improvement of the docking performance compared with the unbiased reference search. This is due to the limited precision of such predictions and fully consistent with test cases on designed weighted artificial binding sites that included on average only part of the known binding site. Although the approach was tested with only one prediction server, the result is expected to be similar with other prediction methods because the performance of the best available methods differs only by a few percent in terms of prediction accuracy (de Vries & Bonvin, 2008). The inclusion of binding site predictions as ambiguous restraints was recently tested on a similar benchmark (de Vries & Bonvin, 2011). Acceptable solutions were found in the top 100 docking solutions for approximately 41% of the cases (in case of including predictions) and approximately 15% for unbiased (ab initio) docking, which compares to 77% (including predictions) and 65% (unbiased docking) in the present study. Hence, for the ATTRACT method, the scoring without additional data already performs quite well, but the gain upon inclusion of prediction data is smaller compared with the study using predicted data as ambiguous restraints.

It should be emphasised that even a modest improvement of the docking performance and a shift of some of the docking results from the top 10–100 category to the top 10 category can be very helpful. In the present study, the focus was on the first systematic search without the inclusion of any refinement or rescoring of the docking predictions. Typically, a limited set of docking solutions will enter a refinement stage at atomic resolution using molecular mechanics modelling methods and possible rescoring of the solutions. This can give a further gain in docking prediction accuracy and specificity. A restriction of the refinement step to a small set of putative docking solutions is an important prerequisite for the success of the refinement and rescoring of predicted complexes.
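The top 10 and top 100 percentages discussed above can be computed from ranked docking results as in the following Python sketch. The function names and the toy data are hypothetical. Treating a solution as acceptable when its ligand RMSD is below 10 Å is an assumption taken from the examples in this study (9.6 Å acceptable, 12.6 Å not acceptable); the full acceptance criteria may also involve native contacts.

```python
def best_acceptable_rank(rmsd_by_rank, cutoff=10.0):
    """Rank (1-based) of the first acceptable solution, or None if there is none.

    rmsd_by_rank: ligand RMSD values (Å) ordered by score, best score first.
    """
    for rank, rmsd in enumerate(rmsd_by_rank, start=1):
        if rmsd < cutoff:  # assumed acceptance criterion
            return rank
    return None


def success_rate(benchmark, top_n, cutoff=10.0):
    """Percentage of complexes with an acceptable solution within top_n ranks."""
    if not benchmark:
        return 0.0
    hits = 0
    for rmsd_by_rank in benchmark.values():
        rank = best_acceptable_rank(rmsd_by_rank, cutoff)
        if rank is not None and rank <= top_n:
            hits += 1
    return 100.0 * hits / len(benchmark)


if __name__ == "__main__":
    # Made-up ranked RMSD lists for three hypothetical complexes.
    toy_benchmark = {
        "complexA": [15.0, 9.6, 3.0],
        "complexB": [12.6, 20.0],
        "complexC": [2.7],
    }
    for n in (1, 10, 100):
        print(n, success_rate(toy_benchmark, n))
```

Reporting the rate for several values of top_n shows whether a bias mainly moves solutions from the top 10–100 range into the top 10, which is the shift described in the paragraph above.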

Acknowledgements

The authors thank Drs C. Beier, S. de Vries and P. Setny for helpful discussions. They also thank the Deutsche Forschungsgemeinschaft for financial support (grant Za-153/5).

REFERENCES

Ben-Zeev E, Eisenstein M. 2003. Weighted geometric docking: incorporating external information in the rotation-translation scan. Proteins 52: 24–27.

Chen H, Zhou HX. 2005. Prediction of interface residues in protein–protein complexes by a consensus neural network method: Test against NMR data. Proteins 61: 21–35.

Cheng TM, Blundell TL, Fernandez-Recio J. 2007. PyDock: electrostatics and desolvation for effective scoring of rigid-body protein–protein docking. Proteins 68: 503–515.

Dominguez C, Boelens R, Bonvin AMJJ. 2003. HADDOCK: A protein–protein docking approach based on biochemical and biophysical information. J. Am. Chem. Soc. 125: 1731–1738.

Fiorucci S, Zacharias M. 2010. Binding site prediction and improved scoring during flexible protein–protein docking with ATTRACT. Proteins 78: 3131–3139.

Gottschalk KE, Neuvirth H, Schreiber G. 2004. A novel method for scoring of docked protein complexes using predicted protein–protein interaction sites. Protein Eng. 17: 183–189.

Huang B, Schröder M. 2008. Using protein binding site prediction to improve docking. Gene 422: 14–21.

Kowalsman N, Eisenstein M. 2009. Combining interface core and whole interface descriptors in postscan processing of protein–protein docking models. Proteins 77: 297–318.

Lensink MF, Mendez R, Wodak SJ. 2007. Docking and scoring protein complexes: CAPRI 3rd Edition. Proteins 69: 704–718.

Liang S, Meroueh SO, Wang G, Qiu C, Zhou Y. 2009. Consensus scoring for enriching near-native structures from protein–protein docking decoys. Proteins 75: 397–403.

Liang S, Zhang C, Liu S, Zhou Y. 2006. Protein binding site prediction using an empirical scoring function. Nucleic Acids Res. 34: 3698–3707.

May A, Zacharias M. 2005. Accounting for global protein deformability during protein–protein and protein–ligand docking. Biochim. Biophys. Acta 1754: 225–231.

May A, Zacharias M. 2008. Energy minimization in low-frequency normal modes to efficiently allow for global flexibility during systematic protein–protein docking. Proteins 70: 794–809.

Melquiond ASJ, Bonvin AMJJ. 2010. Data-driven Docking: Using External Information to Spark the biomolecular Rendez-vous. In Protein–protein complexes: Analysis, Modeling and Drug Design, Zacharias M (ed). Imperial College Press: London; 182–208.

Mintseris J, Wiehe K, Pierce B, Anderson R, Chen R, Janin J, Weng Z. 2005. Protein–protein docking benchmark 2.0: An update. Proteins 60: 214–216.

Moreira IS, Fernandes PA, Ramos MJ. 2009. Protein–Protein docking dealing with the unknown. J. Comput. Chem. 31: 210–222.

Neuvirth H, Raz R, Schreiber G. 2004. ProMate: A Structure Based Prediction Program to Identify the Location of Protein–Protein Binding Sites. J. Mol. Biol. 338: 181–199.

Pierce B, Weng Z. 2007. ZRANK: Reranking protein docking predictions with an optimized energy function. Proteins 67: 1078–1086.

Pierce B, Weng Z. 2008. A combination of rescoring and refinement significantly improves protein docking performance. Proteins 72: 270–279.

Pons C, Grosdidier S, Solernou A, Perez-Cano L, Fernandez-Recio J. 2009. Present and future challenges and limitations in protein–protein docking. Proteins 78: 95–108.

Qin S, Zhou HX. 2007. Meta-PPISP: a meta web server for protein–protein interaction site prediction. Bioinformatics 23: 3386–3387.

de Vries SJ, Bonvin AMJJ. 2008. How proteins get in touch: interface prediction in the study of biomolecular complexes. Curr. Prot. Pept. Sci. 9: 394–406.

de Vries SJ, Bonvin AMJJ. 2011. CPORT: A Consensus Interface Predictor and Its Performance in Prediction-Driven Docking with HADDOCK. PLoS One 6: e17695.

de Vries S, Melquiond AS, Kastritis PL, Karaca E, Bordogna A, van Dijk M, Rodrigues JP, Bonvin AMJJ. 2010. Strengths and weaknesses of data-driven docking in critical assessment of prediction of interactions. Proteins 78: 3242–3249.

de Vries SJ, van Dijk A, Bonvin AMJJ. 2006. WHISCY: what information does surface conservation yield? Application of data-driven docking. Proteins 63: 479–489.

Zacharias M. 2003. Protein–protein docking using a reduced model. Protein Sci. 12: 1271–1282.

Zacharias M. 2010a. Accounting for conformational changes during protein–protein docking. Curr. Opin. Struct. Biol. 16: 194–200.

Zacharias M. 2010b. Scoring and refinement of predicted protein–protein complexes. In Protein–Protein Complexes: Analysis, Modeling and Drug Design, Zacharias M (ed.). Imperial College Press: London; 236–271.

Zhang C, Liu S, Zhou Y. 2005. Docking prediction using biological information, ZDOCK sampling technique, and clustering guided by the DFIRE statistical energy function. Proteins 60: 314–318.

Zhou HX, Qin S. 2007. Interaction-site prediction for protein complexes: a critical assessment. Bioinformatics 23: 2203–2209.


