Molecular modeling, Interactions in Biological Systems II.
INTERREG IIIA Community Initiative Program
Szegedi Tudományegyetem Prirodno-matematički fakultet, Univerzitet u Novom Sadu
„Computer-aided Modelling and Simulation in Natural Sciences“
University of Szeged,
Project No. HUSER0602/066
Molecular modeling, Interactions in Biological
Systems II.
Balázs Jójárt
Content:
I. Theoretical background
II. Practical guide for docking small ligands to the enzyme active site
Appendix A. – Tutorial files
I. Theoretical background
I.1. Introduction
As biological research has become increasingly data intensive, biomedical projects require
informatics tools.
In drug discovery research, high-throughput screening often requires the screening of
millions of compounds for a particular protein target. Important tools that can enhance such
screens are molecular docking and database mining.
Molecular docking can be defined as the prediction of the structure of receptor-ligand
complexes, where the receptor is usually a protein or a protein oligomer and the ligand is
either a small molecule or another protein.
There are two key parts to any docking program, namely a search of the configurational and
conformational degrees of freedom and the scoring or evaluation function. The search
algorithm must search the potential energy landscape in enough detail to find the global
energy minimum. In rigid docking this means that the search algorithm explores different
positions for the ligand in the receptor active site using the translational and rotational degrees
of freedom. Flexible ligand docking adds exploration of torsional degrees of freedom of the
ligand to this process.
The most cited and used programs are summarized in Table 1.

Software    Algorithm                                               References
DOCK        Geometric alignment, incremental ligand building        [1]
FlexX       Geometric alignment, incremental ligand building        [2, 3]
SLIDE       Geometric alignment, multiconformer ligand dictionary   [4]
AutoDock    Genetic algorithm                                       [5]
ICM         Monte Carlo minimization                                [6]
QXP         Monte Carlo minimization                                [7]
MDD         Molecular dynamics                                      [8]
Glide       Systematic search                                       [9]
GOLD        Genetic algorithm                                       [10]
PRO_LEADS   Tabu search                                             [11]
MOE-Dock    Tabu search / simulated annealing                       [12]
FRED        Systematic search, multiconformer ligand dictionary     [13, 14]
FLOG        Systematic search                                       [15]

Table 1 The most cited and used docking programs.
The success of a docking algorithm in predicting a ligand binding pose is normally measured
in terms of the root-mean-square deviation (RMSD) between the experimentally observed
heavy-atom positions of the ligands and the one(s) predicted by the algorithm.
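The RMSD criterion described above can be illustrated with a short Python sketch. It assumes that both poses list the same atoms in the same order; the function name `heavy_atom_rmsd` and the toy coordinates are hypothetical and are not taken from any docking program.

```python
import math

def heavy_atom_rmsd(reference, predicted):
    """RMSD between two equally ordered lists of (element, x, y, z) atoms.

    Hydrogen atoms are skipped, so only heavy atoms contribute.
    """
    if len(reference) != len(predicted):
        raise ValueError("both poses must list the same atoms in the same order")
    squared = []
    for (elem_r, *xyz_r), (elem_p, *xyz_p) in zip(reference, predicted):
        if elem_r != elem_p:
            raise ValueError("atom order differs between the two poses")
        if elem_r.upper() == "H":
            continue  # ignore hydrogens
        squared.append(sum((a - b) ** 2 for a, b in zip(xyz_r, xyz_p)))
    if not squared:
        raise ValueError("no heavy atoms to compare")
    return math.sqrt(sum(squared) / len(squared))

# Toy coordinates (hypothetical): both heavy atoms are shifted by 1 along z.
crystal = [("C", 0.0, 0.0, 0.0), ("O", 1.0, 0.0, 0.0), ("H", 2.0, 0.0, 0.0)]
docked = [("C", 0.0, 0.0, 1.0), ("O", 1.0, 0.0, 1.0), ("H", 9.0, 9.0, 9.0)]
print(heavy_atom_rmsd(crystal, docked))
```

The hydrogen atom is displaced strongly in the toy data but does not affect the result, because only heavy atoms are compared.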
AutoDock uses the so-called Lamarckian genetic algorithm to predict the binding modes of
ligands in proteins and nucleic acids. The genetic algorithm borrows the language of
evolution: the genes are the state variables (translation, orientation and conformation),
and the atomic coordinates (3D structure) correspond to the phenotype.
The steps of a docking calculation are as follows. [The abbreviations in brackets refer
to the Docking Parameter File (dpf) and the Grid Parameter File (gpf).]
1. The rapid energy evaluation is achieved by precalculated atomic affinity potentials for
each atom type in the substrate molecule. In the AutoGrid procedure the protein is
embedded in a three-dimensional grid and a probe atom is placed at each grid point.
The energy of interaction of this single atom with the protein is assigned to the grid
point. An affinity grid is calculated for each type of atom in the substrate, typically
carbon, oxygen, nitrogen and hydrogen, as well as a grid of electrostatic potential,
using a point charge of +1 as the probe. The time to perform an energy
calculation using the grids is proportional only to the number of atoms in the substrate,
and is independent of the number of atoms in the protein.
2. Generating a random population, where the ga_pop_size (dpf) determines the number
of the individuals. The initial population can be visualized only if the output level is
set to 4 in the dpf.
3. Assigning random values to each gene:
a. 3 values for the translational genes (x, y, z), which determine the position of
the ligand in the binding cavity. The binding cavity is defined by setting up
the grid box; the x, y, z values lie between the minimum and maximum
extents of the grid box (gpf).
b. 4 values for the orientation gene of the ligand in the binding cavity.
c. N values for the torsional gene, where N is the number of the rotatable bonds.
4. Translation of the genotype to the phenotype (x, y, z coordinates of the ligand) by
means of MAPPING.
5. In the next step the fitness of each individual is determined. It is the sum of
the intermolecular interaction energy between the target molecule and the ligand and
the intramolecular energy of the ligand. (Each time the fitness is evaluated, the
count of energy evaluations increases; ga_num_evals in the dpf limits this count.)
6. SELECTION: In this step the program decides which individuals will reproduce.
Individuals with better-than-average fitness receive proportionally more
offspring.
7. CROSSOVER and MUTATION:
a. Two-point crossover: ABC × abc → AbC and aBc; the parents are replaced
by these individuals.
b. Mutation: a random real number is added to the real variable (gene): ABC → Abc.
c. The crossover and mutation rates are controlled via the ga_crossover_rate
and ga_mutation_rate keywords in the dpf.
8. ELITISM: determines how many of the top individuals survive into the next
generation.
9. AutoDock also has a local search implementation: the proportion of the population set
by the ls_search_freq parameter undergoes local search.
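The steps above can be sketched as a toy Lamarckian genetic algorithm in Python. This is a minimal illustration under stated assumptions, not AutoDock's implementation: the energy function `toy_energy`, the Gaussian mutation, the selection weights and all function names are hypothetical, and the arguments only loosely mirror ga_pop_size, ga_crossover_rate, ga_mutation_rate and ls_search_freq. Twelve genes are used to mimic 3 translational, 4 orientational and 5 torsional values.

```python
import random

def toy_energy(genes):
    """Stand-in for the docking energy: lower is better, minimum at all zeros."""
    return sum(g * g for g in genes)

def two_point_crossover(a, b, rng):
    """ABC x abc -> AbC and aBc."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(genes, rate, rng):
    """Add a random real number to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, 0.5) if rng.random() < rate else g for g in genes]

def local_search(genes, rng, steps=20, step_size=0.1):
    """Random hill climbing; the improved genes replace the original ones,
    which is the 'Lamarckian' part of the scheme."""
    best, best_e = genes, toy_energy(genes)
    for _ in range(steps):
        trial = [g + rng.uniform(-step_size, step_size) for g in best]
        e = toy_energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best

def run_ga(n_genes=12, pop_size=50, generations=40, crossover_rate=0.8,
           mutation_rate=0.02, elitism=1, ls_freq=0.06, seed=0):
    rng = random.Random(seed)
    # Random initial population
    pop = [[rng.uniform(-5, 5) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=toy_energy)
        elite = pop[:elitism]  # elitism: the top individuals survive unchanged
        # Selection: individuals much better than the worst get more offspring
        worst = toy_energy(pop[-1])
        weights = [worst - toy_energy(p) + 1e-9 for p in pop]
        children = []
        while len(children) < pop_size - elitism:
            a, b = rng.choices(pop, weights=weights, k=2)
            if rng.random() < crossover_rate:
                a, b = two_point_crossover(a, b, rng)
            children += [mutate(a, mutation_rate, rng), mutate(b, mutation_rate, rng)]
        children = children[:pop_size - elitism]
        # A fraction of the offspring undergoes local search
        children = [local_search(c, rng) if rng.random() < ls_freq else c
                    for c in children]
        pop = elite + children
    return min(pop, key=toy_energy)

best = run_ga()
print(toy_energy(best))
```

Because the elite individual is copied unchanged into each new generation, the best energy found in this sketch can never get worse from one generation to the next.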
The scoring function has to be realistic enough to assign the most favourable scores to the
experimentally determined complex. Estimating binding free energies accurately is a time-
consuming process. State-of-the-art efforts are represented by the free energy
perturbation/thermodynamic integration methodology. Although the MM-PBSA and explicit
solvent/implicit solvent (ES/IS) methods can achieve similar accuracy at a smaller
computational cost, these methodologies cannot currently be used in screening large numbers
of ligands against a protein target.
In AutoDock 3 the authors applied a molecular mechanics approach to evaluate
enthalpic contributions such as dispersion/repulsion and hydrogen bonding, and an empirical
approach to evaluate the entropic contribution of changes in solvation and conformational
mobility. Empirical weights were applied to each of the components based on calibration
against a set of known binding constants (30 protein-ligand complexes). The final
semiempirical force field is designed to yield an estimate of the binding constant. In the
following version, AutoDock 4, the semiempirical scoring function was calibrated on 188
complexes and tested on 100 complexes. In this scoring function a new thermodynamic model
was applied describing the binding process, and a full desolvation term was included. The free
energy of binding is estimated to be equal to the difference between (1) the energy of the
ligand and the protein in a separated unbound state and (2) the energy of the ligand–protein
complex.
One may ask why AutoDock was chosen for detailed description and study. The first answer
is that it is free (for academic users). The second, and more important, answer is that it
can be applied in several fields of computer-aided drug design: (1) investigation of
various receptor-ligand interactions (saccharides [16], cytochromes [17], 3D-QSAR
(structure-based alignment) [18], alcohol dehydrogenase [19], CADD [20], HIV [21]);
(2) ‘blind docking’ [22,23]; (3) virtual screening [24].
II. Practical guide for docking small ligands to the enzyme
active site
During the practice session we are going to study the interaction between the cyclooxygenase
(COX) enzyme and a selective inhibitor, SC-558 (Figure 1, PDB ID: 6COX).
Figure 1 The structure of SC-558.
II.1. Obtaining the enzyme – ligand complex structure &
visualization using Visual Molecular Dynamics
Open a browser and go to www.rcsb.org, the homepage of the largest database of protein
and nucleic acid structures and their complexes (Figure 2).
Figure 2 The homepage of the largest 3D structure database.
What kind of information can we obtain from this page (Figure 3)?
Figure 3 Selected and important information about the molecule (entry 6COX).
Click the download button to save the file.
On the console, go to the 6COX directory and type vmd; the following windows appear (Figure 4).
[Figure 3 annotations: TM name, ligand name; important: a resolution of 0.00 indicates an NMR structure; view the ASCII file.]
Figure 4 The graphical user interface (GUI) of VMD.
In the VMD Main window click File/New Molecule…; in the window that appears, type
6COX in the filename box and press the Load button. VMD downloads the appropriate file.
Open VMD Main/Graphics/Representation. In the Graphical Representation window click the
‘Create Representation’ button.
Click on the first representation and write the following in the Selected Atoms box:
protein and chain A, in the Coloring Method choose Structure and in the Drawing Method
NewCartoon.
Click on the second representation and write the following in the Selected Atoms:
resname S58 and chain A, in the Coloring Method choose Name and in the Drawing
Method Licorice.
Change the background color as follows: main window: Graphics/Colors/Display (in
Categories)/Background (in Names)/8 white (in colors).
[Figure 4 labels: main window; command line interface.]
We can save our work as follows: VMD Main/File/Save State → 6cox.vmd (you can load
your work via VMD Main/File/Load State).
If you would like to make a high-quality picture, do the following (Figure 5):
VMD Main/File/Render, choose the Tachyon program in the ‘Render using’ list, and press
the Start Rendering button.
Figure 5 The 6COX enzyme in NewCartoon representation in complex with SC-558 in Licorice representation.
We can save the coordinates of the protein and of the ligand separately:
VMD Main/Save coordinates/Selected atoms: protein and chain A; filetype: pdb; press
the Save button → 6COX_prot.pdb
VMD Main/Save coordinates/Selected atoms: resname S58 and chain A; filetype: pdb;
press the Save button → 6COX_ligand.pdb.
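The same separation of protein and ligand can be sketched without VMD by reading the fixed columns of the PDB format in Python. This is only an approximation of the VMD selections: here "protein" simply means ATOM records of the chosen chain, and the function name `split_complex`, the example atoms and their residue numbers are hypothetical.

```python
def split_complex(pdb_text, chain="A", ligand_resname="S58"):
    """Return (protein_lines, ligand_lines) for one chain of a PDB file.

    PDB columns used: record name 1-6, residue name 18-20, chain ID 22.
    """
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM") or len(line) < 22:
            continue
        if line[21] != chain:
            continue
        if record == "ATOM":
            protein.append(line)          # approximation of "protein and chain A"
        elif line[17:20].strip() == ligand_resname:
            ligand.append(line)           # "resname S58 and chain A"
    return protein, ligand

# Hypothetical example records (coordinates are made up).
example = "\n".join([
    "ATOM      1  N   ALA A  33      25.000  20.000  30.000  1.00 20.00           N",
    "ATOM   5000  N   ALA B  33      45.000  20.000  30.000  1.00 20.00           N",
    "HETATM 9001  C1  S58 A 701      24.000  21.000  16.000  1.00 20.00           C",
    "HETATM 9100  O   HOH A 801      10.000  10.000  10.000  1.00 20.00           O",
])
protein_lines, ligand_lines = split_complex(example)
print(len(protein_lines), len(ligand_lines))
```

Water molecules and atoms of other chains are dropped, which corresponds to the two selections used above.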
II.2. Preparing the input files for docking calculation
The minimized structures (prepared with AMBER9) are provided. In this section we
focus on preparing the input files with the GUI of AutoDockTools1.5 (Figure 6);
we also show the python scripts.
Figure 6 The GUI of AutoDockTools1.5.
Launch the AutoDockTools by clicking on the appropriate icon.
II.2.1. Ligand file preparation (6COX_lig_min.pdb)
PMV/Ligand/Input/Open and choose
6COX_lig_min.pdb.
A summary window appears with important
information:
• charges were assigned;
• non-polar hydrogens were merged
• the program found 15 aromatic hydrogens
• 5 rotatable bonds were detected → the torsional degrees of freedom were set to 5.
PMV/Ligand/Torsion Tree/Detect Root
• the root atom will be depicted with a
green sphere
PMV/Ligand/Torsion Tree/Choose Torsions
Change the ligand representation to lines, otherwise you cannot see the
rotatable bonds!
• Green bonds are flexible during the calculation.
• Red bonds are rigid during the calculation.
PMV/Ligand/Output/Save as PDBQT → 6COX_lig_min.pdbqt
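The torsion information written to the ligand file can be checked with a few lines of Python. The sketch below assumes the usual PDBQT records (ROOT, BRANCH, ENDBRANCH, TORSDOF); the function name `torsion_summary` and the abbreviated file fragment are hypothetical and are not the real 6COX_lig_min.pdbqt.

```python
def torsion_summary(pdbqt_text):
    """Count BRANCH records and read the TORSDOF value of a ligand PDBQT file."""
    branches = 0
    torsdof = None
    for line in pdbqt_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "BRANCH":
            branches += 1                # one BRANCH per active rotatable bond
        elif fields[0] == "TORSDOF":
            torsdof = int(fields[1])     # torsional degrees of freedom
    return branches, torsdof

# Abbreviated, made-up fragment with a single rotatable bond.
fragment = """ROOT
ATOM      1  C1  LIG A   1       0.000   0.000   0.000  0.00  0.00     0.000 C
ENDROOT
BRANCH   1   2
ATOM      2  C2  LIG A   1       1.500   0.000   0.000  0.00  0.00     0.000 C
ENDBRANCH   1   2
TORSDOF 1
"""
print(torsion_summary(fragment))
```

For the ligand prepared above, one would expect both numbers to equal 5 if all detected rotatable bonds were left active.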
II.2.2. Receptor file and GRID parameter file preparation
(6COX_prot_min.pdbqt, 6COX.gpf)
PMV/Grid/Macromolecule/Open and choose 6COX_prot_min.pdb.
Save the molecule as 6COX_prot_min.pdbqt.
The summary window appears with relevant information:
• how many non-polar hydrogens were found
• non-polar hydrogens were merged.
Press the ‘N’ button in order to normalize the location of the molecules.
In the next steps the necessary parameters for grid parameter files are set up.
PMV/Grid/Set Map Types/Choose Ligand… and choose 6COX_lig_min.pdbqt.
PMV/Grid/Grid Box… and then Grid Options/Center/Center On Ligand.
Grid Options/File/Close saving current: this action saves our work.
We can save the gpf:
PMV/Grid/Output/Save GPF… → 6COX.gpf
The size of the grid box is then increased by editing the output gpf file in a simple
text editor; change the line:
npts 40 40 40 # num.grid points in xyz
to
npts 62 46 60 # num.grid points in xyz
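The same edit can be scripted. The Python sketch below replaces the npts line of a gpf text and estimates the edge lengths of the box as npts × spacing. The helper names `set_npts` and `box_edge_lengths` are hypothetical, and the spacing of 0.375 Å is an assumption; use the value given on the spacing line of your own gpf.

```python
def set_npts(gpf_text, nx, ny, nz):
    """Return gpf_text with the npts line replaced; other lines are untouched."""
    out, replaced = [], False
    for line in gpf_text.splitlines():
        if line.split("#")[0].split()[:1] == ["npts"]:
            comment = line[line.index("#"):] if "#" in line else ""
            line = f"npts {nx} {ny} {nz} {comment}".rstrip()
            replaced = True
        out.append(line)
    if not replaced:
        raise ValueError("no npts line found")
    return "\n".join(out) + "\n"

def box_edge_lengths(npts, spacing):
    """Edge lengths of the grid box in the units of `spacing`."""
    return tuple(n * spacing for n in npts)

# Made-up two-line gpf fragment; the spacing value is an assumption.
gpf = "npts 40 40 40 # num.grid points in xyz\nspacing 0.375 # spacing(A)\n"
new_gpf = set_npts(gpf, 62, 46, 60)
print(new_gpf)
print(box_edge_lengths((62, 46, 60), 0.375))
```

With the assumed spacing, the enlarged box measures 23.25 × 17.25 × 22.5 Å.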
II.2.3. DOCKING parameter file preparation (6COX.dpf)
PMV/Docking/Macromolecule/Set Rigid Filename and choose 6COX_prot_min.pdbqt.
PMV/Docking/Ligand/Choose 6COX_lig_min.pdbqt.
PMV/Docking/Search Parameters/Genetic algorithm and set the following values:
• Number of GA runs: 50
• Population size: 150
• Maximum Number of evals: 5,000,000
and click the Accept button.
PMV/Docking/Docking parameters…
In the Set Docking Run Options window, the values under ‘for the step size parameters’
are changed as follows: Translation 0.5, Quaternion 5.0 and Torsion 5.0;
‘RMS cluster tolerance’ is set to 1.0.
PMV/Docking/Output/Lamarckian GA → 6COX.dpf
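Before starting a long calculation it can be useful to verify that the saved dpf contains the values set in the GUI. The Python sketch below assumes that the settings above are stored under the AutoDock 4 keywords ga_run, ga_pop_size, ga_num_evals, tstep, qstep, dstep and rmstol; the helper names and the dpf fragment are hypothetical.

```python
def read_dpf_keywords(dpf_text):
    """Map each dpf keyword to its list of values; '#' comments are ignored."""
    params = {}
    for line in dpf_text.splitlines():
        body = line.split("#", 1)[0].split()
        if body:
            params[body[0]] = body[1:]
    return params

def check_settings(params, expected):
    """Return the keywords whose first value differs from the expected number."""
    wrong = []
    for key, value in expected.items():
        found = params.get(key)
        if not found or float(found[0]) != float(value):
            wrong.append(key)
    return wrong

# Values chosen in this tutorial; the keyword names are an assumption.
expected = {"ga_run": 50, "ga_pop_size": 150, "ga_num_evals": 5000000,
            "tstep": 0.5, "qstep": 5.0, "dstep": 5.0, "rmstol": 1.0}

# Made-up fragment of a dpf file.
dpf = """tstep 0.5                 # translation step/A
qstep 5.0                 # quaternion step/deg
dstep 5.0                 # torsion step/deg
rmstol 1.0                # cluster_tolerance/A
ga_pop_size 150           # number of individuals in population
ga_num_evals 5000000      # maximum number of energy evaluations
ga_run 50                 # do this many hybrid GA-LS runs
"""
print(check_settings(read_dpf_keywords(dpf), expected))
```

An empty list means that every checked keyword has the expected value.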
II.3. Performing the grid and docking calculations
In the first step we calculate the grid maps with the autogrid4 command:
autogrid4 -p 6COX.gpf -l 6COX.glg
After the grid maps have been calculated, we perform the docking calculation in the same way:
autodock4 -p 6COX.dpf -l 6COX.dlg
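The two commands can also be launched one after the other from Python. The sketch below only relies on the -p and -l options shown above; the function names `build_commands` and `run_pipeline` are hypothetical, and the programs are assumed to be available on the PATH.

```python
import shutil
import subprocess

def build_commands(stem):
    """Command lines for the grid and docking steps of one job, e.g. stem='6COX'."""
    return [
        ["autogrid4", "-p", f"{stem}.gpf", "-l", f"{stem}.glg"],
        ["autodock4", "-p", f"{stem}.dpf", "-l", f"{stem}.dlg"],
    ]

def run_pipeline(stem):
    """Run autogrid4 and then autodock4; stop at the first failure."""
    for cmd in build_commands(stem):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure

for cmd in build_commands("6COX"):
    print(" ".join(cmd))
```

Calling `run_pipeline("6COX")` in the directory containing the input files would start the calculation; the docking is only started if the grid calculation finished without error.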
II.4. Evaluation of the results
The results can be evaluated via the GUI of AutoDockTools1.5 or with python scripts.
Here we use the GUI; a good description of the scripts can be found at:
http://autodock.scripps.edu/faqs-help/faq/where-can-i-find-the-python-scripts-for-preparing-and-analysing-autodock-dockings.
Read the docking file:
PMV/Analyze/Dockings/Open/6COX.dlg
The molecule appears on the screen in its initial conformation/location. Press the ‘N’ and
then the ‘C’ button to normalize and centre the view of the ligand.
Mouse buttons:
MIDDLE → ROTATE (push) & SCALING (rolling)
RIGHT → TRANSLATE
If you would also like to visualize the macromolecule:
PMV/Analyze/Choose …, locate the macromolecule file and press the ‘N’ button.
Here in PMV you can also use different colouring and visualization schemes.
Figure 7 The easiest way to change the representation of the molecules on the GUI of AutoDockTools1.5
To change the background colour:
PMV/3D Graphics/SetBackGroundColor/Edit/Add new Color and change the parameters
in the small boxes as follows: R: 0.3, G: 0.3, B: 0.3; press ‘Add to custom’, click the
new colour button in the panel, and finally click DISMISS.
Change the colouring method of 6COX_origin_lig to Mol; in this case the ligand is coloured
blue. (Keep in mind that this structure is still the initial structure of the ligand!)
You can visualize the docked poses as green spheres in the binding pocket as follows:
PMV/Analyze/Dockings/Show as Spheres → and select 6COX_origin_test1.dlg. You can
also set the radii of the spheres; set it to 0.23.
Now we can look at the results of the clustering. To see the difference between the crystal
and docked structures, we have to load the crystal structure as well:
PMV/File/Read Molecule… and select 6COX_lig_min.pdbqt; choose the following
representation for this molecule: S&B, with Atom as the colouring method.
PMV/Analyze/Clustering/Show
As you can see in the window that appears, the cluster analysis revealed two different
populations. Click on the largest one and a new window appears:
Click on the icon in the window → in Set Play Options check the Show Info box
→ Conformation_1_1 Info
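For this conformation we were able to reproduce the bound structure of the ligand
with an RMS deviation of 0.54 Å ($\mathrm{refRMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} d_i^{2}}$, where $d_i$ is
the distance between the docked and the reference position of atom $i$ and $n$ is the number
of atoms compared). The inhibition constant is also calculated, according
to the following equation: $\Delta G_{\mathrm{binding}} = RT \ln K_i$. (Keep in mind that the unit of
$\Delta G_{\mathrm{binding}}$ is kcal mol$^{-1}$ and that of $R$ is J mol$^{-1}$ K$^{-1}$, so a conversion is needed!)
We also have to mention that the torsional free energy term was calculated as follows:
$\Delta G_{\mathrm{tors}} = N_{\mathrm{tors}} \times c_{\mathrm{tors}} = 5 \times 0.274$ kcal mol$^{-1}$ $= 1.37$ kcal mol$^{-1}$, where $N_{\mathrm{tors}}$ is the
number of rotatable bonds and $c_{\mathrm{tors}}$ is the torsional coefficient.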
The $\Delta G_{\mathrm{binding}}$ value of -10.8 kcal mol$^{-1}$ is very important if you plan to do virtual
screening on this enzyme and would like to use this compound as a positive control. This
energy, plus the standard deviation in the predicted $\Delta G_{\mathrm{bind}}$ of the AutoDock 4 force field,
2.62 kcal mol$^{-1}$, forms the threshold above which you will be looking for “hits”: molecules
with a better $\Delta G_{\mathrm{bind}}$ than that of the positive control.
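The arithmetic above can be reproduced with a short Python sketch. It uses the relation ∆Gbinding = RT ln Ki with R converted to kcal mol⁻¹ K⁻¹; the temperature of 298.15 K is an assumption, and the function names are hypothetical.

```python
import math

# Gas constant converted from J mol^-1 K^-1 to kcal mol^-1 K^-1
R_KCAL = 8.314462 / 4184.0

def inhibition_constant(dg_kcal, temperature=298.15):
    """Ki in mol/L from dG = RT ln Ki (temperature in K is an assumption)."""
    return math.exp(dg_kcal / (R_KCAL * temperature))

def torsional_penalty(n_tors, c_tors=0.274):
    """dG_tors = N_tors x c_tors, in kcal/mol."""
    return n_tors * c_tors

dg_control = -10.8      # kcal/mol, docked positive control
sd_force_field = 2.62   # kcal/mol, standard deviation quoted in the text

print(inhibition_constant(dg_control))   # Ki of the positive control, mol/L
print(torsional_penalty(5))              # torsional term for 5 rotatable bonds
print(dg_control + sd_force_field)       # screening threshold, kcal/mol
```

Under these assumptions the inhibition constant comes out in the low nanomolar range (about 1e-8 mol/L), the torsional term is 1.37 kcal/mol and the threshold is -8.18 kcal/mol.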
If you click on the other cluster, you obtain its lowest-energy structure. You can see that
the ligand recognizes the binding pocket (it binds to exactly the same site), but the
arrangement of two aromatic rings of the ligand is changed. Accordingly, the refRMS is
significantly higher (4.86 Å), and $\Delta G_{\mathrm{bind}}$ (-9.57 kcal mol$^{-1}$) is also higher (which means
lower binding affinity).
Click on the first cluster again, and select 6COX_origin_lig in the PMV window.
Now you can save only the docked structure: PMV/File/Save/Write PDB →
6COX_origin_lig.pdb.
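The cluster analysis used in this section can be illustrated with a simple energy-ranked clustering sketch in Python. It follows the general idea of an RMS cluster tolerance (1.0 Å was set in the dpf), but AutoDock's own implementation may differ in detail; the function names and the toy poses are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

def cluster_poses(poses, tolerance=1.0):
    """Group (energy, coords) poses; each cluster is seeded by its lowest-energy pose."""
    clusters = []
    for energy, coords in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            seed_coords = cluster[0][1]
            if rmsd(coords, seed_coords) <= tolerance:
                cluster.append((energy, coords))
                break
        else:
            # No existing cluster is close enough: start a new one
            clusters.append([(energy, coords)])
    return clusters

# Toy two-atom poses (made-up coordinates and energies, in kcal/mol).
poses = [
    (-10.8, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-10.5, [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)]),
    (-9.57, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),
]
clusters = cluster_poses(poses)
print([len(c) for c in clusters])
```

In the toy data the first two poses lie within the tolerance of each other and form one cluster, while the third pose starts a second cluster.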
Appendix A. – Tutorial files
01_input
− 6COX.pdb – 3D coordinates of COX – S58, downloaded from PDB databank.
− 6COX_lig.pdb – S58 coordinates, retrieved from 6COX.pdb
− 6COX_lig_min.pdb – minimized ligand structure
− 6COX_prot.pdb – enzyme coordinates, retrieved from 6COX.pdb
− 6COX_prot_min.pdb – minimized enzyme structure
02_GUI
− 6COX.dpf – docking parameter file
− 6COX.gpf – grid parameter file
− 6COX_lig_min.pdbqt – ligand coordinate file in AutoDock4.01 format
− 6COX_prot_min.pdbqt – enzyme coordinate file in AutoDock4.01 format
03_autogrid
− 6COX.glg – grid log file
− *.map – atom affinity files
04_autodock
− 6COX.dlg – docking log file
− 6COX_origin_lig.pdb – lowest energy structure of the ligand from the highest
populated cluster
References
1 Kuntz, I.D.; Blaney, J.M.; Oatley, S.J.; Langridge, R.; Ferrin, T.E. J. Mol. Biol. 1982, 161, 269-288.
2 Hindle, S.A.; Rarey, M.; Buning, C.; Lengauer, T. J. Comput.-Aided Mol. Des. 2002, 16, 129-149.
3 Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. J. Mol. Biol. 1996, 261, 470-489.
4 Schnecke, V.; Swanson, C.A.; Getzoff, E.D.; Tainer, J.A.; Kuhn, L.A. Proteins 1998, 33, 74-87.
5 Morris, G.M.; Goodsell, D.S.; Halliday, R.; Huey, R.; Hart, W.E.; Belew, R.K.; Olson, A.J. J. Comput. Chem. 1998, 19, 1639-1662.
6 Abagyan, R.; Totrov, M.; Kuznetsov, D. J. Comput. Chem. 1994, 15, 488-506.
7 McMartin, C.; Bohacek, R.S. J. Comput.-Aided Mol. Des. 1997, 11, 333-344.
8 Di Nola, A.; Roccatano, D.; Berendsen, H.J.C. Proteins 1994, 19, 174-182.
9 Halgren, T.A.; Murphy, R.B.; Friesner, R.A.; Beard, H.S.; Frye, L.L.; Pollard, W.T.; Banks, J.L. J. Med. Chem. 2004, 47, 1750-1759.
10 Friesner, R.A.; Banks, J.L.; Murphy, R.B.; Halgren, T.A.; Klicic, J.J.; Mainz, D.T.; Repasky, M.P.; Knoll, E.H.; Shaw, D.E.; Shelley, M.; Perry, J.K.; Sander, L.C.; Shenkin, P.S. J. Med. Chem. 2004, 47, 1739-1749.
11 Jones, G.; Willett, P.; Glen, R.C. J. Mol. Biol. 1995, 245, 43-53.
12 MOE (Molecular Operating Environment), Chemical Computing Group Inc., 1010 Sherbrooke St. West, Suite 910, Montreal, Quebec, H3A 2R7, Canada.
13 Baxter, C.A.; Murray, C.W.; Clark, D.E.; Westhead, D.R.; Eldridge, M.D. Proteins 1998, 33, 367-382.
14 FRED, OpenEye Scientific Software, 3600 Cerrillos Rd., Suite 1107, Santa Fe, NM 87507.
15 Miller, M.D.; Kearsley, S.K.; Underwood, D.J.; Sheridan, R.P. J. Comput.-Aided Mol. Des. 1994, 8, 153-174.
16 Laederach, A.; Dowd, M.K.; Coutinho, P.M.; Reilly, P.J. Proteins 1999, 37, 166-175.
   Coutinho, P.M.; Dowd, M.K.; Reilly, P.J. Industrial & Engineering Chemistry Research 1998, 37, 2148-2157.
   Coutinho, P.M.; Dowd, M.K.; Reilly, P.J. Proteins 1997, 28, 162-173.
   Coutinho, P.M.; Dowd, M.K.; Reilly, P.J. Proteins 1997, 27, 235-248.
17 Lozano, J.J.; López-de-Briñas, E.; Centeno, N.B.; Guigó, R.; Sanz, F. J. Comput.-Aided Mol. Des. 1997, 11, 395-408.
   Matias, P.M.; Saraiva, L.M.; Soares, C.M.; Coelho, A.V.; LeGall, J.; Armenia Carrondo, M. JBIC 1999, 4, 478-494.
18 Gamper, A.M.; Winger, R.H.; Liedl, K.R.; Sotriffer, C.A.; Varga, J.M.; Kroemer, R.T.; Rode, B.M. J. Med. Chem. 1996, 39, 3882-3888.
19 Kedishvili, N.Y.; Bosron, W.F.; Stone, C.L.; Hurley, T.D.; Peggs, C.F.; Thomasson, H.R.; Popov, K.M.; Carr, L.G.; Edenberg, H.J.; Li, T.-K. J. Biol. Chem. 1995, 270, 3625-3630.
   Stone, C.L.; Hurley, T.D.; Peggs, C.F.; Kedishvili, N.Y.; Davis, G.J.; Thomasson, H.R.; Li, T.-K.; Bosron, W.F. Biochemistry 1995, 34, 4008-4014.
20 Lorber, D.M. Chemistry & Biology 1999, 6, R227-R228.
   Walters, W.P.; Stahl, M.T.; Murcko, M.A. Drug Discovery Today 1998, 3, 160-178.
21 Tummino, P.J.; Ferguson, D.; Jacobs, C.M.; Tait, B.; Hupe, L.; Lunney, E.; Hupe, D. Arch. Biochem. Biophys. 1995, 316, 523-528.
   Lunney, E.A.; Hagen, S.E.; Domagala, J.M.; Humblet, C.; Kosinski, J.; Tait, B.D.; Warmus, J.S.; Wilson, M.; Ferguson, D.; Hupe, D.; Tummino, P.J.; Baldwin, E.T.; Bhat, T.N.; Liu, B.; Erickson, J.W. J. Med. Chem. 1994, 37, 2664-2677.
   Vara Prasad, J.V.N.; Para, K.S.; Ortwine, D.F.; Dunbar, J.B., Jr.; Ferguson, D.; Tummino, P.J.; Hupe, D.; Tait, B.D.; Domagala, J.M.; Humblet, C.; Bhat, T.N.; Liu, B.; Guerin, D.M.A.; Baldwin, E.T.; Erickson, J.W.; Sawyer, T.K. J. Am. Chem. Soc. 1994, 116, 6989-6990.
22 Hetényi, C.; van der Spoel, D. Protein Sci. 2002, 11, 1729-1737.
23 Hetényi, C.; van der Spoel, D. FEBS Letters 2006, 580, 1447-1450.
24 Li, C.; Xu, L.; Wolan, D.W.; Wilson, I.A.; Olson, A.J. J. Med. Chem. 2004, 47, 6681-6690.