Upload
phamxuyen
View
217
Download
1
Embed Size (px)
Citation preview
Docking: placing a ligand into a receptor cavity
Esther Kellenberger
Faculté de PharmacieUMR 7200, Illkirch
Tel: 03 90 24 42 21 e-mail: [email protected]
1/34
docking
Geometrical Complementarity
non-covalent inter-molecular interactions
DockingDocking: : thethe locklock & & keykey principleprinciple
substrate
enzyme
ligand
receptor
3D Protein 2/34introduction scoring conclusion
ThermodynamicsThermodynamics ofof associationassociation
RTGK A
0
exp Δ−=
ΔG0; RT in J/mole
AK =LR[ ]
L[ ] R[ ]
Association constant (equilibrium)
no unit
L
RLR
-4.110-3
-8.210-6
-12.310-9
-16.410-12
-20.410-15
ΔG° (kcal/mol)at 298K
KD (M)
docking3D Protein 3/34introduction scoring conclusion
What is gained/lost upon binding?
Free energy ΔG = ΔH – TΔS
Enthalpy H ~ sum of interactions
Entropy S ~ order
L: ligand, R: recepteur, W: water
ThermodynamicsThermodynamics ofof associationassociation
docking3D Protein 4/34introduction scoring conclusion
Thymidylate Synthetase: side-chains rotamersCalmoduline: domain motions
MolecularMolecular flexibilityflexibility: : LockLock & & keykey oror inducedinduced fitfit
Nature 450, 964-972 ( 2007)
docking3D Protein 5/34introduction scoring conclusion
Chapter1:
Proteins are folded
biopolymers
6/34
primaryprimary, , secondarysecondary, , tertiarytertiary & & quaternayquaternaystructuresstructures
20 monomers differ by their R group
Polymerisation reaction
http://www.yellowtang.org
docking3D Protein 7/34introduction scoring conclusion
secondary structure
tertiary structure
quaternary structure
helixsheet
primary structure amino acids
LimitedLimited numbernumber ofof structural organisations structural organisations for a for a hugehuge varietyvariety ofof functionsfunctions
Membrane proteins
• ~30% of human total genes• Signal transducer, transporter(Na+, proton, glucose), channels
β2 adrenergic receptor stimulated by adrenaline flight or fight"
globular proteins
• hydrosoluble (cytoplasm, blood)• enzyme catalysis, defense (toxin, immunoglobulin), transporter (O2, electrons), motion (actin, myosin), regulation (osmotic protein, gene regulators, hormone) ion storage(ferritin, calmodulin)
Fibers
• auto assembly supramolecules
• collagen (cartilage, bone, teeth), keratin (hair, nail), fibrin(blood clots)
Influenza neuraminidase(H1N1 surface protein)Human collagen
http://www.rcsb.org
docking3D Protein 8/34introduction scoring conclusion
ExperimentalExperimental determinationdetermination ofof 3D structure3D structure
X-ray structure of the crystal protein
Quality check: the resolution (Å)
Information about position
docking3D Protein 9/34introduction scoring conclusion
ExperimentalExperimental determinationdetermination ofof 3D structure3D structure
NMR structure of the protein in solution
Quality check: quantity/quality/distribution of information
Information about DISTANCE (NOE)ANGLE (coupling constant)RELATIVE POSITION(residual dipolar coupling)
docking3D Protein 10/34introduction scoring conclusion
since 1998, maintained by Research Collaboratory for Structural Bioinformatics
European Macromolecular
Structure Database
Protein DataBankJapan
NMR data (since 2006)
Weekly updated, data synchronisation on mirror sites
Data deposit, treatementand distribution
SinceSince 2003: a single archive 2003: a single archive ofofpublic 3D structures public 3D structures ofof biomoleculesbiomolecules
docking3D Protein 11/34introduction scoring conclusion
749771388156536NMR
15325315225other(microscopy)
53263332152191746161total
45471241961108642400R-X
totalothersComplexes Nucleic acidsProtein#
Structure Factors available for 34603 entries. NMR constraints for 4189 entries.
PDB PDB statisticsstatistics: : SeptemberSeptember 20082008
Sequence redundancy: 8483 proteins with <30% sequence identity
Structure redundancy : about 700 different folds (ternary structures)
docking3D Protein 12/34introduction scoring conclusion
PDB PDB entryentry: format : format andand contentcontent
Format: Flat file organized into tagged fields
• Header: information
• Connexion table: atom section only (element: C, O, N, S, H),
no explicit bonds for standard amino acids
Incomplete description of protein structure
well defined –CONH2
water molecules
Metal ions, cofactors
hydrogen atoms
NMRX-rayMETHOD
docking3D Protein 13/34introduction scoring conclusion
TargetingTargeting ProteinProtein by by chemicalschemicals
protein function• biochemical function = interaction with other molecules
• biological function = consequence of these interactions
“druggable” binding sitecavity
• depth
• volume
• lipophilic surface
about 6000 « druggable » sites in PDB protein X-ray structures
http://bioinfo-pharma.u-strasbg.fr/scPDB/
ligandCavity
Binding pocket
docking3D Protein 14/34introduction scoring conclusion
Chapter2:
Docking chemicals into
protein cavities
15/34
SemiSemi--flexibleflexible dockingdocking
docking3D Protein 16/34introduction scoring conclusion
number of rotatable bonds
conformational search
cpu time
< 25-30
exhaustive
seconds to hours
≥ 3 per amino acids
partial
huge!
feasible Difficult!
Pose = Orientation & conformation
O
NH
ONH
OHO
NH2
OOrientation
• Position of the ligand in the protein
• Rigid body motions • Translations + rotations of
the whole molecule
Conformation• 3D structure of the molecule• Molecular flexiblity
ONH
protein
protein
OHO
NH2
O
search for the best pose search for the best pose of of flexibleflexible ligandligand into into rigidrigid protein siteprotein site
docking3D Protein 17/34introduction scoring conclusion
geometrygeometry--basedbased vsvs energyenergy--basedbased algorithmsalgorithms
docking3D Protein 18/34introduction scoring conclusion
O
NH
OHO
NH2
O
O
NH
OHO
NH2
O
geometry-baseddeterminist methods
• Protein: feature points in the cavity• Ligand: atoms or feature points• Docking: translations + rotations of
rigid ligand to superpose matching points
energy-basedstochastic methods
• Protein : surface atoms / feature points • Ligand : atoms or feature points• Docking: modification of ligand
position/conformation to optimizean “energy” function that describes molecular interactions
protein protein
ExamplesExamples ofof geometricgeometric algorithmalgorithm
docking3D Protein 19/34introduction scoring conclusion
DOCKKuntz, I.D et al. (1982) J. Mol. Biol
• Protein cavity filled withoverlapping spheres(variable radius).
• Feature points: spherecenter colored accordingto physico-chemicalproperties
FlexxM. Rarey et al. (1996) J. Comp.-Aid. Mol. Design
Interaction centers and interaction surfaces identified on both receptor (a) and ligand (b)
• H bond• Salt bridges• Aromatics• methyl-aromatics• amide-aromatics
Cluster of probes• steric (apolar)• Polar
(NH, CO in Surflex, polar in MOE)
Surflex, MOEJain A (2003) J Med Chem
GeometricGeometric matchingmatching ofof trianglestriangles
docking3D Protein 20/34introduction scoring conclusion
ligandprotein
Ligand Ligand flexibilityflexibility in in geometricgeometric algorithmsalgorithms
docking3D Protein 21/34introduction scoring conclusion
Incremental construction (FlexX, DOCK, Surflex)
Library of conformers• rigid docking of all conformers (DOCK, MOE)• conformer generation usually not included in the docking tool• in MOE-Dock, the conformational search is coupled to docking, using a library
of preferred torsion values for rotatable bonds
energyenergy--basedbased algorithmsalgorithms
docking3D Protein 22/34introduction scoring conclusion
Optimization of an energy function
to find stable conformations of the
ligand / receptor complex
energy
conformational spaceO
NH
OHO
NH2
O
O
NH
OHO
NH2
O
ONH
OHO
NH2
O
protein
iterativeiterative generationgeneration ofofpopulations populations ofof conformersconformers
docking3D Protein 23/34introduction scoring conclusion
starting population
final population
modified population
modification of conformations
Selection of conformations
based on energy
GeneticGenetic algorithmalgorithm (gold, (gold, autodockautodock))
docking3D Protein 24/34introduction scoring conclusion
reproduction
Selection of conformations based
on energy
modification of selection
parameters(soft hard)
Moitessier N (2008) Br J Pharmconvergence
Monte Carlo Monte Carlo (ICM, RosettaLigand)
docking3D Protein 25/34introduction scoring conclusion
Selection of conformations
based on energyx cooling cycles
(T decrease)
Bond rotationTranlationRotation
Ei= +36 kcal/mol Ei= 0 kcal/mol
Ef= +10 kcal/molEf= +22 kcal/mol
if Ef < Ei acceptif Ef > Ei metropolis
⎟⎟⎠
⎞⎜⎜⎝⎛− TKB
z⎟⎟⎠
⎞⎜⎜⎝⎛− >
TKEf -Ei
Bexp
accept
Ef= +2 kcal/mol
…n iterations
Chapter3:
Scoring ligand/receptor
interaction
26/34
empiricalempirical vsvs force force fieldfieldscoringscoring functionsfunctions (SF)(SF)
docking3D Protein 27/34introduction scoring conclusion
Force Field SFEmpirical SF
selected ligand/receptor
X-ray structures
experimental ΔG
non-bonded interactionsdistance between ligand -receptor pairs of atoms
empirical SF knowledge-based SF
S = Eligand + Ecomplexe
Eligand = internal energy
Ecomplexe =
Prediction of free energy by Empirical SFPrediction of free energy by Empirical SF
docking3D Protein 28/34introduction scoring conclusion
S ≈ ∆G = ∑ ki * Fi
Fi : function which describes protein – ligand interaction (H-bond, salt bridge..)
computed using geometry predicted by docking
ki : constant adjusted using the training set
MOE MOE affinity affinity dGdG SFSF
fkGhb
hbbind +=d ∑
k: constantf: count of atomic contact
hb fkml
ml +∑ ml fkhh
hh +∑ hh fkhp
hp +∑ hp fkaa
aa∑ aa
hb: H-bond donor – H-bond acceptorml: ionic metal - ligandhh: hydrophobic atom – hydrophobic atomshp: hydrophobic atom – polar atomaa: any atom - any atom
Böhm (1994)
( ) ( ) NROTGAGRfGRfGGG rothb
lipolipoionic
ionichb
hbbind Δ+Δ+ΔΔ+ΔΔΔ+Δ=Δ ∑∑∑ α,0
+5.4 kJ/mol -4.7 kJ/mol -8.3 kJ/mol -0.17 kJ/mol.Å2 +1.4 kJ/mol/rot
H-bond ionic lipophilic rotation
-50 -45 -40 -35 -30 -25 -20-50
-45
-40
-35
-30
-25
-20
ΔGpr
ed, k
J/m
ol
ΔGexp, kJ/mol
Q2LOO = 0.777, spress=3.15 kJ/mol, n=37 r 2pred = 0.698, s = 5.33 kJ/mol, n=16
-55 -50 -45 -40 -35 -30 -25 -20-60
-55
-50
-45
-40
-35
-30
-25
ΔG
pred
, kJ/
mol
ΔGexp, kJ/mol
test
ExempleExemple of empirical SF performance of empirical SF performance
docking3D Protein 29/34introduction scoring conclusion
training
docking3D Protein 30/34introduction scoring conclusion
StrengthStrength andand weaknessweakness ofof SFSF
• fast calculation• adaptable to custom target
• training set (incompleteness, inaccuracy of data)• binding mode (if few polar interactions, underestimation of score)• missing penalty (steric clashes, polar/apolar match, internal ligand
energy, lost of entropy, geometry of directional interaction, localenvironment of interaction)
• independant of training set
• strongly depends on ligand size• force field accuracy, difficulty to set parameters• no account of entropy• Often includes empirical terms !
Empirical SF
Force Field SF
docking3D Protein 31/34introduction scoring conclusion
postpost--processingprocessing output: pose output: pose selectionselection by by Interaction Interaction FingerprintFingerprint analysisanalysis
8 bits / target site residue
1.1. HydrophobiHydrophobicc contact2.2. AromaticAromatic interaction
(Face/Face)
3.3. AromaticAromatic interaction (Face/Edge)
4.4. HH--bondbond (Donor in ligand)
5.5. HH--bondbond (Donor in protein)
6.6. IonicIonic bond (+) in ligand
7.7. IonicIonic bond (+) in protein
8.8. MetalMetalInteraction FingerPrint
AA1 AA2 … AAn10000000 01001000 … 01000100
ligand protein
Marcou, G et al (2007) J Chem Inf Model
conclusion
What can we achieve?
What remains to be improved?
32/34
docking3D Protein 33/34introduction scoring conclusion
The state of the artThe state of the art
• docking accuracy (Warren et al. 2006 Proteins, Kellenberger et al. 2004 Proteins)
• many programs able to reproduce x-ray conformation
• but performance is highly dependend on the studied protein
• cpu time:
• few seconds to several minutes to dock one compound
• app. screening rate: 1,500 compounds/day/processor
• hit rate (true positives in hit list) in screening by high throughput docking
• ~ 50 % retrospective studies
• from 10% to 30% in prospective studies using X-ray structures
• lower rates for docking using homology models
docking3D Protein 34/34introduction scoring conclusion
The limitationsThe limitations
Pre-processing of the protein
Lacking hydrogens, hydrogen bonds networks, protonation states of
his, lys, asp, glu
Water molecule(s) involved in the binding mode
Pre-processing of the ligands
Protonation states, tautomers
Flexibility of the ligand
Flexibility of the protein /binding site
Fuzzy scoring functions
no serious problem
a difficult problem
the biggest problem