Advanced Bioinformatics
Lecture 7: Computer-aided lead identification
ZHU FENG
http://idrb.cqu.edu.cn/
Innovative Drug Research Centre in CQU
Innovative Drug Research and Bioinformatics Laboratory
Table of Contents
1. Schematic of DOCKing
2. Pharmacophore-based docking
3. INVDOCK Strategy
4. Ligand-based drug design
5. Classification of drugs by SVM
What is docking?
Given two molecules, find their correct association.
Receptor + Ligand = Complex
Docking computationally predicts the structure of a protein-ligand complex by sampling ligand conformations and orientations. The orientation that maximizes the interaction is taken as the most likely structure of the complex.
General protein–ligand binding
Ligand
− Molecule that binds with a protein
Protein active site(s)
− Allosteric binding
− Competitive binding
Function of binding interaction
− Natural and artificial
Docking strategy
Schematic of docking methodology
PDB file → Surface Representation → Patch Detection → Matching Patches → Scoring & Filtering → Candidate complexes
(A) The target binding site is filled with site points.
(B) Distances between atoms in a molecule are matched to the distances between site points.
(C) A transformation matrix is calculated for an orientation.
(D) The molecule is docked into the binding site, and the fit of that conformer is scored.
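Step (B) above can be sketched as a small search problem. The snippet below is only an illustration under stated assumptions: the coordinates are made up, the tolerance `tol` is arbitrary, and the brute-force search over assignments stands in for whatever matching algorithm a real docking program uses. It looks for an assignment of ligand atoms to site points whose internal distances agree.

```python
from itertools import permutations
from math import dist

def match_atoms_to_site_points(atoms, site_points, tol=0.5):
    """Return the first assignment (atom i -> site point index) whose
    pairwise distances agree with the atoms' pairwise distances within tol."""
    n = len(atoms)
    for candidate in permutations(range(len(site_points)), n):
        consistent = all(
            abs(dist(atoms[i], atoms[j])
                - dist(site_points[candidate[i]], site_points[candidate[j]])) <= tol
            for i in range(n) for j in range(i + 1, n)
        )
        if consistent:
            return candidate
    return None

# Hypothetical coordinates in Angstroms, for illustration only.
site_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (6.0, 6.0, 6.0)]
ligand_atoms = [(10.0, 10.0, 10.0), (10.0, 14.0, 10.0), (13.0, 10.0, 10.0)]

assignment = match_atoms_to_site_points(ligand_atoms, site_points)
print(assignment)  # (0, 2, 1)
```

An accepted assignment is what step (C) would then turn into a transformation matrix. Distance matching alone cannot tell mirror images apart, so a real implementation would need an extra check at that stage.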
Design of HIV-1 protease inhibitor
Step 1: creation of spheres to fit a cavity
Step 2: place a ligand to match the position of spheres
Step 3: check chemical complementarity
Scoring in ligand-protein docking
Potential energy description
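One common way to write a potential-energy score is a sum of pairwise van der Waals and electrostatic terms. The sketch below assumes a Lennard-Jones 12-6 term plus a Coulomb term with a constant dielectric. The functional form, the parameter values (`epsilon`, `sigma`, `dielectric`) and the example coordinates are assumptions for illustration, not values taken from the lecture.

```python
from math import dist

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), the usual force-field value

def pair_energy(r, epsilon, sigma, qi, qj, dielectric=4.0):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair at distance r."""
    lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * qi * qj / (dielectric * r)
    return lj + coulomb

def interaction_energy(ligand, receptor, epsilon=0.1, sigma=3.5):
    """Sum pair energies over all ligand-receptor atom pairs.
    Each atom is ((x, y, z), partial_charge); one epsilon/sigma is
    used for every pair to keep the sketch short."""
    return sum(
        pair_energy(dist(pos_l, pos_r), epsilon, sigma, q_l, q_r)
        for pos_l, q_l in ligand
        for pos_r, q_r in receptor
    )

# Hypothetical two-atom example.
ligand = [((0.0, 0.0, 0.0), 0.2)]
receptor = [((3.9, 0.0, 0.0), -0.3)]
print(interaction_energy(ligand, receptor))
```

A lower (more negative) energy corresponds to a better fit, so a docking run would rank orientations by this value.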
Surface representation
A representation that describes the docking surface efficiently and identifies the regions of interest.
Some techniques
− Connolly surface
− Lenhoff technique etc.
Figure: dense MS surface (Connolly); sparse surface (Shuo Lin et al.)
Connolly surface
Each atomic sphere is given the van der Waals radius of the atom.
Rolling a probe sphere over the van der Waals surface traces out the solvent-excluded surface, or Connolly surface, which consists of contact patches and solvent-reentrant patches.
Lenhoff technique
Computes a "complementary" surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand.
Figure labels: atom centers of the ligand; van der Waals surface.
Pharmacophore-based docking
Basic idea
An appropriate spatial disposition of a small number of functional groups in a molecule is sufficient for achieving a desired biological effect.
The ensemble formation will be guided by these functional groups.
3-D representation of a protein binding site
Figure: binding groups labelled with the distances 5.2, 4.2-4.7, 6.7, 4.8 and 5.1-7.1.
Pharmacophore Fingerprint
Distances between binding groups (in Angstroms) and the types of interaction are searchable.
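A searchable pharmacophore can be thought of as a set of distance ranges between typed binding groups. The sketch below is a minimal illustration: the feature names (`donor`, `acceptor`, `aromatic`), the pairing of ranges with feature types, the third range and all coordinates are hypothetical. Only the idea of range-based distance matching comes from the slides.

```python
from itertools import combinations
from math import dist

# Hypothetical query: allowed distance range (Angstroms) per pair of feature types.
query = {
    frozenset(("donor", "acceptor")): (4.2, 4.7),
    frozenset(("donor", "aromatic")): (5.1, 7.1),
    frozenset(("acceptor", "aromatic")): (4.8, 5.6),
}

def matches_query(features, query):
    """features maps a feature type to its (x, y, z) position.
    Every pair listed in the query must fall inside its distance range."""
    for a, b in combinations(sorted(features), 2):
        bounds = query.get(frozenset((a, b)))
        if bounds is None:
            continue  # pair not constrained by the query
        low, high = bounds
        if not low <= dist(features[a], features[b]) <= high:
            return False
    return True

# Two hypothetical candidate molecules.
hit = {"donor": (0.0, 0.0, 0.0), "acceptor": (4.5, 0.0, 0.0), "aromatic": (2.25, 5.0, 0.0)}
miss = {"donor": (0.0, 0.0, 0.0), "acceptor": (6.0, 0.0, 0.0), "aromatic": (2.25, 5.0, 0.0)}

print(matches_query(hit, query), matches_query(miss, query))  # True False
```

Because the check uses only a handful of distances, molecules that fail it can be discarded before any full docking is attempted.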
Schematic of PhDOCK methodology
Figure panels: DOCK; PhDOCK.
Advantages and disadvantages of PhDOCK
Advantages: (1) speed increase due to rapid elimination of ligands containing functional groups which would interfere with binding; (2) speed increase over docking of individual molecules; (3) more information pertaining to the entire molecule is retained (no rigid portions); (4) chemical matching and critical clusters are encouraged.
Disadvantages: (1) complex queries are extremely slow; (2) the majority of the information contained in the target structure is not considered during the search.
INVDOCK Strategy
Existing methods
− Given a protein, find putative binding ligands from a chemical database
− Given Lock, find Key
− Forward lead identification
− Science 1992; 257:1078
INVDOCK methods
− Given a ligand, find putative protein targets from a protein database
− Given Key, find Lock
− Backward MOA (mechanism of action) prediction
− Proteins 1999; 36:1
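The difference between the two strategies is only the direction of the search. The sketch below assumes a precomputed table of docking scores (lower is better). The protein and ligand names, the scores and the cutoff are all hypothetical, and a real INVDOCK run would compute each score by docking rather than look it up.

```python
# Hypothetical docking scores (lower = better), indexed as scores[protein][ligand].
scores = {
    "protein_A": {"ligand_1": -9.1, "ligand_2": -4.0},
    "protein_B": {"ligand_1": -3.2, "ligand_2": -8.5},
    "protein_C": {"ligand_1": -7.8, "ligand_2": -2.1},
}

def forward_screen(protein, scores, cutoff):
    """Given a protein (lock), list the ligands (keys) scoring at or below cutoff."""
    return sorted(lig for lig, s in scores[protein].items() if s <= cutoff)

def inverse_screen(ligand, scores, cutoff):
    """Given a ligand (key), list the proteins (locks) scoring at or below cutoff."""
    return sorted(prot for prot, row in scores.items() if row[ligand] <= cutoff)

print(forward_screen("protein_A", scores, cutoff=-7.0))  # ['ligand_1']
print(inverse_screen("ligand_1", scores, cutoff=-7.0))   # ['protein_A', 'protein_C']
```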
INVDOCK Test on Drug Target Prediction
Anticancer Drug Tamoxifen
PDB Id   Protein                      Experimental Findings
1a25     Protein Kinase C             Secondary Target
1a52     Estrogen Receptor            Drug Target
1bhs     17 beta HSD dehydrogenase    Inhibitor
1bld     bFGF Factor                  Inhibitor
1cpt     Cytochrome P450-TERP         Metabolism
1dmo     Calmodulin                   Secondary Target
Proteins 1999; 36:1
Tamoxifen is a well-known anticancer drug for the treatment of breast cancer. It was approved by the FDA in 1998 as the first cancer-preventive drug. 30 million people are expected to use it.
INVDOCK Test on Drug Target Prediction
Drug Toxicity Targets (J. Mol. Graph. Mod. 2001, 20, 199)
Column key:
(a) Number of experimentally confirmed or implicated toxicity targets
(b) Number of toxicity targets predicted by INVDOCK
(c) Number of toxicity targets missed by INVDOCK
(d) Number of toxicity targets without structure or involving covalent bond
(e) No. of INVDOCK predicted toxicity targets without experimental finding
Compound       (a)   (b)   (c)   (d)   (e)
Aspirin         15     9     2     4     2
Gentamicin      17     5     2    10     2
Ibuprofen        5     3     0     2     2
Indinavir        6     4     0     2     2
Neomycin        14     7     1     6     6
Penicillin G     7     6     0     1     8
Tamoxifen        2     2     0     0     4
Vitamin C        2     2     0     0     3
Total           68    38     5    25    29
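The totals in the toxicity-target table can be checked, and a recovery rate derived, with a few lines of arithmetic. The rows below are copied from the table. The "recovery rate" (predicted targets divided by confirmed targets that have a usable structure) is a derived figure introduced here for illustration, not a number reported on the slide.

```python
# Columns: confirmed, predicted, missed, no structure / covalent, predicted without finding
table = {
    "Aspirin":      (15, 9, 2, 4, 2),
    "Gentamicin":   (17, 5, 2, 10, 2),
    "Ibuprofen":    (5, 3, 0, 2, 2),
    "Indinavir":    (6, 4, 0, 2, 2),
    "Neomycin":     (14, 7, 1, 6, 6),
    "Penicillin G": (7, 6, 0, 1, 8),
    "Tamoxifen":    (2, 2, 0, 0, 4),
    "Vitamin C":    (2, 2, 0, 0, 3),
}

# Column sums reproduce the "Total" row.
totals = tuple(sum(column) for column in zip(*table.values()))
confirmed, predicted, missed, undockable, unconfirmed = totals

# Targets that INVDOCK could in principle find: confirmed minus undockable.
dockable = confirmed - undockable
recovery_rate = predicted / dockable

print(totals)                   # (68, 38, 5, 25, 29)
print(round(recovery_rate, 3))  # 0.884
```

In every row, confirmed = predicted + missed + undockable, so the 38 predicted targets are 38 of the 43 that could be docked at all.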
Results of docking studies
Figure: the docked (blue) and crystal (yellow) structures of ligands in some PDB ligand-protein complexes. The PDB Id of each structure is shown.
Dataset and Testing Results
Protein-protein cases from the protein-protein docking benchmark:
− Enzyme-inhibitor: 22 cases
− Antibody-antigen: 16 cases
Protein-DNA docking: 2 unbound-bound cases
Protein-drug docking: tens of bound cases (Estrogen receptor, HIV protease, COX)
Performance: several minutes for large protein molecules and seconds for small drug molecules on a standard PC.
Endonuclease I-PpoI (1EVX) with DNA (1A73). RMSD 0.87 Å, rank 2. Figure labels: DNA; endonuclease; docking solution.
Estrogen receptor with estradiol (1A52). RMSD 0.9 Å, rank 1, running time: 11 seconds. Figure labels: estrogen receptor; estradiol molecule from complex; docking solution.
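The RMSD values quoted above measure how far the docked pose lies from the crystal pose. A minimal sketch of the calculation follows. It assumes both poses list the same atoms in the same order and are already in the receptor's coordinate frame, so no superposition is applied. The coordinates are made up.

```python
from math import sqrt

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))

# Hypothetical poses in Angstroms: the docked pose is shifted 0.5 along x.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
docked = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0), (0.5, 1.0, 0.0)]

print(rmsd(crystal, docked))  # 0.5
```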
Classification of Drugs by SVM
A drug is classified as either belonging (+) or not belonging (-) to a class.
Drug class: inhibitor of a protein, BBB penetrating, genotoxic, etc.
Protein class: enzyme EC3.4 family, DNA-binding, etc.
By screening against all classes, the property of a drug or the function of a protein can be identified.
Diagram:
Drug → Class-1 SVM: -
Drug → Class-2 SVM: +
……
Drug → Class-n SVM: -
Result: drug belongs to class-2
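Screening one drug against every class amounts to running one binary classifier per class and collecting the "+" answers. The sketch below uses hand-written linear decision functions in place of trained SVMs. The class names, weights, biases and the drug's feature vector are all hypothetical.

```python
def decision_value(weights, bias, features):
    """Signed score of a linear classifier: positive means '+', otherwise '-'."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def screen(features, classifiers):
    """Return the names of all classes whose classifier answers '+'."""
    return [
        name
        for name, (weights, bias) in classifiers.items()
        if decision_value(weights, bias, features) > 0
    ]

# Hypothetical per-class classifiers: (weights, bias).
classifiers = {
    "class-1": ((1.0, -1.0, 0.0), -0.5),
    "class-2": ((0.0, 1.0, 1.0), -1.5),
    "class-n": ((-1.0, 0.0, -1.0), 0.5),
}

drug = (0, 1, 1)  # hypothetical feature vector
print(screen(drug, classifiers))  # ['class-2']
```

A drug may receive "+" from several classifiers or from none, which is why the result is a list rather than a single label.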
Classification of drugs by SVM
What is SVM?
• Support vector machines (SVM) are a machine learning method, rooted in artificial intelligence and statistical learning, that learns from examples to classify objects into one of two classes.
Advantages of SVM:
• Diversity of class members: structurally diverse compounds can belong to the same class.
• Use of structure-derived physico-chemical features as the basis for drug classification (no structure similarity required in the algorithm).
Artificial Intelligence (AI)
Machine learning method
Inductive learning (example-based learning)
Machine learning method
Feature vectors
A = (1, 1, 1)
B = (0, 1, 1)
C = (1, 1, 1)
D = (0, 1, 1)
E = (0, 0, 0)
F = (1, 0, 1)
Feature vectors in input space
Figure: feature vectors B, A, E and F plotted as points in an input space with axes X, Y and Z.
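The six feature vectors can be treated as points in a three-dimensional input space. The short sketch below uses only the vectors listed on the slide. It groups objects that share a point and measures one distance, which shows why the six objects occupy only four distinct positions.

```python
from collections import defaultdict
from math import dist

vectors = {
    "A": (1, 1, 1),
    "B": (0, 1, 1),
    "C": (1, 1, 1),
    "D": (0, 1, 1),
    "E": (0, 0, 0),
    "F": (1, 0, 1),
}

# Group object names by their position in input space.
groups = defaultdict(list)
for name, vector in vectors.items():
    groups[vector].append(name)

coincident = sorted(names for names in groups.values() if len(names) > 1)
print(len(groups))   # 4 distinct points
print(coincident)    # [['A', 'C'], ['B', 'D']]
print(dist(vectors["A"], vectors["E"]))  # Euclidean distance between A and E
```

Objects with identical feature vectors are indistinguishable to any classifier that works in this space, so the choice of features determines what can be separated.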
SVM Method
Project to a higher dimensional space
Figure: drug family members and nonmembers, with the border in the input space and the new border after projection.
Figure: protein family members and nonmembers, with the new border and the support vectors that define it.
Best Linear Separator?
Find closest points in convex hulls (points c and d)
Plane bisects closest points (c and d)
Best Linear Separator: supporting plane method
Maximize the distance between two parallel supporting planes
Distance = "Margin"
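The margin can be written out explicitly once a notation is fixed. The block below assumes the common convention in which the two supporting planes are $\mathbf{w}\cdot\mathbf{x} = b + 1$ and $\mathbf{w}\cdot\mathbf{x} = b - 1$, with class labels $y_i \in \{+1, -1\}$. The symbols $\mathbf{w}$, $b$, $\mathbf{x}_i$ and $y_i$ are assumptions, and the slides' own notation may differ.

```latex
\begin{aligned}
&\text{Supporting planes:}\quad \mathbf{w}\cdot\mathbf{x} = b + 1,
  \qquad \mathbf{w}\cdot\mathbf{x} = b - 1 \\[4pt]
&\text{Margin} \;=\; \frac{(b+1) - (b-1)}{\lVert \mathbf{w} \rVert_2}
  \;=\; \frac{2}{\lVert \mathbf{w} \rVert_2} \\[4pt]
&\text{Maximizing the margin}
  \;\Longleftrightarrow\;
  \min_{\mathbf{w},\,b}\; \tfrac{1}{2}\lVert \mathbf{w} \rVert_2^{2}
  \quad \text{s.t.} \quad y_i\,(\mathbf{w}\cdot\mathbf{x}_i - b) \ge 1
  \;\; \text{for all } i
\end{aligned}
```

Under this convention the points that satisfy the constraint with equality lie on the supporting planes and are the support vectors.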
Border line is nonlinear
SVM Method
Non-linear transformation: use of kernel function
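A small example shows why a non-linear transformation helps. The data, the feature map `phi` and the kernel below are chosen purely for illustration and are not taken from the slides. Four points on a line cannot be split by a single threshold, but after mapping x to (x, x²) they can, and the kernel returns the same inner product without building the mapped vectors.

```python
def phi(x):
    """Explicit feature map from 1-D input space to a 2-D feature space."""
    return (x, x * x)

def kernel(x, y):
    """Kernel equal to phi(x) . phi(y), computed without calling phi."""
    return x * y + (x * y) ** 2

def separable_by_threshold(values, labels):
    """True if one threshold on `values` splits the two classes:
    labels sorted by value may change sign at most once."""
    ordered = [label for _, label in sorted(zip(values, labels))]
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1

# Hypothetical data: family members (+1) lie outside, nonmembers (-1) inside.
points = [-2.0, -1.0, 1.0, 2.0]
labels = [+1, -1, -1, +1]

print(separable_by_threshold(points, labels))                          # False
print(separable_by_threshold([phi(x)[1] for x in points], labels))     # True
```

In the mapped space the straight border x² = 2.5 separates the classes, which corresponds to a non-linear border (two cut points) in the original space.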
SVM for classification of drugs
How to represent a drug?
• Each structure is represented by a specific feature vector assembled from structural and physico-chemical properties:
− Simple molecular properties (molecular weight, no. of rotatable bonds etc.; 18 in total)
− Molecular connectivity and shape (28 in total)
− Electro-topological state polarity (84 in total)
− Quantum chemical properties (electric charge, polarizability etc.; 13 in total)
− Geometrical properties (molecular size vector, van der Waals volume, molecular surface etc.; 16 in total)
J. Chem. Inf. Comput. Sci. 44, 1630 (2004); J. Chem. Inf. Comput. Sci. 44, 1497 (2004); Toxicol. Sci. 79, 170 (2004)
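Assembling the feature vector amounts to concatenating the five descriptor groups in a fixed order. The sketch below takes the group sizes from the slide. The function name `assemble_feature_vector` and the placeholder values are hypothetical, and computing the descriptors themselves is outside its scope.

```python
# Group sizes as listed on the slide; names are shortened for use as keys.
DESCRIPTOR_GROUPS = [
    ("simple molecular properties", 18),
    ("molecular connectivity and shape", 28),
    ("electro-topological state", 84),
    ("quantum chemical properties", 13),
    ("geometrical properties", 16),
]

def assemble_feature_vector(values_by_group):
    """Concatenate descriptor groups in a fixed order, checking each group's size."""
    vector = []
    for name, size in DESCRIPTOR_GROUPS:
        values = values_by_group[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector

# Placeholder descriptors (all zeros) just to show the resulting length.
placeholder = {name: [0.0] * size for name, size in DESCRIPTOR_GROUPS}
feature_vector = assemble_feature_vector(placeholder)
print(len(feature_vector))  # 159
```

Keeping the order fixed matters because the classifier treats each position in the vector as the same property for every drug.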
SVM-based drug design and property prediction software
Which class does your drug belong to?
Your drug structure (chemical structure)
− Option one: input structure on local machine
− Option two: input structure through internet
Send structure to classifier: computer loaded with SVMProt, with an SVM classifier for every drug class
Identified classes
Drug designed or property predicted
SVM drug prediction results
Protein inhibitor/activator/substrate prediction
• 86% of the 129 estrogen receptor activators and 84% of 101 non-activators correctly predicted
• 81% of 116 P-glycoprotein substrates and 79% of 85 non-substrates correctly predicted
Drug toxicity prediction
• 97% of 102 TdP+ and 84% of 243 TdP- agents correctly predicted
• 73% of 229 genotoxic and 93% of 631 non-genotoxic agents correctly predicted
Pharmacokinetics prediction
• 95% of 276 BBB+ and 82% of 139 BBB- agents correctly predicted
• 90% of 131 human intestine absorption and 80% of 65 non-absorption agents correctly predicted
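Each result above reports two rates, one for the positive class and one for the negative class. A single overall accuracy can be derived by weighting the two rates by class size, as sketched below. The percentages and counts are those quoted on the slide. The overall figures are derived here and are approximate, because the quoted percentages are rounded.

```python
# (positive rate, positives, negative rate, negatives), as quoted on the slide.
results = {
    "estrogen receptor activator": (0.86, 129, 0.84, 101),
    "P-glycoprotein substrate":    (0.81, 116, 0.79, 85),
    "TdP":                         (0.97, 102, 0.84, 243),
    "genotoxicity":                (0.73, 229, 0.93, 631),
    "BBB penetration":             (0.95, 276, 0.82, 139),
    "human intestine absorption":  (0.90, 131, 0.80, 65),
}

def overall_accuracy(pos_rate, n_pos, neg_rate, n_neg):
    """Class-size-weighted average of the two per-class accuracies."""
    return (pos_rate * n_pos + neg_rate * n_neg) / (n_pos + n_neg)

for name, row in results.items():
    print(f"{name}: {overall_accuracy(*row):.3f}")
```

The weighting matters for unbalanced sets such as genotoxicity, where the larger negative class dominates the overall figure.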
Projects Q&A!
1. Biological pathway simulation
2. Computer-aided anti-cancer drug design
3. Disease-causing mutation on drug target
Any questions? Thank you!