ESCUELA TÉCNICA SUPERIOR
Using “Consensus clustering” to help understand how multiple ligands might bind
to multiple target sites. How this helps to choose the Right Query in Shape-Based Virtual Screening
Violeta Pérez-Nueno, David Ritchie
Orpailleur Team, INRIA Nancy - Grand Est
LORIA (Laboratoire Lorrain de Recherche en Informatique et ses Applications), INRIA Nancy – Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
1. Calculating SH Shapes
2. Calculating SH Consensus Shapes and their performance in VS
3. Consensus Clustering: Exploring CCR5 Multiple Binding Sites
4. All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
5. Choosing the Right Query in DUD Shape-Based VS
6. Examples of how Consensus Clustering can identify Multi-Site Pockets
Presentation Overview
1. Calculating SH Shapes
• Real SHs:
• Coefficients:
• Encode radial distances from origin as SH series…
• Solve coefficients by numerical integration…
Surface shapes are represented as radial distance expansions of the molecular surface with respect to the center of the molecule.
Spherical Harmonic Surfaces
Ritchie, D.W. and Kemp, G.J.L. J. Comp. Chem. 1999, 20, 383–395.
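For reference, a radial SH expansion of this kind is usually written as follows. This is a sketch in assumed notation: the symbols r, a_{lm}, y_{lm} and the truncation order L are not taken from the slide.

```latex
r(\theta,\phi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} a_{lm}\, y_{lm}(\theta,\phi),
\qquad
a_{lm} = \int r(\theta,\phi)\, y_{lm}(\theta,\phi)\, \mathrm{d}\Omega
```

Here y_{lm} denotes the real spherical harmonics, and the integral over the unit sphere is the quantity that would be evaluated by numerical integration.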
• Distance:
• Orthogonality:
• Rotation:
• Carbo:
• Hodgkin:
• Tanimoto:
• Multi-property:
ParaFit
Ritchie, D.W. and Kemp, G.J.L. J. Comp. Chem. 1999, 20, 383–395.
ParaFit calculates superpositions between pairs of molecules by exploiting the special rotational properties of the SH functions
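Because the SH basis functions are orthonormal, overlap integrals between two surfaces reduce to dot products of their coefficient vectors, so the scores listed above can be sketched directly on coefficient lists. The following is a minimal illustration using the commonly quoted definitions of the distance, Carbo, Hodgkin and Tanimoto scores. The function names are hypothetical, and this is not ParaFit's own code or API.

```python
import math

def overlap(a, b):
    """Overlap of two shapes = dot product of their SH coefficient vectors."""
    return sum(x * y for x, y in zip(a, b))

def distance(a, b):
    """Squared L2 distance between two shapes (assumed definition)."""
    return overlap(a, a) + overlap(b, b) - 2 * overlap(a, b)

def carbo(a, b):
    """Carbo index: overlap normalised by the geometric mean of self-overlaps."""
    return overlap(a, b) / math.sqrt(overlap(a, a) * overlap(b, b))

def hodgkin(a, b):
    """Hodgkin index: overlap normalised by the arithmetic mean of self-overlaps."""
    return 2 * overlap(a, b) / (overlap(a, a) + overlap(b, b))

def tanimoto(a, b):
    """Tanimoto index on continuous coefficient vectors."""
    ab = overlap(a, b)
    return ab / (overlap(a, a) + overlap(b, b) - ab)
```

All three similarity indices equal 1 for identical shapes, and the distance is then 0. A rotational search would maximise one of these scores over the relative orientation of the two molecules.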
2. Calculating SH Consensus Shapes and their performance in VS
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
1. Do all-v-all SH comparison
2. Find best pair-wise match
3. Calculate SH average of pair
4. Treat average as new seed
5. Superpose all onto seed
6. Compute new average seed
7. Rotate all onto new seed
8. Iterate until convergence...
9. Result = SH pseudo-molecule
Calculating Consensus Shapes
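The nine steps can be sketched on SH coefficient vectors as follows. This is a minimal illustration, not the authors' implementation: the `superpose` argument is a hypothetical placeholder for the rotational superposition (it defaults to leaving a shape unchanged), and the function names and convergence tolerance are assumptions.

```python
import math
from itertools import combinations

def carbo(a, b):
    """Carbo-style similarity between two coefficient vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def mean_shape(shapes):
    """Component-wise average of equally long coefficient vectors."""
    return [sum(column) / len(shapes) for column in zip(*shapes)]

def consensus_shape(shapes, superpose=lambda seed, shape: shape,
                    tol=1e-9, max_iter=100):
    # Steps 1-2: all-v-all comparison, keep the best-matching pair.
    i, j = max(combinations(range(len(shapes)), 2),
               key=lambda pair: carbo(shapes[pair[0]], shapes[pair[1]]))
    # Steps 3-4: the average of that pair becomes the first seed.
    seed = mean_shape([shapes[i], superpose(shapes[i], shapes[j])])
    for _ in range(max_iter):
        # Steps 5-7: superpose every shape onto the seed and re-average.
        new_seed = mean_shape([superpose(seed, s) for s in shapes])
        # Step 8: stop once the seed no longer moves.
        if math.dist(new_seed, seed) < tol:
            return new_seed
        seed = new_seed
    # Step 9: the last iterate plays the role of the SH pseudo-molecule.
    return seed
```

With the default placeholder superposition the loop converges to the plain average of all coefficient vectors. A real superposition step would re-orient each shape against the current seed before averaging.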
CCR5 antagonists (424):
1) SCH-C derivatives
2) 1,3,5-trisubstituted pentacyclics
3) Diketopiperazines
4) 1,3,4-trisubstituted pyrrolidinepiperidines
5) 5-oxopyrrolidine-3-carboxamides
6) N,N’-Diphenylureas
7) 4-aminopiperidine or tropanes
8) 4-piperidines
9) TAK derivatives
10) Guanylhydrazone derivatives
11) 4-hydroxypiperidine derivatives
12) Phenylcyclohexylamines
13) Anilide piperidine N-oxides
14) 1-phenyl-1,3-propanediamines
15) AMD derivatives
16) Other
CXCR4 antagonists (248):
1) AMD derivatives
2) Macrocycles
3) Tetrahydroquinolinamines
4) KRH derivatives
5) Dipicolyl amine zinc(II) complexes
6) Other
PLUS… 4696 inactive compounds from the Maybridge Screening Collection with similar 1D properties to the actives
Virtual Screening Datasets
The consensus approach was first validated in a retrospective VS of two databases containing CCR5 and CXCR4 active compounds from the literature, and presumed inactive compounds from Maybridge.
SH Consensus Shapes of the Three Most Active Inhibitors
Pseudo-molecules were obtained from the consensus shapes of the most active molecules for both CXCR4 and CCR5 targets, and used as VS queries against the database of known actives and decoys.
[Figure, CXCR4: consensus shape with KRH derivative, macrocycle derivative and AMD derivative superpositions]
[Figure, CCR5: consensus shape with 1,3,4-trisubstituted pyrrolidinepiperidine derivative, SCH derivative and piperidine derivative superpositions]
[Figure: result panels for CXCR4 and CCR5]
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
Consensus Shape-Based VS
CXCR4 families have very similar shapes -> results do not substantially improve
CCR5 families have very different shapes -> results considerably improve
3. Consensus Clustering: Exploring CCR5 Multiple Binding Sites
• Ward’s hierarchical clustering of chemical fingerprints
• We then used Kelley’s method to find the optimal number of clusters (16)
• These were manually merged to 10 groups based on known CCR5 families
• SH consensus shapes were calculated for the 10 groups
• These were then compared in ParaFit (all-vs-all)
Exploring CCR5 Multiple Binding Sites: Clustering the 424 CCR5 Ligands
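The Ward-plus-Kelley step can be sketched on toy data as follows. This is a minimal pure-Python illustration under stated assumptions: points live in Euclidean space rather than chemical fingerprint space, the Ward merge is a naive implementation, and Kelley's penalty is taken in its commonly quoted form (average cluster spread normalised to the range 1 to N-1, plus the number of clusters). All function names are hypothetical.

```python
from itertools import combinations
from math import dist

def centroid(cluster):
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return len(a) * len(b) / (len(a) + len(b)) * dist(centroid(a), centroid(b)) ** 2

def ward_levels(points):
    """Naive Ward agglomeration; returns {number_of_clusters: clusters}."""
    clusters = [[p] for p in points]
    levels = {len(clusters): list(clusters)}
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        levels[len(clusters)] = list(clusters)
    return levels

def average_spread(clusters):
    """Mean pairwise distance, averaged over clusters with more than one member."""
    spreads = []
    for c in clusters:
        pairs = list(combinations(c, 2))
        if pairs:
            spreads.append(sum(dist(p, q) for p, q in pairs) / len(pairs))
    return sum(spreads) / len(spreads)

def kelley_optimal_k(points):
    """Pick the cluster count with the lowest Kelley-style penalty (needs >= 3 points)."""
    n = len(points)
    levels = ward_levels(points)
    ks = range(2, n)  # each of these levels has at least one non-singleton cluster
    spread = {k: average_spread(levels[k]) for k in ks}
    lo, hi = min(spread.values()), max(spread.values())
    scale = (n - 2) / (hi - lo) if hi > lo else 0.0
    penalty = {k: scale * (spread[k] - lo) + 1 + k for k in ks}
    best_k = min(penalty, key=penalty.get)
    return best_k, levels[best_k]
```

The penalty trades cluster tightness against cluster count, so on three well-separated groups of points it settles on three clusters. For real ligand sets a fingerprint-based distance would replace the Euclidean one.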
From Consensus Shapes to Super-Consensus Clusters
• Another round of Ward’s clustering proposed four super-consensus clusters
• Each SC pseudo-molecule was used as a VS query:
• NB. merging SC shapes significantly worsens the AUCs…
• SC queries => CCR5 ligands form no fewer than FOUR groups
Using Super-Consensus Shapes as VS Queries
[ROC plots: VS with Super Consensus A, B, C and D as queries; AUC values shown: 0.785, 0.413, 0.905, 0.630]
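An AUC of this kind is the probability that a randomly chosen active is ranked above a randomly chosen inactive. A minimal sketch of that calculation from similarity scores follows; the function name and the toy scores in the usage note are illustrative assumptions, not data from the study.

```python
def roc_auc(active_scores, inactive_scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    wins = 0.0
    for a in active_scores:
        for d in inactive_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))
```

For example, `roc_auc([0.9, 0.8, 0.4], [0.7, 0.3])` counts five of six active/inactive pairs as correctly ordered. A value near 0.5 corresponds to random ranking, and a value below 0.5 means inactives tend to score higher than actives.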
• SC-A docks to Site-1 (TMs 1, 2, 3, 7)
• SC-C docks to Site-2 (TMs 3, 5, 6)
• B and D dock to Site-3 (TMs 3, 6, 7)
• 3D pseudo-molecules were created as the union of all superposed ligands in each SC family for docking in Hex
Hex Blind Docking of SC Pseudo-Molecules to CCR5
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
• To confirm that the SC shapes were properly matched to their predicted target sites, the three proposed binding sites were treated as if they were separate targets for docking-based VS:
– SC-As treated as actives for Site 1 (SCs B, C, D treated as inactives)
– SC-Cs treated as actives for Site 2 (SCs A, B, D treated as inactives)
– SC-B/Ds assumed active for Site 3 (SCs A and C treated as inactives)
• As before, merging SCs worsens the AUCs…
• SC docking => no fewer than THREE CCR5 pocket sub-sites
A -> Site-1, C -> Site-2, B,D -> Site-3
Hex Docking VS w.r.t. Three CCR5 Sub-Sites
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
[ROC plots: docking VS onto Site 1, Site 2 and Site 3; AUC values in slide order: 0.834, 0.963, 0.846]
4. All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
Cross-docking on DUD: work of Huang et al.
Huang et al. J. Med. Chem. 2006, 49, 6789-6801.
All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
Cross-shape matching on DUD
• All-against-all shape comparison for the ligands of the 40 DUD targets
• Clustering of ligands by shape into 40 groups
• Select a threshold that emphasizes the overall similarity to the cross-docking experiment
• Count the membership over the threshold for ligands belonging to each of the targets for all the 40 clusters and sum memberships.
• Normalize this sum by the total number of ligands for each target. This gives the cross-shape matching score.
Example (clusters 1 to 40): in the clusters where ace is over the threshold, the score is the sum, over all clusters, of the normalized number of ligands of each of the other targets that are also over the threshold.
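One possible reading of this scoring scheme is sketched below. It is an interpretation, not the authors' code: a target counts as "over the threshold" in a cluster when the fraction of its ligands falling in that cluster reaches the threshold, and the score for a pair of targets sums the second target's fractions over the clusters where both are over the threshold. The function name, input format and toy data in the usage note are assumptions.

```python
from collections import Counter, defaultdict

def cross_shape_scores(ligands, threshold):
    """ligands: list of (target, cluster_id) pairs, one per ligand.

    Returns {(query_target, other_target): score}.
    """
    ligands = list(ligands)
    totals = Counter(target for target, _ in ligands)  # ligands per target
    counts = Counter(ligands)                          # ligands per (target, cluster)
    over = defaultdict(dict)                           # cluster -> {target: fraction}
    for (target, cluster), n in counts.items():
        fraction = n / totals[target]                  # normalise by target size
        if fraction >= threshold:
            over[cluster][target] = fraction
    scores = defaultdict(float)
    for members in over.values():
        for query in members:
            for other, fraction in members.items():
                if other != query:
                    scores[(query, other)] += fraction
    return dict(scores)
```

With toy assignments in which three of four ace ligands and two of four cox2 ligands share one cluster, and a threshold of 0.4, the pair (ace, cox2) scores 0.5 and the pair (cox2, ace) scores 0.75.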
Ligands for a given target are split into 2 or 3 sub-groups, which could suggest they might bind to different sub-sites of the same target
p38 ligands split in C36, C30, C29
All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
Ligands from different targets are grouped together. This suggests that they could bind to the same targets.
Evidence of cox2 cross-docked with hsp90
Evidence of sahh cross-docked with tk and ada
All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
Cross-shape matching: 8.45 h. Cross-docking: several months (from 20 h to 12 d per target).
Huang et al. J. Med. Chem. 2006, 49, 6789-6801.
5. Choosing the Right Query in DUD Shape-Based VS
Choosing the Right Query in DUD Shape-Based VS
Queries used:
• PARAFIT_B : PARAFIT bound conformation.
• PARAFIT_CM : PARAFIT consensus molecule.
• PARAFIT_C1_CM - PARAFIT_C3_CM: consensus molecules for SH clusters 1, 2, 3.
• PARAFIT_RCM: PARAFIT real center molecule closest to consensus.
• PARAFIT_C1_RCM - PARAFIT_C3_RCM: real center molecules for clusters 1, 2, 3.
• PARAFIT_SHEF_C: best SHEF molecule that fits pocket as PARAFIT query.
• ROCS_B: ROCS bound conformation.
• ROCS_RCM: real center molecule from PARAFIT as ROCS query.
• ROCS_C1_RCM - ROCS_C3_RCM: real PARAFIT center molecules as ROCS queries.
• ROCS_SHEF_C: best SHEF molecule that fits pocket as ROCS query.
• SHEF shape-based docking.
• GOLD conventional docking.
[Figures: ROC plots and bar graphs of the results for these queries]
6. Examples of how Consensus Clustering can identify Multi-Site Pockets
Multi-Site Pockets: p38
• 353 p38 DUD ligands clustered using Ward’s hierarchical clustering of chemical fingerprints
• Kelley’s method to find the optimal number of clusters (15)
• SH consensus shapes were calculated for the 15 groups
• These were then compared in ParaFit (all-vs-all)
• Another round of Ward’s clustering proposed three super-consensus clusters
Multi-Site Pockets: p38
[Figure: panels SC_A, SC_B and SC_C; labelled residues: Glu 71, Asp 168, Met 109]
Multi-Site Pockets: p38
[Figure: ligand structures with labels ATP and BIRB and site numbers 1 and 2]
Pargellis et al. Nature Structural Biology 2002, 9 (4), 268-272.
Multi-Site Pockets: p38
[Figure: panels SC_A, SC_B and SC_C with labelled residues Glu 71, Asp 168, Met 109; annotations: Site 1, Site 2, ATP site, allosteric site, diaryl urea inhibitors, pyridinyl-imidazole inhibitors]
Multi-Site Pockets: Alr2
• 26 alr2 DUD ligands clustered using Ward’s hierarchical clustering of chemical fingerprints
• Kelley’s method to find the optimal number of clusters (5)
• SH consensus shapes were calculated for the 5 groups
• These were then compared in ParaFit (all-vs-all)
• Another round of Ward’s clustering proposed three super-consensus clusters
Multi-Site Pockets: Alr2
[Figure: panels SC_A, SC_B and SC_C; labelled residues: Tyr 48, His 110, Trp 111, Thr 113, Cys 298, Leu 300]
Multi-Site Pockets: Alr2
[Figure: panels (d) and (e), with labels 2 and 3]
Steuber et al. J. Mol. Biol. 2007, 369, 186-197.
Multi-Site Pockets: Alr2
[Figure: panels SC_A, SC_B and SC_C with labelled residues Tyr 48, His 110, Trp 111, Thr 113, Cys 298, Leu 300; annotations: Small ligand, Sub-site 1 (catalytic cleft); Super ligand, Sub-site 2 (whole pocket)]
• Consensus shapes provide good queries. They are also a good way to select a center molecule.
• Cross-shape matching is much faster than cross-docking in identifying possible cross-docked targets.
• Consensus clustering makes it possible to detect multi-site targets.
• Multi-site targets are more common than previously assumed. We recommend clustering ligands if poor performance is observed with a single query.
Conclusions
Acknowledgements
• Dave Ritchie
• Vishwesh Venkatraman
• Lazaros Mavridis
• INRIA Nancy - Grand Est
• IQS, Universitat Ramon-Llull
ParaSurf + ParaFit: http://www.ceposinsilico.de/
Papers: http://www.loria.fr/~pereznue/ and http://www.loria.fr/~ritchied/
• Agence Nationale de la Recherche (ANR-08-CEXC-017-01)
• Generalitat de Catalunya-DURSI (FI2008 and BE-DGR2009)
Thank you!
• From MOPAC or VAMP, calculate:
– Density contours of 2×10⁻⁴ e/Å³ (i.e. approx = SAS)
– MEP – electrostatic potential
– IEL – ionization energy
– EAL – electron affinity
– αL – polarizability
• Encode as Spherical Harmonic expansions to order L=15…
Lin, J.-H. and Clark, T. J. Chem. Inf. Model. 2005, 45, 1010–1016.
Shapes/Properties From Semi-Empirical QM
ParaSurf
Choosing the Right Query in DUD Shape-Based VS
Target      PF_B    PFT_CM  PFT_C1_CM  PFT_C2_CM  PFT_C3_CM
RXR_ALPHA   0.908   0.960   0.689      0.904      0.892
COX2        0.919   0.945   0.935      0.947      0.881
SAHH        0.859   0.944   0.871      0.931      0.923
ACE         0.572   0.769   0.804      0.452      0.642
CDK2        0.524   0.752   0.661      0.769      0.510
COX1        0.574   0.758   0.652      0.716      0.581
PR          0.591   0.797   0.752      0.741      0.676
THROMBIN    0.623   0.806   0.426      0.808      0.775
VEGFR2      0.598   0.609   0.472      0.556      0.629
p38         0.498   0.825   0.552      0.809      0.753
PDGFRB      0.448   0.742   0.667      0.678      0.579
TRYPSIN     0.207   0.728   0.740      0.639      0.639
COMT        0.371   0.351   0.508      0.674      0.378
ALR2        0.508   0.609   0.520      0.617      0.419
CCR5        0.594   0.866   0.785      0.905      0.630 (B), 0.413 (D), 0.505 (B+D)
Slide labels accompanying the DUD targets: SUCCESSFUL (×3), RANDOM (×6), POOR (×5).