A new paradigm for virtual screening
A Research Councils Basic Technology Research Programme
Background
  Cross-research-council endeavour administered by EPSRC
  Funding for research to create a new technology
  Change the way we do science
  Underpin the future industrial base
Atom based modelling
  QSAR & QSPR
  Almost all modelling techniques are based on atomistic descriptions of molecules
  Although these techniques have been successful over several decades, they have disadvantages:
    poor scaling characteristics
    lack of a solid physical justification, e.g. scoring functions
    interpretation is difficult due to the abstract nature of many descriptors
    tendency to produce high-dimensional models
What is the true dimensionality of chemical space?
  This has been investigated as follows:
  1. Choose 26 descriptors that appear again and again in our QSPR models
  2. Calculate them for the entire Maybridge database
  3. Calculate the principal components (factors)
  4. What is the dimensionality of physical property space, and what are the descriptors?
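The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the descriptor matrix is synthetic stand-in data (the 26 real descriptors and the Maybridge values are not given in the slides), and the function names `pca_eigenvalues` and `n_components_for_variance` are hypothetical. The sorted eigenvalues are what a scree plot displays.

```python
import numpy as np

def pca_eigenvalues(X):
    """Return PCA eigenvalues (descending) and explained-variance fractions.

    X has shape (n_compounds, n_descriptors). Columns are autoscaled, so the
    eigenvalues are those of the descriptor correlation matrix.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = (Z.T @ Z) / (Z.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return eigvals, eigvals / eigvals.sum()

def n_components_for_variance(frac, threshold=0.9):
    """Smallest number of components whose cumulative variance reaches threshold."""
    cumulative = np.cumsum(frac)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(frac))

# Synthetic stand-in for a descriptor table: 500 "compounds", 26 descriptors
# generated from 3 hidden factors plus a little noise (an assumption made
# purely so the example has a known low dimensionality).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 26))
X = latent @ loadings + 0.1 * rng.normal(size=(500, 26))

eigvals, frac = pca_eigenvalues(X)
print(eigvals[:5])
print(n_components_for_variance(frac, 0.9))
```

Plotting `eigvals` against component index gives the scree plot; the point where the curve flattens is one common, if informal, estimate of the effective dimensionality.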
Scree plot of the PC eigenvalues
Physical property space

PC  | Main descriptors | Interpretation
1   | Polarizability, molecular weight, volume, surface area, globularity | Size, shape
2   | Maximum MEP, mean positive and negative MEPs, total variance | Complementary electrostatic surface descriptor
3   | Minimum MEP, mean negative MEP, balance parameter | Complementary electrostatic surface descriptor
4   | Total MEP-derived charges on nitrogens, number of H-bond donors | Complementary hydrogen-bonding descriptor
5   | Total MEP-derived charges on H and O, minimum MEP, number of aromatic rings | Complementary hydrogen-bonding descriptor
6   | Dipole moment, dipolar density | Dipolar polarity
7-9 | Total MEP charges on different types of atom | Chemical diversity
Improved molecular modelling?
  Can we define a more parsimonious and explicit description of molecules than has so far been achieved using atomistic models?
  leading to better prediction AND a clearer understanding of the properties of molecules and how they arise
A non-atom based approach
  We are developing an alternative approach in which molecules are described by their surfaces
  [Figure: Benzodiazepine analogues]
A non-atom based approach
  The approach is based on calculation of a set of local properties at or near the molecular surface:
    the local molecular electrostatic potential (MEP)
    the local ionisation energy (LIE, IEL)
    the local electron affinity (LEA, EAL)
    the local polarisability (LP, αL)
The local surface properties
Molecular Electrostatic Potential
n = number of atoms in molecule
Zi = nuclear charge of atom i located at Ri
ρ(r) = electron density function
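The equation these symbols belong to did not survive extraction. The standard definition of the molecular electrostatic potential, written with the same symbols, is shown here as a reconstruction rather than a copy of the slide (V and the integration variable r' are notation introduced for this sketch):

```latex
V(\mathbf{r}) \;=\; \sum_{i=1}^{n} \frac{Z_i}{\lvert \mathbf{R}_i - \mathbf{r} \rvert}
\;-\; \int \frac{\rho(\mathbf{r}')}{\lvert \mathbf{r}' - \mathbf{r} \rvert}\, d\mathbf{r}'
```

The first term is the repulsion a unit positive test charge at r feels from the nuclei; the second is its attraction to the electron density.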
Local Polarizability
Density due to a singly occupied atomic orbital j
Coulson population of atomic orbital j
Mean polarizability calculated for atomic orbital j
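The formula that these three labels annotate is missing from the extracted text. One form consistent with the labels is a density-weighted average of atomic-orbital polarizabilities; it is given here as an assumption, with the symbols ρ¹ⱼ (density due to a singly occupied atomic orbital j), qⱼ (Coulson population) and ᾱⱼ (mean polarizability of orbital j) chosen for this sketch:

```latex
\alpha_L(\mathbf{r}) \;=\;
\frac{\sum_{j} \rho_j^{1}(\mathbf{r})\, q_j\, \bar{\alpha}_j}
     {\sum_{j} \rho_j^{1}(\mathbf{r})\, q_j}
```

The sums run over all atomic orbitals j of the molecule.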
Local electron affinity - EAL
Local Ionization Energy
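The defining equations for these two properties are also absent from the extracted text. Commonly used orbital-based definitions are sketched below as an assumption about what the slides showed; ρᵢ(r) is the electron density of molecular orbital i at r and εᵢ is its orbital energy, both symbols introduced here:

```latex
IE_L(\mathbf{r}) \;=\;
\frac{-\sum_{i=1}^{\mathrm{HOMO}} \rho_i(\mathbf{r})\, \varepsilon_i}
     {\sum_{i=1}^{\mathrm{HOMO}} \rho_i(\mathbf{r})}
\qquad
EA_L(\mathbf{r}) \;=\;
\frac{-\sum_{i=\mathrm{LUMO}}^{N_{\mathrm{orb}}} \rho_i(\mathbf{r})\, \varepsilon_i}
     {\sum_{i=\mathrm{LUMO}}^{N_{\mathrm{orb}}} \rho_i(\mathbf{r})}
```

In this form the local ionisation energy averages over occupied orbitals and the local electron affinity over virtual orbitals, each weighted by the orbital density at the surface point.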
Calculation of the surface properties
  Molecules defined as isodensity surfaces
    using semi-empirical AM1 electron density
  can also be defined using a shrink-wrap or a marching cube algorithm
  Fitted to a spherical harmonic expansion
    the shape of the shrink-wrapped surface, or
    the four local properties: MEP, LIE, LEA & LP
Describing surface shape: spherical harmonic expansion
  The accuracy of the surface description is a function of the order N of the expansion
  The greater N, the larger the computational penalty
Advantages of this approach
  This gives a completely analytical description of the molecule's shape & the 4 local properties
    intermolecular binding properties & chemical reactivity
  Spherical harmonics can be truncated at low orders for fast QSAR scans (HTS), fast superposition of molecules & rapid calculation of similarity indices
    for ligands (MW < 750), N = 6-8
    for peptides & proteins (MW > 5,000), N = 25-30
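To make the cost of the expansion order concrete: a spherical harmonic expansion truncated at order N, r(θ, φ) = sum over l = 0..N and m = -l..l of c_lm Y_lm(θ, φ), has (N + 1)^2 coefficients per expanded quantity. The short sketch below tabulates that count for the orders quoted above; the function name `n_coefficients` is hypothetical.

```python
def n_coefficients(order):
    """Number of (l, m) terms in an expansion truncated at the given order.

    Each degree l contributes 2*l + 1 terms, and summing over l = 0..order
    gives (order + 1) ** 2.
    """
    return sum(2 * l + 1 for l in range(order + 1))

for order in (6, 8, 25, 30):
    print(order, n_coefficients(order))
```

The count depends only on N, not on the number of atoms, which is why low-order expansions suit fast scans and higher orders are reserved for larger systems.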
Putative resolutions for in silico screening
  For ligands N=6
  For receptors N=25
[Figure: MEP & LIE surfaces; panels labelled MEP and IEL]
Application to QSAR & QSPR
  Several classes of QSAR/QSPR descriptors can be derived from the local properties, including:
    the spherical harmonics coefficients for constant order N
      the number of coefficients is independent of the number of atoms in a molecule
    the critical points for each surface property
      maxima, minima & saddle points
    the distribution of field intensities at the molecular surface
      four fields with local intensities varying between molecules
      sample using grid points?
    the surface integrals for each field
Public domain datasets
  Small
    Consensus Set of 74 Drug Molecules (diverse)
    QSAR set (31 CoMFA steroids)
  Medium
    WDI subset (2,400 compounds)
    Harvard Chembank dataset (2,000 compounds)
  Large
    WDI (50,000)
    Maybridge (50,000)
Small molecule showing tessellated surface
An example grid of surface points
  A grid is placed on this molecular surface in order to reduce the number of surface points from 4038 to 55
Gradient flows & molecular surface property graphs
  Characterize the behaviour of a property f : S → R on a molecular surface S, in terms of a directed graph G on S derived from the gradient vector field dx/dt = grad f(x)
  The molecular surface property graph G is defined by
    Vertices (G) = fixed points of grad f = critical points of f
    Edges (G) = stable and unstable manifolds of the saddle points
Example molecule: Allopurinol
Allopurinol RGB Surfaces
  LIE encoded on Red channel
  LEA encoded on Green channel
  LP or MEP encoded on Blue channel
Critical points of allopurinol
  8 maxima
  7 minima
  13 saddles
  No. of maxima - no. of saddles + no. of minima = Euler characteristic χ(S) = 2
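The critical-point counts can be checked against the topological constraint for a closed surface that is topologically a sphere: maxima minus saddles plus minima must equal the Euler characteristic, 2. A minimal sketch of that consistency check follows (the function name is hypothetical):

```python
def euler_characteristic(n_maxima, n_saddles, n_minima):
    """Alternating sum of critical-point counts for a property on a closed surface."""
    return n_maxima - n_saddles + n_minima

# Counts quoted for allopurinol: 8 maxima, 13 saddles, 7 minima.
chi = euler_characteristic(8, 13, 7)
print(chi)
```

A result other than 2 for a sphere-like surface would indicate that some critical point was missed or counted twice, so the relation doubles as a sanity check on the critical-point search.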
Distribution based descriptors
  34 descriptors were measured, including
    maximum field intensity
    minimum field intensity
    mean field intensity
    range of field intensities
    variance of field intensities
  The principal components of the descriptors were calculated to provide a set of orthogonal descriptors derived from the local properties at the molecular surface
Distribution of Allopurinol Local Properties
Other distribution based descriptors
  Moments
    1st: Mean
    2nd: Variance
    3rd: Skewness
    4th: Kurtosis
    > 4th: Higher moments as required
  Overlapping Gaussians
    Kernel density procedure
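As an illustration of the distribution-based descriptors, the sketch below computes the extremes, range and first four moments for one field sampled at a set of surface points. It is a sketch under assumptions: the input is any 1-D array of field values (no real surface data is used), population (not sample) moments are used, and the function name `distribution_descriptors` is hypothetical.

```python
import numpy as np

def distribution_descriptors(values):
    """Summary descriptors of one local property sampled over a surface.

    Uses population moments; skewness and kurtosis are the standardised
    third and fourth central moments (kurtosis is NOT excess kurtosis).
    A constant field has zero variance and gives NaN for the last two.
    """
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    centred = v - mean
    variance = np.mean(centred ** 2)
    std = np.sqrt(variance)
    return {
        "max": float(v.max()),
        "min": float(v.min()),
        "range": float(v.max() - v.min()),
        "mean": float(mean),
        "variance": float(variance),
        "skewness": float(np.mean(centred ** 3) / std ** 3),
        "kurtosis": float(np.mean(centred ** 4) / std ** 4),
    }

desc = distribution_descriptors([1.0, 2.0, 3.0, 4.0, 5.0])
print(desc)
```

Applying the same function to each of the four local properties yields a fixed-length descriptor vector per molecule, whatever the number of surface points.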
Correlation Matrix for properties of allopurinol
Correlations of Local Properties: Maybridge db

    | MEP   | LIE  | LEA  | LP
MEP | 1     |      |      |
LIE | 0.15  | 1    |      |
LEA | -0.12 | 0.18 | 1    |
LP  | 0.29  | 0.19 | 0.51 | 1
QSPR & QSAR models
  Models derived from Local Properties
    Drug Likeness
      SOMs trained on WDI (drugs) & Maybridge (general)
      Parameters from PC of Local Property Descriptors
      Medium sized datasets superimposed on SOMs
    Surface Integral Model for Solvation Energy
      RMS Error ~ 0.75 kcal
Physical-Property Mapping
  Maybridge used as the chemistry dataset
  Use the top six principal components to train a 100 × 100 Kohonen net (unsupervised training)
  2,105 compounds selected from the World Drug Index as real drugs used as the drug dataset
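The unsupervised training step can be illustrated with a minimal Kohonen self-organising map in NumPy. This is a generic textbook SOM, not the code used in the study: the learning-rate and neighbourhood schedules, the small 8 × 8 grid (instead of 100 × 100) and the random six-column input standing in for principal-component scores are all assumptions made to keep the example short and fast.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Grid index (row, col) of the node whose weight vector is closest to x."""
    d2 = ((weights - x) ** 2).sum(axis=-1)
    return np.unravel_index(np.argmin(d2), d2.shape)

def train_som(data, rows=8, cols=8, n_iter=2000, lr0=0.5, seed=0):
    """Train a rows x cols Kohonen map on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_samples, n_features = data.shape
    # Initialise node weights from randomly chosen training samples.
    picks = rng.integers(0, n_samples, size=rows * cols)
    weights = data[picks].reshape(rows, cols, n_features).copy()
    # Integer grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        x = data[rng.integers(n_samples)]
        bmu = np.array(best_matching_unit(weights, x))
        grid_d2 = ((grid - bmu) ** 2).sum(axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
        weights += lr * h[..., None] * (x - weights)
    return weights

# Stand-in for six principal-component scores per compound.
rng = np.random.default_rng(1)
scores = rng.normal(size=(300, 6))
som = train_som(scores)
cell = best_matching_unit(som, scores[0])
print(som.shape, cell)
```

Once trained on the chemistry dataset, a second dataset can be projected onto the map by calling `best_matching_unit` for each compound and counting hits per cell, which is how a drug set could be overlaid on the map.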
Physical Property Map: chemistry (train Kohonen net)
Physical Property Map: Drugs
Physical Property Map: steroid hormones
Surface-integral free energies
  Critical for scoring functions, which otherwise use the force-field intermolecular energies
  Provide an attractive alternative to descriptor-plus-interpolation QSPR models
  Solvation, lattice energies?, vapour pressures, partition coefficients?, solubilities? ...
Surface-integral models
  P = target property
  Ai = area of triangle i
  ntri = number of triangles
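The model equation itself is missing from the extracted text. Given the symbols defined, a natural reading is a sum over surface triangles of some function of the local properties times the triangle area, P = sum over i = 1..ntri of f(properties at triangle i) × Ai. The sketch below implements that sum; the form of f is not given in the slides, so the linear function and its coefficients here are purely hypothetical placeholders.

```python
import numpy as np

def surface_integral(local_props, areas, f):
    """Approximate P = sum_i f(props_i) * A_i over a triangulated surface.

    local_props: (ntri, n_props) array, one row of local properties per triangle.
    areas:       (ntri,) array of triangle areas A_i.
    f:           callable mapping the (ntri, n_props) array to (ntri,) values.
    """
    local_props = np.asarray(local_props, dtype=float)
    areas = np.asarray(areas, dtype=float)
    return float(np.sum(f(local_props) * areas))

def linear_f(props, coeffs=(0.1, -0.02, 0.03, 0.5)):
    """Hypothetical integrand: a linear combination of MEP, LIE, LEA and LP."""
    return props @ np.asarray(coeffs, dtype=float)

# Two triangles with made-up property values and areas.
props = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]
areas = [2.0, 3.0]
P = surface_integral(props, areas, linear_f)
total_area = surface_integral(props, areas, lambda p: np.ones(len(p)))
print(P, total_area)
```

In a fitted model the parameters of f would be regressed against experimental values of the target property; setting f to 1 recovers the total surface area, which is a convenient check on the triangulation.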
Free energies & enthalpies of hydration, free energies of solvation for n-octanol & chloroform
Pattern matching on molecular surfaces
  Can we recognise similar surfaces?
  Can we recognise similar surface fragments?
  Can we identify the most similar surface to our target?
  How do we compare field descriptors on the molecular surface?
Surface comparison
  Two different approaches:
    Using spherical harmonic molecular surfaces [J. Comp. Chem. 20(4) 383-395; Ritchie and Kemp 1999; University of Aberdeen].
    Partial molecular alignment via local structure analysis [J. Chem. Inf. Comput. Sci. 40(2) 503-512; Robinson, Lyne and Richards 2000; University of Oxford].
Voting pairs provide possible local alignments
  Try all possible voting pairs to produce a large number of alignments.
  The choice of voting pairs can have a critical effect on the quality of the surface alignment.
Example alignments (panels 1-4)
Pattern matching of surface properties: RMSD = 0.75 (panels A and B)
ParaSurf v1.0
  Surfaces
    Isodensity Surfaces
    Shrink Wrap
    Marching Cube
    Surfaces fit to Spherical Harmonics
  Properties
    MEP, LIE, LEA and LP
    Encoded at points on the surface
    Encoded as Spherical Harmonic Expansions
GRID Computing
  ParaSurf compiled on
    SGI IRIX
    Windows
    Linux (SUSE)
    IBM AIX
  Future Platforms
    SUN Solaris
  GRID enabling at Portsmouth, Southampton and Oxford.
Provisional Timings
  SGI R10k, 256MB
    VAMP ~ 30s/compound
    ParaSurf ~ 10s/compound
  Intel 1.8 Xeon / AMD Athlon XP-2000+
    ParaSurf ~ 2s/compound
  SGI FUEL Workstation R14K
    ParaSurf ~ 2s/compound
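These per-compound timings translate directly into wall-clock estimates for a whole database. The sketch below does the arithmetic for a 50,000-compound set such as those listed earlier; the `n_cpus` parameter assumes perfect parallel scaling, which is an idealisation introduced here and not a measured result.

```python
def screening_hours(n_compounds, seconds_per_compound, n_cpus=1):
    """Estimated wall-clock hours, assuming ideal scaling across n_cpus."""
    return n_compounds * seconds_per_compound / n_cpus / 3600.0

# ParaSurf at ~2 s/compound and ~10 s/compound on a 50,000-compound database.
fast = screening_hours(50_000, 2)
slow = screening_hours(50_000, 10)
print(round(fast, 1), round(slow, 1))
```

At about 2 s per compound a single processor would need a little over a day for 50,000 compounds, which gives a sense of why distributing the calculation over a grid is attractive.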
Summary
  Compound screening
  Spherical harmonic representation
  Aberdeen
Conclusions
  Properties can be calculated at the surface of molecules
  These properties can be RGB encoded
  The properties are local
  Descriptor sets derived from these properties can be used for robust QSPR & QSAR models
  The algorithms will soon be available commercially for use in virtual high throughput screening
ParaSurf in silico Screening Technology
  Basic Technology Funding for October 2003 to September 2004
    Proof of concept studies
    Consortia building, networking
  Academic partners
    University of Portsmouth
    University of Erlangen
    University of Southampton
    University of Aberdeen
    University of Oxford