
Structure-Based Virtual Screening: New methods, Old Problems and “Ancient” Solutions

Structure-Based Virtual Screening (SVS) is a proven technique for lead discovery
  Still significant room for improvement
  Efforts generally focused on the creation of novel scoring functions

In this presentation
  Present a novel technique for scoring function development
  Highlight problems encountered
  Illustrate the potential of pharmacophore constraints to mitigate some of these issues
  Analyze implications for current and future SVS technology

Scoring Functions: Development Within an SVS algorithm Framework

We are interested in the top ranking molecules from SVS
  Do not care about the nature of the score itself

Alternative strategy - design function around optimization of active molecule rank
  Accomplished using docking data from selected SVS algorithms
  No binding data required
    Complexed ligand should top the rank list
  Allows metrics that describe reasons for lack of binding
    High scoring docked inactives
  Effect of docking algorithm limitations can be better understood
    Optimizing within the framework to be used for SVS calculations
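The idea of judging a scoring function only by where the complexed ligand lands in the rank list can be sketched in a few lines. This is an illustrative sketch, not the code used in the work: it assumes higher scores are better and counts ranks from 0 (0 = top of the list). The function name `rank_of_active` and the molecule identifiers are hypothetical.

```python
from typing import Dict


def rank_of_active(scores: Dict[str, float], active_id: str) -> int:
    """Return how many other molecules score strictly better than the active.

    Assumption: a higher score is better, so a rank of 0 means the
    active (complexed) ligand tops the list.
    """
    active_score = scores[active_id]
    return sum(
        1
        for mol_id, s in scores.items()
        if mol_id != active_id and s > active_score
    )


# Hypothetical docked scores: one active ligand plus two noise molecules.
example_scores = {"active": 5.0, "noise_1": 7.0, "noise_2": 3.0}
example_rank = rank_of_active(example_scores, "active")
```

Only the ordering matters here, which is why no binding data is needed: any rescaling of the score that preserves the order leaves the rank unchanged.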

Data Set Selection: Filtering

Initial set of ~300 complexes extracted from the work of Böhm, Keske and Dixon
  Gschwend et al. J. Mol. Recognit. 1996, 9, 175-186.
  Böhm. J. Comput.-Aided Mol. Design 1994, 8, 243-256.

To ensure diversity and quality, filters applied:
  Discard complexes with: more than 50 heavy atoms / resolution > 2.5 Å / covalently bound / incompletely modeled

Data weighted towards specific targets - many close analogues
  If a general scoring function is required, these need to be filtered
  Initial efforts were set on removing all repeat targets.
    Too drastic - multiple complexes of the same target kept as long as ligand represented a unique chemotype (no analogues)
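The filters listed on this slide can be expressed as a small selection routine. The sketch below is only an illustration under stated assumptions: the `Complex` record and its field names are hypothetical, the heavy-atom limit is assumed to apply to the ligand, and the target and chemotype labels are assumed to have been assigned by hand beforehand.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Complex:
    """Hypothetical record describing one protein-ligand complex."""
    complex_id: str
    ligand_heavy_atoms: int      # assumed: the 50-atom limit refers to the ligand
    resolution: float            # crystallographic resolution in Å
    covalently_bound: bool
    completely_modeled: bool
    target: str                  # manually assigned target label
    chemotype: str               # manually assigned ligand chemotype label


def passes_quality_filters(c: Complex) -> bool:
    """Apply the discard rules: >50 heavy atoms, >2.5 Å, covalent, incomplete."""
    return (
        c.ligand_heavy_atoms <= 50
        and c.resolution <= 2.5
        and not c.covalently_bound
        and c.completely_modeled
    )


def select_complexes(complexes: List[Complex]) -> List[Complex]:
    """Keep quality complexes, allowing repeat targets only for new chemotypes."""
    seen: Set[Tuple[str, str]] = set()
    kept: List[Complex] = []
    for c in complexes:
        if not passes_quality_filters(c):
            continue
        key = (c.target, c.chemotype)
        if key in seen:          # close analogue on the same target: skip
            continue
        seen.add(key)
        kept.append(c)
    return kept
```

Keying the duplicate check on the (target, chemotype) pair rather than on the target alone reflects the relaxed rule above: a second complex of the same target survives as long as its ligand is not an analogue of one already kept.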

Data Set Selection: Unexpected Oddities

Odd interactions brought about by extreme crystallization conditions
  1rnt - acidic crystallization conditions (pH 5.0) produce unusual protonation state

Multiple points of crystal contact with symmetry related molecules
  4gr1 has more interactions with symmetry related protein than deposited structure
  An extreme case, but problem significant enough for inclusion in Relibase:
  http://www.ccdc.cam.ac.uk/news/14_12_01.html

Scoring Function Development: Data Set Selection - the Final Tally

Once all filters applied, > 75% of complexes removed
  Highlights significant problems in generating a clean data set

An under-appreciated problem in scoring function development
  The need to analyze and exploit all available PDB data including the most recently deposited structures
  Requires much manual intervention
    Poster 110 Sadowski et al.
    Poster 251 by Fenu et al.

Final selections - 20 training set and 10 test set complexes

Scoring Function Development: Basic Strategy

DOCK (4.0) ligand data set into each active site (“active” ligand + molecular noise)

Feed all docked orientations into metric generator

Use a GA and stored metric data to simultaneously optimize rank of “active” orientations in each target site

Scoring Function Development: GA Implementation

GA optimizes average rank of the “active” orientations within a data set of docked molecules and targets.

Score = a*metric1 + b*metric2 + ...
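A toy version of this optimization helps make the objective concrete. The sketch below is not the GA used in this work: the population size, crossover and mutation operators, and the convention that the first orientation in each target is the "active" one are all assumptions made for illustration. Only the linear score and the fitness (average rank of the active orientation across targets) come from the slide.

```python
import random
from typing import List, Sequence

Pose = Sequence[float]    # stored metric values for one docked orientation
Target = List[Pose]       # assumed convention: index 0 is the "active" orientation


def score(weights: Sequence[float], pose: Pose) -> float:
    """Linear score: a*metric1 + b*metric2 + ... (higher assumed better)."""
    return sum(w * m for w, m in zip(weights, pose))


def average_active_rank(weights: Sequence[float], targets: List[Target]) -> float:
    """Fitness: mean number of orientations out-scoring the active, per target."""
    total = 0
    for poses in targets:
        active = score(weights, poses[0])
        total += sum(1 for p in poses[1:] if score(weights, p) > active)
    return total / len(targets)


def optimise_weights(targets: List[Target], n_metrics: int,
                     pop_size: int = 30, generations: int = 50,
                     seed: int = 0) -> List[float]:
    """Minimal elitist GA over weight vectors (all settings are illustrative)."""
    rng = random.Random(seed)
    # Start from an equal-weights individual plus random weight vectors.
    pop = [[1.0] * n_metrics] + [
        [rng.uniform(-1.0, 1.0) for _ in range(n_metrics)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=lambda w: average_active_rank(w, targets))
        parents = pop[: pop_size // 2]        # best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by small Gaussian mutation.
            children.append([rng.choice(pair) + rng.gauss(0.0, 0.1)
                             for pair in zip(a, b)])
        pop = parents + children
    return min(pop, key=lambda w: average_active_rank(w, targets))
```

Because the fitness depends only on rank order, the weights are defined up to a positive scale factor, so fitted coefficients are easier to compare after normalization.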

Scoring Function Development: Tests Run

Three primary experiments undertaken:
  Optimize rank using crystallographic ligand orientation (CLO) study
  Replace CLO with orientation produced on reDOCKing ligand binding conformer into target site - closest docked orientation (CDO) study
  Compare results with standard DOCK scoring functions (contact / force field)

“Typical” CDO orientation compared to CLO binding mode for 7est. Heavy atom RMS=1.56Å
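The heavy atom RMS quoted for the CDO versus CLO comparison is a root-mean-square deviation over matched atoms. A minimal sketch follows, assuming both orientations are already in the same receptor frame (so no superposition is done), that atoms are listed in the same order, and that no correction for ligand symmetry is applied. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def heavy_atom_rmsd(pose_a: Sequence[Coord], pose_b: Sequence[Coord]) -> float:
    """RMS deviation (Å) between two orientations of the same ligand.

    Assumes matched heavy-atom ordering and a shared coordinate frame.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))
```

With a function like this, the closest docked orientation is simply the docked pose with the smallest value against the crystallographic coordinates.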

Scoring Function Development: CLO Test

High ranking in both training and test sets (4/24000 - 37/22000)

Clash descriptor scores highly
  CLASH weights against ligand protein bumps
    Rare for CLO
    More common in DOCK orientations.
  Effectively acts as an indicator variable

Training set
Complex   Rank
1phd      33
3cpa      5
9aat      4
1ak3      3
2pk4      3
2tmn      3
1rnt      2
7est      2
1abe      0
1apv      0
1pph      0
1rbp      0
1snc      0
2tsc      0
3gap      0
4dfr      0
4phv      0
4sga      0
6tim      0
7cat      0
Average rank: 4

Test set
Complex   Rank
1dr1      286
1xli      39
1phg      30
2ifb      1
3fx2      1
4mdh      0
5p21      0
5tmn      0
8cpa      0
Average rank: 37

Optimized coefficients (Normalized % Contribution to CLO scores)
Clash            Hyd Surf        Hbond           Electro
-0.816 (-3.7)    0.006 (44.0)    0.574 (42.6)    0.070 (17.1)

Scoring Function Development: CDO Test

4 training set and 1 test set compound unable to dock within 2.0 Å RMS of CLO
  Removed from analysis

Test results look less impressive
  Due to docking inaccuracies
    H bond network breakdown

Clash term importance drops significantly now, as CDO, unlike CLO, often contains bumps

Training set
Complex   Rank
1phd      260
7est      114
1rnt      111
2tmn      83
3cpa      66
9aat      37
2tsc      27
1abe      24
6tim      22
2pk4      16
1ak3      13
4dfr      13
3gap      9
1pph      2
1rbp      1
1snc      1
Average rank: 51

Test set
Complex   Rank
1xli      11785
1dr1      9503
2ifb      848
1phg      99
4mdh      29
5tmn      4
3fx2      2
3tpi      2
8cpa      1
Average rank: 2476

Optimized coefficients (Normalized % Contribution to CDO scores)
Clash           Hyd Surf       Hbond          Electro
-0.20 (-8.9)    0.01 (51.7)    0.97 (37.6)    -0.10 (19.6)

Scoring Function Development: Average Test Set Rank Comparisons

CDO orientations in CLO scoring function: average rank = 19337
CLO orientations in CDO scoring function: average rank = 75
  CDO performance more robust
    Due to a reduction in sensitivity to steric clashes

CDO orientations and DOCK contact score: average rank = 2690
CDO orientations and DOCK force field score: average rank = 16518
  All atom model and R12 repulsion oversensitive to clashes
  Contact score user controlled steric clash penalty permits sensitivity control

Comparison of CDO and contact score shows a slight improvement: average ranks = 2476 / 2690
  H bond/electrostatics adding some additional resolution

Scoring Function Development: Conclusions

Results highlight potential pitfalls in scoring function design
  More robust data sets required (c**p in - c**p out)
  Xtal data performance not necessarily representative of real world SVS

CLO scoring function
  High resolution descriptors are not always compatible with binding modes of 1.0-2.0 Å accuracy often seen at current sampling levels
  H bond network breakdown even with near-hit binding modes

Need to consider alternative scoring metrics
  Lower resolution descriptors / non-binding event measures

Scoring and sampling are not separable problems
  To take scoring functions to the next level need to focus on SVS technology with more exhaustive sampling paradigms
  Additional CPU essential: distributed (grid-based) computing

Exploiting an Old Trick: SVS and Pharmacophore Constraints

Another major scoring function failing
  Inability to differentiate H bond/salt bridge strengths
    H bonds often measured by presence or absence
    Salt bridges, despite their importance, are often ignored

SVS searches are generally undertaken with a binding hypothesis in mind
  Exploitation of known target structural biology
  Scoring functions often struggle to incorporate such information

Pharmacophore constraints provide a sampling-based alternative paradigm to mitigate these issues

Pharmacophoric Constraints: DOCK Chemical Matching and Critical Regions

http://www.cmpharm.ucsf.edu/kuntz/dock4/html/Manual.47.html#pgfId=20180

DOCK permits creation of user defined pharmacophore elements

When combined with critical regions, DOCK can simultaneously undertake 1000’s of binding site constrained pharmacophore searches

Sample acid site point definitions:
  # acyl sulphonamide
  definition O.2 ( C.2 ( N.am ( H ) ( S ( 2 O.2 ) ) ) )
  # deprotonated carboxyl
  definition O.co2 ( C )

Sample Kinase site definition:
  Region 1 + 2: acceptor / donor
  Region 3: Hydrophobic

In-house DOCK pharmacophore types:
  heavy atom
  donor
  acceptor
  hydrophobe
  aromatic
  aromatic_hydrophobic
  acid
  base
  donor_and_acceptor
  special (e.g. metal chelator)
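To make the notion of a critical region concrete, the sketch below checks whether a docked orientation places an atom of a permitted pharmacophore type inside every required region. This is not how DOCK implements chemical matching or critical regions (DOCK applies such constraints during orientation matching); it is a hypothetical post-hoc geometric check. The `Region` and `Atom` records, the spherical region shape, and the radii are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class Region:
    """Hypothetical spherical critical region with allowed pharmacophore types."""
    centre: Coord
    radius: float
    allowed_types: FrozenSet[str]


@dataclass(frozen=True)
class Atom:
    """Ligand atom position plus its assigned pharmacophore type."""
    xyz: Coord
    ptype: str


def satisfies_constraints(atoms: Sequence[Atom], regions: Sequence[Region]) -> bool:
    """True if every region contains at least one atom of an allowed type."""
    return all(
        any(
            atom.ptype in region.allowed_types
            and math.dist(atom.xyz, region.centre) <= region.radius
            for atom in atoms
        )
        for region in regions
    )


# Illustrative two-region site: one donor/acceptor region, one hydrophobic region.
example_site = [
    Region((0.0, 0.0, 0.0), 1.5, frozenset({"donor", "acceptor"})),
    Region((5.0, 0.0, 0.0), 2.0, frozenset({"hydrophobe"})),
]
```

Orientations that fail such a check never need to be scored, which is one way to see why constraints can both speed up a search and keep poses in regions consistent with the binding hypothesis.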

Pharmacophoric Constraints: Comparison Test Sets

5 targets analyzed
  ~10000 noise molecules plus active compound data set docked into each active site

Enrichment analysis based on chemotype rather than headline hit rate to prevent active analogue bias

Target: Serine protease 1
  Active chemotype definitions: P1 substituent / P1-P4 linker substituent
  Defined critical regions (associated pharmacophore type(s)): S1 sub site (base or hydrophobe); S4 sub site (hydrophobe)

Target: Serine protease 2
  Active chemotype definitions: P1 substituent / P1-P4 linker substituent
  Defined critical regions (associated pharmacophore type(s)): S1 sub site (base); S4 sub site (hydrophobe)

Target: Fatty acid binding protein 1
  Active chemotype definitions: Core linking acid moiety to remaining substituents
  Defined critical regions (associated pharmacophore type(s)): Acid binding sub site (acid); rear hydrophobic pocket (hydrophobe)

Target: Fatty acid binding protein 2
  Active chemotype definitions: Core linking acid moiety to remaining substituents
  Defined critical regions (associated pharmacophore type(s)): Acid binding sub site (acid)

Target: Kinase
  Active chemotype definitions: Moiety mimicking adenine / main core of molecules
  Defined critical regions (associated pharmacophore type(s)): Adenine hydrogen bonding regions (donor/acceptor); rear hydrophobic pocket (hydrophobe)
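Counting chemotypes rather than individual hits, as this slide describes, can be sketched as follows. The function and the identifiers are hypothetical, and the mapping from molecule to chemotype is assumed to have been assigned beforehand; noise molecules simply have no entry.

```python
from typing import Dict, Sequence


def chemotypes_found(ranked_ids: Sequence[str],
                     chemotype_of: Dict[str, str],
                     top_n: int) -> int:
    """Number of distinct active chemotypes among the top_n ranked molecules.

    ranked_ids is assumed to be sorted best-first; molecules missing from
    chemotype_of are treated as noise.
    """
    found = set()
    for mol_id in ranked_ids[:top_n]:
        chemotype = chemotype_of.get(mol_id)
        if chemotype is not None:
            found.add(chemotype)
    return len(found)


# Hypothetical ranked list: a1 and a2 are analogues sharing chemotype "X".
example_ranking = ["a1", "n1", "a2", "n2", "a3"]
example_chemotypes = {"a1": "X", "a2": "X", "a3": "Y"}
```

In this example the top three molecules contain two actives but only one chemotype, so a headline hit rate would overstate the diversity recovered. That is the analogue bias the chemotype count is meant to avoid.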

Averaged Chemotype Enrichments

Constrained contact search enrichment stands out
  Force field performance limited by aforementioned over-sensitivity to steric clashes

[Figure: chemotypes found (0-7) versus compound rank (100-500) for four searches: generic force field, constrained force field, generic contact, constrained contact]

Searches Across Different SVS Paradigms: Kinase Pocket

Performance improves as scoring function simplified
  Prometheus in particular led astray by spurious H bonds

Flexible site / inactivated form
  Challenging target

Constrained contact score performs best
  Unable to implement constraints in Prometheus and GOLD

[Figure: Kinase (14 chemotypes total), chemotypes hit (0-8) versus compound rank (100-500)]

Pharmacophoric Constraints: Conclusions

Pharmacophores offer numerous attractive features in SVS
  Improved hit rates
  Binding orientations constrained by user hypotheses to biologically relevant regions of space
    Known structural biology
  For algorithms such as DOCK, large increases in search speed (typically 1-2 orders of magnitude)

Simple scoring functions still have a role to play in SVS
  More tolerance to errors in binding mode and limitations in active site resolution

Acknowledgements

Thank you to

GA scoring function design
  Ryan Smith
  Dan Gschwend
  Andrew Leach
  Rod Hubbard

Pharmacophore searching
  Tim Perkins
  Dan Cheney
  Doree Sitkoff
  John Tokarski
  Yi Li

Jonathan Mason and all my other BMS colleagues past and present

TEC / GA source available to all interested parties: [email protected] (@bms.com)

Searches Across Different SVS Paradigms: Generic vs Constrained(*) Searches

FAB protein 2 well defined rigid pocket
  Good SVS target

All methods perform well
  High percentage of chemotypes found

In all cases constrained search outperforms its generic equivalent