Page 1

SimBioSys Inc. © 2004 http://www.simbiosys.ca/

Conformational sampling in protein-ligand complex environment

Zsolt Zsoldos SimBioSys Inc., © 2004

Contents:

● Introduction: conformation sampling requirements of scoring
● Statistics on bound ligand conformations
● Search space for exhaustive flexible docking, de novo design
● Comparative evaluation of conformation sampling algorithms

Page 2

Acknowledgement

● A. Peter Johnson
● Val Gillet
● ICAMS, University of Leeds

Page 3

Contents: Introduction

● Introduction
    ● Pose sampling requirements for interactions
    ● Orientation and dihedral precision needed
● Statistics on bound ligand conformations
● Search space size for flexible ligand docking
● Comparative evaluation of conformation sampling algorithms for use in flexible docking, de novo design

Page 4

Pose sampling requirements of good interaction geometry for scoring

❑ H-bond geometry: H-acceptor distance range 1.6Å to 2.2Å, i.e. 1.9Å ± 0.3Å

❑ Hydrophobic contact: carbon-carbon distance range 3.2Å to 4.2Å, i.e. 3.7Å ± 0.5Å

❑ Discretization must be fine enough to differentiate any atom movement ≥ 0.5Å

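[Figure: H-bond geometry with donor D, hydrogen H, acceptor A, lone pair Lp and H-acceptor distance dHA; hydrophobic contact between two carbons C with van der Waals distance dCC]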
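The distance windows on this slide can be written down as a small check. This is only a sketch: the constant and function names are hypothetical, and the windows are taken directly from the ranges quoted above. It illustrates why a 0.5Å shift matters: an H-bond at the ideal 1.9Å leaves the acceptable window when the atom moves by 0.5Å.

```python
# Distance windows quoted on the slide, in Å (names are illustrative).
H_BOND_RANGE = (1.6, 2.2)        # hydrogen to acceptor distance
HYDROPHOBIC_RANGE = (3.2, 4.2)   # carbon to carbon distance


def in_range(distance, bounds):
    """True if the distance lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= distance <= high


def centre_and_tolerance(bounds):
    """Express a (low, high) window as (centre, half-width)."""
    low, high = bounds
    return (low + high) / 2.0, (high - low) / 2.0


ideal_h_bond = 1.9
print(in_range(ideal_h_bond, H_BOND_RANGE))        # inside the window
print(in_range(ideal_h_bond + 0.5, H_BOND_RANGE))  # shifted by 0.5 Å: outside
```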

Page 5

Orientation and dihedral angle precision required

❑ Goal: differentiate any atom movement ≥ 0.5Å

❑ Drug-like molecules can have heavy atoms at 7Å distance from a rotation axis (see figure)

❑ Simple trigonometric calculation: a tangential movement of 0.5Å is caused by a rotation of about 5° at a rotation radius of 7Å

❑ Consequence: orientation and dihedral sampling must be finer than 5°

[Figure: ligand from PDB code 1CX2, with a heavy atom at ~7Å from the rotation axis]
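The trigonometric estimate can be reproduced with the chord relation d = 2·r·sin(θ/2). The sketch below uses hypothetical function names; only the 0.5Å displacement and the 7Å radius come from the slide. The exact angle comes out at roughly 4.1°, in line with the rounded "about 5°" quoted above, and a full 5° rotation at 7Å moves the atom by roughly 0.6Å.

```python
import math


def rotation_for_displacement(displacement, radius):
    """Rotation angle (degrees) whose chord at the given radius equals the displacement."""
    return math.degrees(2.0 * math.asin(displacement / (2.0 * radius)))


def displacement_for_rotation(angle_deg, radius):
    """Chord length travelled by a point at the given radius when rotated by angle_deg."""
    return 2.0 * radius * math.sin(math.radians(angle_deg) / 2.0)


# Values from the slide: 0.5 Å movement, 7 Å from the rotation axis.
print(round(rotation_for_displacement(0.5, 7.0), 2))
print(round(displacement_for_rotation(5.0, 7.0), 2))
```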

Page 6

Contents: Statistics

● Introduction
● Statistics on bound ligand conformations
    ● Dihedral angle distribution statistics
    ● Frequency of low energy conformations
    ● Statistics and examples of high energy dihedrals
● Search space for exhaustive flexible docking
● Comparative evaluation of conformation sampling algorithms for use in flexible docking, de novo design

Page 7

Dihedral angle distribution statistics on bound ligand conformations

Receptor-ligand complexes from RCSB PDB

High resolution: less than 2.5Å

5000 entries

Only ligand dihedrals of rotatable bonds are considered

Page 8

Frequency of low energy conformations in 5000 PDB ligands

❑ All dihedrals of the ligand at energy minimum:
    ❑ 86 (1.7%) with ±5° error
    ❑ 131 (2.6%) with ±10° error
    ❑ 186 (3.7%) with ±15° error

❑ All low energy dihedrals:
    ❑ 108 (2.2%) with ±5° error
    ❑ 211 (4.2%) with ±10° error
    ❑ 315 (6.3%) with ±15° error

PDB code: 1hak

Page 9

Statistics on high energy dihedrals in 5000 bound ligand conformations

❑ High energy, strained torsion occurrence frequency:
    ❑ 1836 (36.9%) with ±5°
    ❑ 2721 (54.7%) with ±10°
    ❑ 3192 (64.2%) with ±15°

❑ Worst possible dihedral:
    ❑ 240 (4.8%) with ±5°
    ❑ 288 (5.8%) with ±10°
    ❑ 339 (6.8%) with ±15°
    ❑ 486 (9.8%) with ±30°

PDB code: 1cle

Page 10

Scatter plot of consecutive dihedrals in bound ligand conformations

Dihedral sequences of three consecutive rotatable bonds, e.g. in -CH2-CH2-CH2-CH2-

The angles are fully scattered and fill the whole range

Page 11

Examples of high energy dihedrals...

Sampling at every 60° for each rotatable bond misses 97% of X-ray ligand conformations by more than 5° error

PDB code: 1c2t
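The "missed by more than 5°" criterion can be sketched as a check of each dihedral against the nearest grid point. The function names are hypothetical, and the grid is assumed to consist of the multiples of 60° (0°, ±60°, ±120°, 180°). The last function is a back-of-envelope estimate under the strong assumption that dihedrals are independent and uniformly distributed; it is not how the 97% figure on the slide was obtained, which comes from the PDB statistics.

```python
def grid_error(angle_deg, step_deg=60.0):
    """Smallest angular distance from angle_deg to the nearest multiple of step_deg."""
    remainder = angle_deg % step_deg
    return min(remainder, step_deg - remainder)


def is_covered(dihedrals_deg, step_deg=60.0, tol_deg=5.0):
    """True if every dihedral lies within tol_deg of some grid point."""
    return all(grid_error(a, step_deg) <= tol_deg for a in dihedrals_deg)


def uniform_hit_fraction(n_bonds, step_deg=60.0, tol_deg=5.0):
    """Fraction of conformations covered if dihedrals were uniform and independent (assumption)."""
    per_bond = min(1.0, 2.0 * tol_deg / step_deg)
    return per_bond ** n_bonds


print(is_covered([62.0, 178.0, -58.0]))  # every angle within 5° of the grid
print(is_covered([62.0, 90.0]))          # 90° is 30° away from the grid
print(uniform_hit_fraction(2))           # idealized estimate for two rotatable bonds
```

Under that idealized assumption, a ligand with only two rotatable bonds would already be covered less than 3% of the time, which shows why a coarse grid misses most bound conformations.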

Page 12

Examples of high energy dihedrals...

PDB code: 1uvs

Page 13

Examples of high energy dihedrals...

PDB code: 1yds

Page 14

Examples of high energy dihedrals...

PDB code: 1kmv

Page 15

Search space size for exhaustive flexible ligand docking

❑ Number of poses to examine with sampling defined:

Translations(0.5Å) × Rotations(5°) × Dihedrals(5°) = 20^3 × 72^3 × 72^n; for n = 6 rotatable bonds => ~2×10^20 poses per ligand

❑ Brute-force evaluation at 2000 poses/s => 3 billion years

❑ Stochastic methods explore a tiny fraction of the space, with no guarantee of finding the best or even any good-enough solution
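The pose count can be checked with a few lines of arithmetic. This is a sketch with hypothetical names. The 72 steps correspond to 360°/5°; the 20 translation steps per axis would correspond to a 10Å box at 0.5Å spacing, which is an inference and is not stated on the slide. Multiplying the factors exactly as written gives about 4×10^20. Counting only 36 steps for one of the three rotation angles (an assumption here, based on one Euler angle spanning 180° rather than 360°) gives about 2×10^20, which matches both the quoted pose count and the 3 billion year estimate.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def pose_count(n_rot_bonds, trans_steps=20, angle_steps=72, third_rot_steps=72):
    """Number of grid poses: translations x rigid-body rotations x dihedrals."""
    translations = trans_steps ** 3
    rotations = angle_steps * angle_steps * third_rot_steps
    dihedrals = angle_steps ** n_rot_bonds
    return translations * rotations * dihedrals


def brute_force_years(poses, poses_per_second=2000):
    """Time in years to evaluate every pose at the given rate."""
    return poses / poses_per_second / SECONDS_PER_YEAR


literal = pose_count(6)                     # 20^3 * 72^3 * 72^6 as written
halved = pose_count(6, third_rot_steps=36)  # assumed 0-180° range for one rotation angle

print(f"{literal:.2e} poses, {brute_force_years(literal):.2e} years")
print(f"{halved:.2e} poses, {brute_force_years(halved):.2e} years")
```

Either way the order of magnitude is the same, and each additional rotatable bond multiplies the count by a further factor of 72.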

Page 16

Contents: Algorithm comparison

● Introduction
● Statistics on bound ligand conformations
● Search space for exhaustive flexible docking
● Comparative evaluation of conformation sampling algorithms for docking, de novo design
    ● Conformational sampling test
    ● Accuracy comparison (RMSD from target)
    ● Speed comparison

Page 17

Conformational sampling test

● The test problem:
    ● A target hydrocarbon chain conformation is generated randomly; the number of carbons is given
    ● The algorithms have to find the target conformation based on the scoring function value, which is simply the RMSD
● Algorithms tested:
    ● Taboo search
    ● Genetic Algorithm
    ● Incremental construction with continuous tweak
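A minimal sketch of the test problem, not of the three algorithms compared in the talk. It assumes a fixed C-C bond length of 1.54Å and a C-C-C angle of 111° (typical values, not given on the slides), builds the chain from its dihedral angles, and scores a trial against the target by plain RMSD without superposition, which is reasonable here because every chain shares the same first three atoms. All names are hypothetical, and `random_search` is only a naive baseline to show how the score would be used.

```python
import math
import random

BOND = 1.54                   # C-C bond length in Å (assumed typical value)
ANGLE = math.radians(111.0)   # C-C-C bond angle (assumed typical value)


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    norm = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def build_chain(dihedrals_deg):
    """Coordinates of a carbon chain with len(dihedrals_deg) + 3 atoms."""
    atoms = [(0.0, 0.0, 0.0),
             (BOND, 0.0, 0.0),
             (BOND - BOND * math.cos(ANGLE), BOND * math.sin(ANGLE), 0.0)]
    for phi_deg in dihedrals_deg:
        a, b, c = atoms[-3:]
        phi = math.radians(phi_deg)
        bc = _unit(_sub(c, b))
        n = _unit(_cross(_sub(b, a), bc))   # normal of the a-b-c plane
        m = _cross(n, bc)                   # completes the local frame
        # Position of the new atom in the local frame (bc, m, n).
        x = -BOND * math.cos(ANGLE)
        y = BOND * math.sin(ANGLE) * math.cos(phi)
        z = BOND * math.sin(ANGLE) * math.sin(phi)
        atoms.append(tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3)))
    return atoms


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long coordinate lists."""
    total = sum(sum((p[i] - q[i]) ** 2 for i in range(3))
                for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def random_dihedrals(n_carbons, rng):
    """One random dihedral per rotatable bond of an n-carbon chain."""
    return [rng.uniform(-180.0, 180.0) for _ in range(n_carbons - 3)]


def random_search(target_coords, n_carbons, iterations, rng):
    """Naive baseline: best RMSD found among purely random conformations."""
    best = float("inf")
    for _ in range(iterations):
        trial = build_chain(random_dihedrals(n_carbons, rng))
        best = min(best, rmsd(trial, target_coords))
    return best


rng = random.Random(0)
target = build_chain(random_dihedrals(8, rng))
print(random_search(target, 8, 1000, rng))
```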

Page 18

Accuracy Comparison of the algorithms: best RMSD found

Number of atoms in chain range: 5-20

Number of iterations: 10,000

Population, taboo list size: 100

10 test runs each

Page 19

Speed comparison of the algorithms

Hardware: 2.6GHz Pentium 4 laptop

Runtime displayed: CPU seconds

Page 20

Summary

❑ Receptor bound conformations of flexible ligands deviate widely from low energy conformations

❑ Strained dihedrals occur with high frequency (37%)

❑ Interaction scoring sets a low error limit (5°)

❑ Sampling a few hundred low energy conformers misses 97% of ligand conformations in X-ray data

❑ Sufficient sampling implies huge search space

❑ Stochastic methods are slow and inaccurate