

Page 1

Accelerating Virtual High-Throughput Ligand Docking: Screening One Million Compounds Using a Petascale Supercomputer

Sally R. Ellingson, PhD Candidate

Department of Genome Science and Technology, University of Tennessee

Center for Molecular Biophysics, UT/ORNL

Advisor: Dr. Jerome Baudry

2012 Emerging Computational Methods for the Life Sciences Workshop (In Conjunction with HPDC12 Delft, Netherlands)

Page 2

Outline

• What is virtual molecular docking?

• What is the importance of virtual high-throughput screening?

• Autodock4 and Autodock4.lga.MPI

▫ Implementation details

▫ Case study: million compound screen

• What is the importance of multi-protein docking?

▫ Limitations with current screening software

▫ Future opportunities using Autodock Vina

Page 3

What is virtual molecular docking?

• Predicts conformation of a protein-ligand complex

• Predicts binding affinity of the ligand to the protein

Diller, D. J. and Merz, K. M. (2001), High throughput docking for library design and library prioritization. Proteins, 43: 113–124.

(+) Reproduce correct bound conformation

(+) Assign better scores to high-affinity ligands than to decoys (enrichment)

(-) Generate scores that correlate with measured binding affinities

Page 4

Why is virtual docking important in novel drug discovery?

• Many medications act by binding and inhibiting a specific target

• Early-stage drug discovery consists of identifying ligands that bind to specific proteins with high affinity and retain favorable pharmacological properties.

http://www.chemistry-blog.com/2012/01/04/tedtalk-medicine-for-the-99-hes-about-99-wrong/

Page 5

What is the importance of virtual high-throughput screening?

(A) Sally R. Ellingson and Jerome Baudry. High-Throughput Virtual Molecular Docking: Hadoop Implementation of AutoDock4 on a Private Cloud. In Proceedings of the second international workshop on Emerging computational methods for the life sciences (ECMLS '11). ACM, New York, NY, USA, 33-38. DOI=10.1145/1996023.1996028 http://doi.acm.org/10.1145/1996023.1996028.

(B) Sally R. Ellingson, Sivanesan Dakshanamurthy, Milton Brown, Jeremy C. Smith, and Jerome Baudry. Accelerating Virtual High-Throughput Ligand Docking: Screening One Million Compounds Using a Petascale Supercomputer. Proceedings of the third international workshop on Emerging computational methods for the life sciences (ECMLS '12) (accepted)

Page 6

Why is high-throughput virtual screening important in drug discovery?

http://www.chemistry-blog.com/2012/01/04/tedtalk-medicine-for-the-99-hes-about-99-wrong/

Virtual screenings:

- Faster and more cost-efficient

- Allows a larger search space of chemical compounds

- Creates a wider, shorter funnel

Page 7

Autodock4 http://autodock.scripps.edu/

Free, open source docking software developed at The Scripps Research Institute

Conformational Search using Lamarckian Genetic Algorithm

Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R., Hart, W. E., Belew, R. K. and Olson, A. J. (1998), Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem., 19: 1639–1662.
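A Lamarckian genetic algorithm combines the global search of a genetic algorithm with a local search whose result is written back into the individual's genes. The following is a minimal sketch of that idea on a toy one-dimensional energy function. It is not AutoDock4's implementation: `toy_energy`, `local_search`, the population size, and the mutation width are all hypothetical choices made for illustration.

```python
import random


def toy_energy(x):
    # Hypothetical stand-in for a docking energy; its minimum is at x = 2.
    return (x - 2.0) ** 2


def local_search(x, energy, step=0.1, iters=20, rng=random):
    # Random-perturbation descent: keep a move only if it lowers the energy.
    best, best_e = x, energy(x)
    for _ in range(iters):
        cand = best + rng.uniform(-step, step)
        cand_e = energy(cand)
        if cand_e < best_e:
            best, best_e = cand, cand_e
    return best


def lamarckian_ga(energy, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Lamarckian step: the locally optimized value replaces the genotype,
        # so the improvement is inherited by the next generation.
        pop = [local_search(x, energy, rng=rng) for x in pop]
        # Selection: keep the better half of the population.
        pop.sort(key=energy)
        parents = pop[: pop_size // 2]
        # Crossover (average of two parents) followed by Gaussian mutation.
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(0.5 * (a + b) + rng.gauss(0.0, 0.5))
        pop = parents + children
    return min(pop, key=energy)


best = lamarckian_ga(toy_energy)
```

In a real docking run the genotype would encode ligand position, orientation, and torsion angles rather than a single number; the toy function only keeps the control flow visible.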

Page 8

Autodock4 http://autodock.scripps.edu/


Scoring of generated conformations

Huey, R., Morris, G. M., Olson, A. J. and Goodsell, D. S. (2007), A semiempirical free energy force field with charge-based desolvation. J. Comput. Chem., 28: 1145–1152.
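As a reading aid, the general form of the semiempirical force field in the cited Huey et al. (2007) paper can be sketched as below. The symbol names follow common usage (L for ligand, P for protein, W for empirically calibrated weights) and should be checked against the paper before being relied on.

```latex
\begin{align*}
\Delta G &= \left(V^{L-L}_{\text{bound}} - V^{L-L}_{\text{unbound}}\right)
          + \left(V^{P-P}_{\text{bound}} - V^{P-P}_{\text{unbound}}\right)
          + \left(V^{P-L}_{\text{bound}} - V^{P-L}_{\text{unbound}} + \Delta S_{\text{conf}}\right) \\
V &= W_{\text{vdw}} \sum_{i,j} \left(\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}\right)
   + W_{\text{hbond}} \sum_{i,j} E(t) \left(\frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}}\right) \\
  &\quad + W_{\text{elec}} \sum_{i,j} \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}}
   + W_{\text{sol}} \sum_{i,j} \left(S_i V_j + S_j V_i\right) e^{-r_{ij}^{2} / 2\sigma^{2}}
\end{align*}
```

The four pairwise terms are dispersion/repulsion, hydrogen bonding, electrostatics, and the charge-based desolvation named in the paper title; the conformational entropy term scales with the number of rotatable bonds in the ligand.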

Page 9

Autodock4 http://autodock.scripps.edu/


Virtual Docking Process

[Diagram: the receptor PDBQT yields precalculated affinity grids; the grids, the ligand PDBQT, and the docking parameter file are the inputs to AutoDock, which writes a docking log file.]

This process must be done for every ligand in a high-throughput screening
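Because one docking run is needed per ligand, a screening amounts to generating and launching one command per ligand. The sketch below only builds those commands and does not run anything. It assumes the standard `autodock4 -p <parameter file> -l <log file>` invocation and a hypothetical naming scheme in which each ligand has its own `<name>.dpf` file; the ligand names are made up.

```python
def build_docking_commands(ligand_names, exe="autodock4"):
    """Return one autodock4 command (as an argument list) per ligand.

    Assumes each ligand has a docking parameter file named <name>.dpf
    (hypothetical naming) that already points at the precalculated grids.
    """
    commands = []
    for name in ligand_names:
        # -p: docking parameter file (input), -l: docking log file (output)
        commands.append([exe, "-p", f"{name}.dpf", "-l", f"{name}.dlg"])
    return commands


commands = build_docking_commands(["ligand_0001", "ligand_0002"])
for cmd in commands:
    print(" ".join(cmd))
```

Each argument list could be handed to `subprocess.run` on a workstation; at a million ligands, launching and bookkeeping at this granularity is the overhead that a dedicated screening tool is meant to remove.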

Page 10

Autodock4.lga.MPI

Collignon, B., Schulz, R., Smith, J. C. and Baudry, J. (2011), Task-parallel message passing interface implementation of Autodock4 for docking of very large databases of compounds using high-performance super-computers. J. Comput. Chem., 32: 1202–1209.

Main Improvements for Virtual Screening:

- Separation of parameters associated with the screening and individual ligands

- Concatenated binary grid files (HDF5)

- Reduced output size

A high-throughput virtual screening tool

Goal: Develop a virtual screening tool that runs on high-performance supercomputers (MPI)
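In a task-parallel screening each ligand is an independent task, so the main scheduling question is how tasks are handed to processes. The sketch below simulates one plausible policy, in which every worker pulls the next task as soon as it is idle. The policy and the task times are assumptions for illustration; the simulation uses no MPI and does not describe how Autodock4.lga.MPI actually distributes work.

```python
import heapq


def simulate_dynamic_dispatch(task_times, n_workers):
    """Simulate workers that each take the next task as soon as they are idle.

    Returns (makespan, tasks_per_worker).
    """
    # Min-heap of (time at which the worker becomes free, worker id).
    free_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(free_at)
    counts = [0] * n_workers
    for t in task_times:
        now, w = heapq.heappop(free_at)  # earliest idle worker
        counts[w] += 1
        heapq.heappush(free_at, (now + t, w))
    makespan = max(finish for finish, _ in free_at)
    return makespan, counts


# One long docking followed by five short ones (hypothetical run times).
times = [5.0, 1.0, 1.0, 1.0, 1.0, 1.0]
makespan, counts = simulate_dynamic_dispatch(times, n_workers=2)
print(makespan, counts)  # 5.0 [1, 5]
```

With dynamic dispatch the second worker absorbs all the short tasks while the first is busy with the long one, so both finish at the same time. A static even split of three tasks each would instead finish at 7.0.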

Page 11

Autodock4.lga.MPI


A high-throughput virtual screening tool

using 196 CPUs

maps.h5: 19 MB-53 MB → 9.8 MB-28 MB

Page 12

Autodock4.lga.MPI


A high-throughput virtual screening tool

Page 13

Million Compound Screening on a petascale supercomputer

Predocking (file preparation)

Postdocking (analysis)

Workflow controlled by Python scripts. Runs on Lens (analysis cluster - Jaguar)

Tutorial: http://www.bio.utk.edu/baudrylab/autodockmpi.htm

Sally R. Ellingson, Sivanesan Dakshanamurthy, Milton Brown, Jeremy C. Smith, and Jerome Baudry. Accelerating Virtual High-Throughput Ligand Docking: Screening One Million Compounds Using a Petascale Supercomputer. Proceedings of the third international workshop on Emerging computational methods for the life sciences (ECMLS '12) (accepted)

Page 14


Million Compound Screening on a petascale supercomputer

65k processors

[Figure: Million Compound Library. Histogram of # of compounds (axis from 0 to 180,000) versus Rotatable Bonds (Degrees of Freedom) (axis from 1 to 33).]

Page 15

What is the importance of multi-protein docking?

http://www.chemistry-blog.com/2012/01/04/tedtalk-medicine-for-the-99-hes-about-99-wrong/

Multi-protein docking

Many proteins of important function

Drug Candidate

Also for many conformations of the same protein – to model receptor flexibility

Multi-protein docking:

- Determine toxicity and side effects

- Predict failures earlier in the process

- Increase overall success rate

Page 16

Multi-protein docking and limitations with current screening software

Multi-protein docking

Many proteins of important function

Drug Candidate

Autodock4.lga.MPI:

- Separate MPI jobs for each receptor

- Binary grid files for each receptor

What is needed? A tool that allows an increase in the number of receptors used in a screening with a minimal increase in the amount of I/O per docking task

[Diagram: Receptor PDBs and Ligand PDBs feed a multi-protein screening of all combinations.]
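Screening all combinations means the task list is the Cartesian product of receptors and ligands. A minimal sketch, with hypothetical receptor and ligand names, is:

```python
from itertools import product


def enumerate_docking_tasks(receptors, ligands):
    """Return every (receptor, ligand) pair, grouped by receptor."""
    return list(product(receptors, ligands))


tasks = enumerate_docking_tasks(["recA", "recB"], ["lig1", "lig2", "lig3"])
for receptor, ligand in tasks:
    print(receptor, ligand)
```

The task count grows as the number of receptors times the number of ligands, which is why the I/O cost per task matters. `product` iterates the second argument fastest, so consecutive tasks share a receptor; that ordering is one simple way to let a worker reuse receptor data between tasks, offered here as a design suggestion rather than a description of any existing tool.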

Page 17

Autodock Vina: Potential as docking engine for multi-protein screening

• Scoring function: machine-learning approach

• Conformational search: iterated local search global optimizer; each step consists of a mutation and a local optimization, and is accepted according to the Metropolis criterion

Trott, O. and Olson, A. J. (2010), AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem., 31: 455–461. doi: 10.1002/jcc.21334

[Figure: Average time in minutes per complex on two quad-core processors, Autodock4 versus Autodock Vina.]
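The conformational search described on this slide (mutation, local optimization, Metropolis acceptance) can be sketched on a toy problem as follows. This is an illustration of the general iterated-local-search pattern, not Vina's code: `toy_score`, the step-halving `local_optimize`, the temperature, and the mutation width are hypothetical choices.

```python
import math
import random


def toy_score(x):
    # Hypothetical rugged 1-D score with several local minima.
    return 0.1 * x * x + math.sin(3.0 * x)


def local_optimize(x, score, step=0.5, tol=1e-4):
    # Step-halving descent: move while a step improves the score,
    # otherwise shrink the step. Stand-in for a real local optimizer.
    while step > tol:
        if score(x + step) < score(x):
            x += step
        elif score(x - step) < score(x):
            x -= step
        else:
            step *= 0.5
    return x


def iterated_local_search(score, x0, steps=200, temperature=0.5, seed=0):
    rng = random.Random(seed)
    current = local_optimize(x0, score)
    best = current
    for _ in range(steps):
        # Mutation followed by local optimization.
        candidate = local_optimize(current + rng.gauss(0.0, 1.0), score)
        delta = score(candidate) - score(current)
        # Metropolis criterion: always accept improvements, accept
        # worse candidates with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if score(current) < score(best):
            best = current
    return best


best = iterated_local_search(toy_score, x0=4.0)
```

Accepting some uphill moves lets the search leave the local minimum it starts in, while tracking `best` separately guarantees the returned result is never worse than the first locally optimized point.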

Page 18

Autodock Vina: Potential as docking engine for multi-protein screening

• Calculates grid maps efficiently during docking and does not store them on disk

• Result clustering and ranking details hidden (reduced output)

• Limitations removed (e.g., the maximum number of rotatable bonds)

• Already multi-threaded (each docking potentially more efficient)

Page 19

Summary

• High-throughput molecular docking is an important tool to increase the cost and time efficiency of drug discovery

• Current screening tool, Autodock4.lga.MPI, allows for a million compounds to be screened in less than 24 hours

• Future development will focus on using multiple receptors

Page 20

Acknowledgements

• Genome Science and Technology, UT

• Center for Molecular Biophysics, UT/ORNL

▫ Jeremy C. Smith

• SCALE-IT, NSF/IGERT

▫ Scalable Computing and Leading Edge Innovative Technologies

• National Center for Computational Sciences

• Georgetown University

• NIH-CTSA

• ECMLS12 workshop organizers