Biomedical Technology Research Center for Macromolecular Modeling and Bioinformatics, Beckman Institute, University of Illinois at Urbana-Champaign - www.ks.uiuc.edu - GTC 2016
Attacking HIV with Petascale Molecular Dynamics Simulations on Titan and Blue Waters
James Phillips, Beckman Institute, University of Illinois
http://www.ks.uiuc.edu/Research/namd/
NIH Biomedical Technology Research Center for Macromolecular Modeling and Bioinformatics
Achievements Built on People
• Developers of the widely used computational biology software VMD and NAMD
• 250,000 registered VMD users; 80,000 registered NAMD users
• 600 publications (since 1972); over 54,000 citations
• 5 faculty members, 8 developers, 1 systems administrator, 17 postdocs, 46 graduate students, 3 administrative staff
• Research projects include: virus capsids, ribosome, photosynthesis, protein folding, membrane reshaping, animal magnetoreception
• Renewed 2012-2017 with 10.0 score (NIH)
Tajkhorshid, Luthey-Schulten, Stone, Schulten, Phillips, Kale, Mallon
Related talks (to stream)
• S6623 - Advances in NAMD GPU Performance
  – Antti-Pekka Hynninen, Oak Ridge National Laboratory
• S6253 - VMD: Petascale Molecular Visualization and Analysis with Remote Video Streaming
  – John Stone, University of Illinois at Urbana-Champaign
Need for petascale: Simulation follows structural discovery
[Figure: number of atoms (log scale, 10^4 to 10^8) versus year (1986 to 2014) for simulated systems: Lysozyme, ApoA1, ATP Synthase, STMV, Ribosome, HIV capsid]
Structural Route to the HIV-1 Capsid
• 1st TEM (1999): Ganser et al., Science, 1999
• Hexameric tubules: Li et al., Nature, 2000; Byeon et al., Cell, 2009
• 1st tomography (2003): Briggs et al., EMBO J, 2003
• Cryo-ET (2006): Briggs et al., Structure, 2006
• Crystal structures of separated hexamer and pentamer: Pornillos et al., Cell 2009, Nature 2011
• High-res. EM of hexameric tubules, tomography of capsids, all-atom model of capsid by MDFF: Zhao et al., Nature 497: 643-646 (2013)
Capsid shepherds HIV RNA inside the cell
[Figure: HIV life cycle in the host cell, with labels: Virion, Binding, Fusion, Capsid uncoating, Nuclear Import, Integration into the host’s chromatin, Budding]
How is HIV treated today?
• Fusion/Entry inhibitors
• Protease inhibitors
• Reverse Transcription (RNA to DNA) inhibitors
• Integrase inhibitors
Currently no drug targets capsid uncoating or nuclear import!
[Figure label: Host Cell]
HIV capsid contains 186 hexamers, 12 pentamers, 1300+ proteins, 4.2 million atoms
Malleability of HIV-1 CA
Hexamer of hexamers bite angles along chiral axis
Native capsid bite angle distribution
pentamers
hexamers
1300 proteins in different conformations
G. Zhao, et al. Nature 497 (2013)
One-Microsecond Simulation Includes 64 Million Atoms
Key person: Juan Perilla (UIUC)
Capsid acts as an osmotic regulator
Results from 64 M atom, 1 µs molecular dynamics simulation!
Chloride ions permeate through the hexameric center
[Figure labels: A204, E213, K203, I201]
Curvature is regulated by the trimer interface at the atomic level
[Figure panels: HIV-CA wild-type in vitro; A204C mutant in vitro (Peijun Zhang - U. Pittsburgh)]
G. Zhao, et al. Nature 497, 643-646 (2013)
HIV uncoating relies on cell factor Cyclophilin A
F. Diaz-Griffero, Viruses (2011)
Binding of cypA
Infection of nucleus
Binding of E2
Premature uncoating
No infection
cypA binding pattern prevents TRIM binding, but leaves nuclear pore interactions intact
TRIM lattice
Cell Nucleus
Outside of Cell
Simulation reveals how CypA stabilizes capsid
interaction confirmed by NMR
only polarizable force field yields stable bridge interaction
Chemical Detail is Essential to Capsid Function
• Curvature regulated by trimeric interface. Nature, 497, 643-646
• CypA bridges adjacent capsid subunits. Nature Communications, 7, 10714
• Ions permeate through the capsid. In preparation.
• Pfizer PF74 inhibits HIV-1 infection at an early step during infection. Journal of Physical Chemistry Letters, accepted
[Figure labels: A204, E213, K203, I201]
HIV Acknowledgments
• Peijun Zhang, Angela M. Gronenborn: Department of Structural Biology, Center for HIV Protein Interactions, University of Pittsburgh School of Medicine
• Christopher Aiken: Department of Pathology and Immunology, Vanderbilt University School of Medicine
• Juan R. Perilla, Klaus Schulten: Theoretical and Computational Biophysics Group, University of Illinois at Urbana-Champaign
• Laxmikant Kale: Parallel Programming Lab, Dept. of Computer Science, University of Illinois at Urbana-Champaign
Beyond Viruses: Modeling the Bacterial Brain
• “Dock” array models into EM density
• MDFF produces optimal overlap with EM data
[Figure labels: Chemosensory Array; Core-signaling Unit (PDB: 3JA6)]
Cassidy, C. K. et al. "CryoEM and computer simulations reveal a novel kinase conformational switch in bacterial chemotaxis signaling." eLife 4 (2015): e08419.
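As a rough illustration of how MDFF turns an EM map into a steering potential, the sketch below uses the commonly published form in which the potential falls linearly from xi at the density threshold to zero at the density maximum. The toy one-dimensional map, the function names, and the nearest-voxel lookup are assumptions made for illustration; this is not the NAMD or VMD implementation.

```python
def mdff_potential(phi, phi_thr, phi_max, xi=1.0):
    """Per-voxel potential: xi at or below the threshold, 0 at the density maximum."""
    if phi <= phi_thr:
        return xi
    return xi * (1.0 - (phi - phi_thr) / (phi_max - phi_thr))


def em_energy(density, atom_voxels, weights, phi_thr, xi=1.0):
    """Weighted sum of the potential at the voxel each atom currently occupies."""
    phi_max = max(density.values())
    return sum(
        w * mdff_potential(density.get(voxel, 0.0), phi_thr, phi_max, xi)
        for voxel, w in zip(atom_voxels, weights)
    )


# Toy 1-D "map": voxel index -> density value (hypothetical numbers).
density = {0: 0.1, 1: 0.4, 2: 1.0, 3: 0.4, 4: 0.1}

# Two atoms sitting on the density peak versus two atoms out in the noise.
inside = em_energy(density, [2, 2], [1.0, 1.0], phi_thr=0.2)
outside = em_energy(density, [0, 4], [1.0, 1.0], phi_thr=0.2)
```

Atoms on the density peak get a lower energy than atoms outside it, so the gradient of this term pulls the model toward high-density regions while the usual force field keeps its geometry physical.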
Beyond Large: Complete Description of Transport Cycle via Bias-Exchange Umbrella Sampling (Moradi, Tajkhorshid)
[Figure: transport cycle through four states (IF apo, IF bound, OF bound, OF apo), each drawn relative to the "in" and "out" sides; structures labeled apo and +substrate]
Law, et al., Biochemistry 46, 12190 (2007).
M. Moradi, G. Enkavi, and E. Tajkhorshid (2015) Nature Communications, 6:8393.
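The exchange step behind this kind of sampling can be sketched as a Metropolis test between two neighboring umbrella windows. The sketch assumes harmonic biases on a single collective variable, a shared temperature, and the same unbiased potential in both windows (so it cancels); the function names and parameters are hypothetical, and the published protocol is more elaborate.

```python
import math
import random


def harmonic_bias(x, center, k):
    """Umbrella restraint energy for collective-variable value x."""
    return 0.5 * k * (x - center) ** 2


def exchange_probability(x_i, x_j, center_i, center_j, k, beta):
    """Metropolis acceptance for swapping configurations between two windows."""
    before = harmonic_bias(x_i, center_i, k) + harmonic_bias(x_j, center_j, k)
    after = harmonic_bias(x_j, center_i, k) + harmonic_bias(x_i, center_j, k)
    delta = beta * (after - before)
    if delta <= 0.0:
        return 1.0  # the swap lowers the total bias energy: always accept
    return math.exp(-delta)


def attempt_exchange(x_i, x_j, center_i, center_j, k, beta, rng=random):
    """Return the (possibly swapped) pair of collective-variable values."""
    p = exchange_probability(x_i, x_j, center_i, center_j, k, beta)
    if rng.random() < p:
        return x_j, x_i
    return x_i, x_j


# Each replica sits exactly on its own window center: a swap is unfavorable.
p_unfavorable = exchange_probability(0.0, 1.0, 0.0, 1.0, k=10.0, beta=1.0)
# Each replica sits on the other window's center: a swap is always accepted.
p_favorable = exchange_probability(1.0, 0.0, 0.0, 1.0, k=10.0, beta=1.0)
```

Exchanges let a configuration diffuse along the chain of windows, which is what allows many short replicas to cover a full transport cycle.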
Beyond Illinois: Influenza Virus Coat, Amaro Lab, UCSD
http://www.ncsa.illinois.edu/news/story/blue_waters_enables_massive_flu_simulations
GPUs are critical for visualization and analysis
Large-memory GPU-accelerated workstations can be accessed remotely from our facility today, but for future machines they must be embedded at supercomputer centers.
1 Gigabit Network
Compressed Video
Remote Visualization Now
• TACC Stampede supports this today
  – Includes nodes with 1TB memory
  – Not virtualized, allocate full dedicated node
  – New Maverick cluster added
• Blue Waters – no visualization resource
• Titan – Rhea adds large-memory GPU nodes
• NIH Center - using NICE DCV for remote access
NAMD is based on Charm++
Complete info at charmplusplus.org
Charm++ Used by NAMD
• Parallel C++ with data driven objects.
• Asynchronous method invocation.
• Prioritized scheduling of messages/execution.
• Measurement-based load balancing.
• Portable messaging layer.
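To make "measurement-based load balancing" concrete, here is a minimal sketch of a greedy strategy: take the load measured for each object over recent steps and place the heaviest objects first on whichever processor is currently least loaded. This is a generic heuristic written for illustration, with hypothetical names and numbers; it is not Charm++ code.

```python
import heapq


def greedy_rebalance(measured_load, num_procs):
    """Map each object to a processor, heaviest objects first."""
    # Min-heap of (accumulated load, processor id).
    heap = [(0.0, proc) for proc in range(num_procs)]
    heapq.heapify(heap)
    assignment = {}
    for obj, load in sorted(measured_load.items(), key=lambda kv: -kv[1]):
        total, proc = heapq.heappop(heap)  # least-loaded processor so far
        assignment[obj] = proc
        heapq.heappush(heap, (total + load, proc))
    return assignment


def processor_totals(measured_load, assignment, num_procs):
    """Sum the measured load that lands on each processor."""
    totals = [0.0] * num_procs
    for obj, proc in assignment.items():
        totals[proc] += measured_load[obj]
    return totals


# Hypothetical per-object timings measured during earlier timesteps.
measured = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0, "f": 1.0}
assignment = greedy_rebalance(measured, num_procs=2)
totals = processor_totals(measured, assignment, num_procs=2)
```

Because the inputs are measurements rather than cost models, the same loop adapts automatically when the simulated system or the hardware changes.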
NAMD Hybrid Decomposition
Kale et al., J. Comp. Phys. 151:283-312, 1999.
• Spatially decompose data and communication.
• Separate but related work decomposition.
• “Compute objects” facilitate iterative, measurement-based load balancing system.
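The two decompositions can be sketched in a few lines: atoms are binned into cubic spatial cells ("patches" here), and one unit of work (a "compute") is created per patch and per pair of neighboring patches. The sketch assumes a non-periodic box and a patch edge no shorter than the interaction cutoff; all names are hypothetical and the real data structures are far richer.

```python
from itertools import product


def assign_patches(positions, patch_size):
    """Spatial decomposition: bin atom indices by the cell containing them."""
    patches = {}
    for index, (x, y, z) in enumerate(positions):
        key = (int(x // patch_size), int(y // patch_size), int(z // patch_size))
        patches.setdefault(key, []).append(index)
    return patches


def make_computes(patches):
    """Work decomposition: one compute per patch and per neighboring patch pair."""
    computes = []
    for key in patches:
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(k + o for k, o in zip(key, offset))
            # key <= other keeps the self-interaction and each pair exactly once.
            if other in patches and key <= other:
                computes.append((key, other))
    return computes


positions = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (3.5, 0.5, 0.5)]
patches = assign_patches(positions, patch_size=1.0)
computes = make_computes(patches)
```

Since computes are separate objects from the patches they read, they can be moved between processors for load balancing without moving the atom data itself.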
NAMD Overlapping Execution
Phillips et al., SC2002.
Objects are assigned to processors and queued as data arrives.
[Figure label: Offload to GPU]
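A minimal sketch of "queued as data arrives": each work object waits for a known number of input messages, becomes runnable when the last one is delivered, and runnable objects are executed in priority order. The class, method names, and priorities are assumptions for illustration and do not reproduce the Charm++ scheduler.

```python
import heapq


class Scheduler:
    def __init__(self):
        self._waiting = {}  # name -> [inputs still missing, priority, work]
        self._ready = []    # heap of (priority, arrival order, name, work)
        self._arrivals = 0

    def register(self, name, needed_inputs, priority, work):
        """Declare an object that runs once all of its inputs have arrived."""
        self._waiting[name] = [needed_inputs, priority, work]

    def deliver(self, name):
        """Record one incoming message; enqueue the object on the last one."""
        entry = self._waiting[name]
        entry[0] -= 1
        if entry[0] == 0:
            _, priority, work = self._waiting.pop(name)
            heapq.heappush(self._ready, (priority, self._arrivals, name, work))
            self._arrivals += 1

    def run(self):
        """Execute ready objects, lowest priority value first."""
        order = []
        while self._ready:
            _, _, name, work = heapq.heappop(self._ready)
            work()
            order.append(name)
        return order


sched = Scheduler()
sched.register("local forces", needed_inputs=1, priority=1, work=lambda: None)
sched.register("remote forces", needed_inputs=2, priority=0, work=lambda: None)
sched.deliver("local forces")
sched.deliver("remote forces")
sched.deliver("remote forces")
order = sched.run()
```

Here the remote work became ready last but runs first, which is the kind of reordering that lets results needed by other nodes leave as early as possible.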
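Overlapping GPU and CPU with Communication
[Figure: timeline of one timestep across GPU, CPU, and other nodes/processes. The GPU runs the remote force kernel, then the local force kernel; arrows labeled x and f pass between the GPU, the CPU's local and remote work, and other nodes, and the step ends with the update.]
Phillips et al., SC2008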
Phillips et al., SC14
Torus Adaptation
• Job partitioning for multiple copy sampling algorithms
• Mapping NAMD spatial decomposition domains onto machine torus
• Mapping particle-mesh Ewald (PME) electrostatics onto spatial decomposition
Additional Techniques
• Coarsening of PME grid to reduce long-range communication
• Offloading of PME interpolation onto GPUs
• Removal of implicit synchronization in pressure control algorithm
NAMD 2.11 Release (December 2015)
• Improved GPU kernels from Antti-Pekka Hynninen
  – Reciprocal forces - avoid duplicate calculations
  – 30% faster explicit solvent
  – 100% faster implicit solvent
• Improved scaling for GPU-accelerated simulations
  – Stream results from GPU
  – Better CPU-side parallelization of PME
  – Described in GTC 2015 talk
• Asynchronous multi-copy Tcl scripting interface
  – Enables central work queue, workflow-style programming
  – Enables multiplexing replicas, hiding load imbalance
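The "reciprocal forces" item in the release notes above is read here as Newton's third law: evaluate each pair interaction once and apply it with opposite signs to both atoms, instead of evaluating it separately for each atom. The sketch below shows that bookkeeping in one dimension with a toy force law; the names and the force function are hypothetical and this is not the GPU kernel.

```python
def pair_forces(positions, force_fn):
    """Accumulate 1-D forces, evaluating each unordered pair exactly once."""
    n = len(positions)
    forces = [0.0] * n
    evaluations = 0
    for i in range(n):
        for j in range(i + 1, n):
            f = force_fn(positions[i], positions[j])  # force on i due to j
            forces[i] += f
            forces[j] -= f  # equal and opposite force on j
            evaluations += 1
    return forces, evaluations


def toy_spring(x_i, x_j):
    """Hypothetical attractive force law, used only to exercise the loop."""
    return x_j - x_i


forces, evaluations = pair_forces([0.0, 1.0, 3.0], toy_spring)
```

The loop performs n(n-1)/2 force evaluations rather than n(n-1), at the cost of two writes per pair, which on a GPU must be arranged so that concurrent threads do not collide.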
Streaming GPU Results to CPU
• Allows incremental results from a single grid to be processed on CPU before grid finishes on GPU
• Allows merging and prioritizing of remote and local work
• GPU side:
  – Write results to host-mapped memory (also without streaming)
  – __threadfence_system() and __syncthreads()
  – Atomic increment for next output queue location
  – Write result index to output queue
• CPU side:
  – Poll end of output queue (int array) in host memory
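The protocol in the list above can be mimicked on the CPU alone to show the ordering it relies on: write the result, claim the next queue slot atomically, publish the result index, and let the consumer poll slots in order. In this sketch a Python thread stands in for the GPU and a lock stands in for the atomic increment; on a real GPU the memory fence is what guarantees the result is visible before its index is. All names are hypothetical.

```python
import threading
import time

EMPTY = -1  # sentinel meaning "nothing published in this slot yet"


class OutputQueue:
    def __init__(self, capacity):
        self.slots = [EMPTY] * capacity
        self._next = 0
        self._lock = threading.Lock()

    def publish(self, result_index):
        """Claim the next slot (atomic increment), then write the index into it."""
        with self._lock:
            slot = self._next
            self._next += 1
        self.slots[slot] = result_index


def producer(results, queue, work_order):
    """Stand-in for the GPU: finish work items in some order and publish each."""
    for index in work_order:
        results[index] = index * index  # write the result first
        queue.publish(index)            # only then make its index visible


def consumer(results, queue, count):
    """Stand-in for the CPU: poll the end of the queue and process as items land."""
    consumed = []
    for slot in range(count):
        while queue.slots[slot] == EMPTY:
            time.sleep(0)  # yield while polling
        index = queue.slots[slot]
        consumed.append((index, results[index]))
    return consumed


results = [None] * 4
queue = OutputQueue(capacity=4)
worker = threading.Thread(target=producer, args=(results, queue, [2, 0, 3, 1]))
worker.start()
consumed = consumer(results, queue, count=4)
worker.join()
```

The consumer sees results in completion order rather than index order, so it can start on whatever finished first instead of waiting for the whole batch.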
Non-Streaming Kernel
Streaming Kernel
[Figure: execution timeline with labels: Non-bonded local kernel, Non-bonded remote kernel, Results from GPU, Incoming positions, Integration, Bonded on CPU]
Streaming Kernel Performance Impact - GTC 2015
[Figure: performance (ns per day, log scale 4 to 64) versus number of nodes (256 to 4096) for 21M atoms with a 2 fs timestep. Series: Blue Waters XK7 (new streaming), Blue Waters XK7 (no streaming). Annotations: +10%, +30%, +10-30%.]
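To relate the ns-per-day axis to wall-clock cost per timestep, the conversion at the stated 2 fs timestep can be sketched as below. The sample values 4 and 64 ns/day are simply the axis limits of the plot, not measured data points, and the function names are hypothetical.

```python
MS_PER_DAY = 86_400_000.0  # milliseconds in one day


def steps_per_day(ns_per_day, timestep_fs=2.0):
    """Timesteps completed per day (1 ns = 1e6 fs)."""
    return ns_per_day * 1.0e6 / timestep_fs


def ms_per_step(ns_per_day, timestep_fs=2.0):
    """Average wall-clock milliseconds spent on each timestep."""
    return MS_PER_DAY / steps_per_day(ns_per_day, timestep_fs)


def days_for(simulated_ns, ns_per_day):
    """Wall-clock days needed to reach a given simulated time."""
    return simulated_ns / ns_per_day


slow = ms_per_step(4.0)                   # bottom of the axis
fast = ms_per_step(64.0)                  # top of the axis
one_microsecond = days_for(1000.0, 64.0)  # 1 µs at 64 ns/day
```

At the top of the axis every timestep, including all communication, has to complete in under 3 ms, which is why small per-step latencies such as waiting for a whole GPU grid to finish matter at scale.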
Performance Comparison - GTC 2015
[Figure: performance (ns per day, log scale 0.25 to 64) versus number of nodes (256 to 16384) for 21M-atom and 224M-atom systems with a 2 fs timestep. Series: Blue Waters XK7 (GTC15), Titan XK7 (GTC15), Edison XC30 (SC14), Blue Waters XE6 (SC14). Annotations: +50-200%, +100-200%, +70%.]
Topology-Aware Scheduling on Blue Waters
• Map jobs to convex sets to avoid network interference
• NCSA, Cray, Adaptive
• Enabled January 2015
• Most likely explanation for Blue Waters performance advantage over Titan
• See Enos et al., CUG 2014
NAMD 2.11 Performance - GTC 2016
[Figure: performance (ns per day, log scale 0.25 to 64) versus number of nodes (256 to 16384) for 21M-atom and 224M-atom systems with a 2 fs timestep. Series: Blue Waters XK7 (GTC15), Titan XK7 (GTC16), Titan XK7 (GTC15), Edison XC30 (SC14), Blue Waters XE6 (SC14). Annotations: +50-200%, +100-200%, +70%.]
NAMD 2.11 Performance - GTC 2016
[Figure: performance (ns per day, log scale 0.25 to 64) versus number of nodes (256 to 16384) for 21M-atom and 224M-atom systems with a 2 fs timestep. Series: Blue Waters XK7 (GTC16), Titan XK7 (GTC16), Edison XC30 (SC14), Blue Waters XE6 (SC14). Annotations: +50-200%, +100-200%, +70%.]
Looking forward to Summit
• Highest throughput: Volta GPU
• Fastest single-thread: Power 9
• Fastest data transfer: NVLink
• Fewer, fatter nodes: Only 3,400
• Five times Titan performance
• Potential for remote visualization
• “Molecular Machinery of the Brain” early science project
[Figure: Synaptic vesicle and pre-synaptic membrane]
Thanks to: NIH, NSF, ORNL (Antti-Pekka Hynninen), ANL (Brian Radak), NVIDIA (Sarah Tariq, Patric Zhao, Sky Wu, Justin Luitjens, Nikolai Sakharnykh), Cray (Sarah Anderson, Ryan Olson), NCSA (Robert Brunner), PPL (Eric Bohm, Yanhua Sun, Gengbin Zheng, Nikhil Jain), and 20 years of NAMD and Charm++ developers and users.
James Phillips, Beckman Institute, University of Illinois
http://www.ks.uiuc.edu/Research/namd/