Efficient and Accurate Multiscale Reactive Molecular Simulations
Chris Knight (ANL/CI)
PI: Gregory A. Voth (Univ. Chicago/ANL)

Source: Rice University, cscads.rice.edu/chris_knight.pdf




A) Project Overview
• The project
  – Implementation of multiple program algorithms for multistate simulations.
• Science goals
  – Fundamental understanding of charge transport processes and multiscale phenomena
• The participants, description of team
  – Graduate students and postdocs at U. Chicago and ANL.
• History
  – Reactive methodology in development since early 90’s
  – Current code is about 3 years old.
• Goals
  – Scale a single simulation (~1 million atoms) to leadership scale


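B) Science Lesson
• Construct linear combination of chemical bond topologies
• Integrate Newton’s EOM

$$
H = \begin{bmatrix}
V_{11} & V_{12} & V_{13} & V_{14} \\
V_{12} & V_{22} & 0 & 0 \\
V_{13} & 0 & V_{33} & 0 \\
V_{14} & 0 & 0 & V_{44}
\end{bmatrix}
$$

$$
F_j = \sum_{mn} c_m c_n F_j^{mn}
$$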
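The two steps on this slide can be illustrated with a minimal sketch, which is not taken from the RAPTOR code. It assumes that the coefficients c_m are the components of the lowest eigenvector of H and that F_j^{mn} is the force on atom j from matrix element (m, n). The function names, the numerical values in H and F, and the use of shifted power iteration instead of a library eigensolver are all assumptions made for illustration.

```python
import math

def lowest_eigenpair(H, iters=500):
    """Lowest eigenvalue and eigenvector of a small symmetric matrix H.

    Runs power iteration on (sigma*I - H), where sigma is a Gershgorin
    upper bound on the largest eigenvalue of H, so the dominant
    eigenvector of the shifted matrix is the ground state of H.
    Assumes the uniform starting guess overlaps the ground state.
    """
    n = len(H)
    sigma = max(H[i][i] + sum(abs(H[i][j]) for j in range(n) if j != i)
                for i in range(n))
    c = [1.0 / math.sqrt(n)] * n  # uniform starting guess
    for _ in range(iters):
        w = [sigma * c[i] - sum(H[i][j] * c[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        c = [x / norm for x in w]
    energy = sum(c[m] * H[m][k] * c[k] for m in range(n) for k in range(n))
    return energy, c

def mix_forces(c, F):
    """F_j = sum_{mn} c_m c_n F_j^{mn} for a single atom j.

    F[m][n] is the 3-vector force on atom j from matrix element (m, n).
    """
    n = len(c)
    return [sum(c[m] * c[k] * F[m][k][d] for m in range(n) for k in range(n))
            for d in range(3)]

# Hypothetical 4-state Hamiltonian with the sparsity pattern shown above
# (state 1 couples to states 2-4); the numbers are made up.
H = [[-10.0, -2.0, -2.0, -2.0],
     [-2.0, -5.0, 0.0, 0.0],
     [-2.0, 0.0, -4.0, 0.0],
     [-2.0, 0.0, 0.0, -3.0]]
energy, c = lowest_eigenpair(H)

# Made-up per-element forces on one atom: only two diagonal terms are nonzero.
F = [[[0.0, 0.0, 0.0] for _ in range(4)] for _ in range(4)]
F[0][0] = [1.0, 0.0, 0.0]
F[1][1] = [0.0, 1.0, 0.0]
force = mix_forces(c, F)
```

Each F_j^{mn} term can be evaluated independently of the others, which is presumably why evaluating matrix elements concurrently is attractive.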


C) Parallel Programming Model
• LAMMPS & RAPTOR (developed in our group)
  – http://lammps.sandia.gov/
• Hybrid MPI + OpenMP (now w/ a flavor of multiple program)
  – Also supports GPU (CUDA and OpenCL)
• Highly portable C++ and only requires FFTW
• What platforms does the application currently run on?
  – All popular platforms.
• Current status & future plans for the programming model
  – Focusing heavily on multiple program parallelization


D) Computational Methods
• Spatial domain-decomposition
• Rely heavily on FFTs for calculation of electrostatic interactions.
• We are currently working on extending to N > 2 partitions.
• Extensions to multiple reactive species (multiple Hamiltonian matrices).

$$
H = \begin{bmatrix}
V_{11} & V_{12} & V_{13} & V_{14} \\
V_{12} & V_{22} & 0 & 0 \\
V_{13} & 0 & V_{33} & 0 \\
V_{14} & 0 & 0 & V_{44}
\end{bmatrix}
$$
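A rough sketch of how ranks and matrix elements might be divided among N partitions is given below. It is an illustration only and does not reproduce the scheme used in RAPTOR or LAMMPS. The helper names `partition_color` and `assign_elements`, the contiguous-block layout, and the round-robin distribution are all assumptions. The returned partition index plays the role that a color argument would play in a call such as `MPI_Comm_split`.

```python
def partition_color(rank, world_size, n_partitions):
    """Partition index ("color") for a rank, using contiguous blocks.

    Block sizes differ by at most one rank when world_size is not
    divisible by n_partitions.
    """
    if not 1 <= n_partitions <= world_size:
        raise ValueError("need 1 <= n_partitions <= world_size")
    if not 0 <= rank < world_size:
        raise ValueError("rank out of range")
    return rank * n_partitions // world_size

def assign_elements(elements, n_partitions):
    """Round-robin assignment of matrix elements to partitions."""
    work = [[] for _ in range(n_partitions)]
    for idx, element in enumerate(elements):
        work[idx % n_partitions].append(element)
    return work

# Nonzero upper-triangle elements of the 4-state H shown above (0-based indices).
elements = [(0, 0), (1, 1), (2, 2), (3, 3), (0, 1), (0, 2), (0, 3)]

colors = [partition_color(r, 10, 3) for r in range(10)]
work = assign_elements(elements, 3)
```

Round-robin assignment ignores the fact that matrix elements can differ in cost, so a production scheme would likely need some form of load balancing.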


E) I/O Patterns and Strategy
• One set of output files per job
  – Everyone sends data to rank 0
• Approximate sizes of inputs and outputs
  – Inputs < 1 GB
  – Outputs ~100s GB in binary format (dcd, xtc, …)
• Checkpoint / Restart capabilities:
  – Coordinates, velocities, etc… (< 1 GB) in binary format
• Current status and future plans for I/O
  – Contemplating strategy for n ranks to write output (n < N)
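One possible shape for the "n writers out of N ranks" idea is sketched below. It is a hypothetical layout, not the strategy the project adopted. The names `writer_for` and `gather_plan` and the choice of the first rank in each contiguous block as the writer are assumptions. Setting the number of writers to 1 recovers the current pattern, where every rank sends its data to rank 0.

```python
def writer_for(rank, n_ranks, n_writers):
    """Rank that writes on behalf of `rank` (the first rank of its block)."""
    if not 1 <= n_writers <= n_ranks:
        raise ValueError("need 1 <= n_writers <= n_ranks")
    if not 0 <= rank < n_ranks:
        raise ValueError("rank out of range")
    group = rank * n_writers // n_ranks
    # Smallest rank r with r * n_writers // n_ranks == group,
    # i.e. ceil(group * n_ranks / n_writers) in integer arithmetic.
    return -(-group * n_ranks // n_writers)

def gather_plan(n_ranks, n_writers):
    """Map each writer rank to the list of ranks whose data it collects."""
    plan = {}
    for rank in range(n_ranks):
        plan.setdefault(writer_for(rank, n_ranks, n_writers), []).append(rank)
    return plan

plan = gather_plan(10, 3)
```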


F) Visualization and Analysis
• How do you explore the data generated?
  – Molecular visualization typically w/ VMD (also serves as tool)
  – At larger scales, might need to switch to parallel visualization program
  – Reduce resolution (coarse-graining)
  – Analysis of 1-D and 2-D distribution functions
• Current status and future plans for your viz and analysis
  – Port analysis codes to GPUs
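As an example of a 1-D distribution function, the sketch below computes a radial distribution function g(r) for a single frame. The slides do not say which distribution functions are analyzed, so this choice is an assumption, as are the function name, the cubic periodic box, and the assumption that all particles are of one type. The double loop over pairs is the part that would be a natural candidate for a GPU port.

```python
import math

def radial_distribution(coords, box, n_bins, r_max):
    """g(r) for identical particles in a cubic periodic box of edge `box`."""
    if r_max > box / 2:
        raise ValueError("r_max must not exceed half the box length")
    n = len(coords)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a in range(3):
                d = coords[i][a] - coords[j][a]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                b = min(int(r / dr), n_bins - 1)
                counts[b] += 2  # count the pair once for each particle
    density = n / box ** 3
    g = []
    for b in range(n_bins):
        # Volume of the spherical shell covered by bin b
        shell = 4.0 / 3.0 * math.pi * ((b + 1) ** 3 - b ** 3) * dr ** 3
        g.append(counts[b] / (n * density * shell))
    return g

# Two particles separated by 1.0 across the periodic boundary.
g = radial_distribution([(0.5, 0.0, 0.0), (9.5, 0.0, 0.0)], 10.0, 5, 5.0)
```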


G) Performance
• What tools do you use now to explore performance?
  – TAU and HPCToolkit
• What do you believe is your current bottleneck to better performance?
  – We know that the current bottleneck is FFTs and electrostatic calculations.
• What do you believe is your current bottleneck to better scaling?
  – Multiple Program strategy to simultaneously evaluate matrix elements
• Current status and future plans for improving performance
  – Multiple Program + OpenMP
  – Approximate electrostatic algorithms (currently in development)
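For context on why FFTs dominate, the standard reciprocal-space term of an Ewald sum is written out below in Gaussian units. This assumes an Ewald-type particle-mesh solver, which the slides do not name explicitly. Here V is the cell volume, α the splitting parameter, and q_j and r_j the charge and position of particle j.

```latex
E_{\mathrm{recip}} = \frac{2\pi}{V} \sum_{\mathbf{k} \neq 0}
    \frac{e^{-k^2 / 4\alpha^2}}{k^2} \, \left| S(\mathbf{k}) \right|^2,
\qquad
S(\mathbf{k}) = \sum_{j=1}^{N} q_j \, e^{i \mathbf{k} \cdot \mathbf{r}_j}
```

Particle-mesh methods interpolate the charges onto a grid so that S(k) can be obtained with a 3-D FFT. In a parallel run, that FFT requires all-to-all style communication between ranks, which typically limits scaling at high core counts.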


H) Tools
• How do you debug your code?
  – gdb, TotalView
• What other tools do you use?
  – VMD w/ Tcl scripts
• Current status and future plans for improved tool integration and support


I) Status and Scalability
• How does your application scale now?
  – First time ever, we’ve recently run reactive simulations at 16K cores
• Where do you want to be in a year?
  – Multiple racks of Mira…
• What are your top 5 pains? (be specific)
  – 1-5: Cost of FFTs for accurate electrostatics
• What did you change to achieve current scalability? (replaced algorithms, data structures, math library, etc)
  – Multiple program, Multiple program, Multiple program
• Current status and future plans for improving scaling
  – ditto…


J) Roadmap
• Where will your science take you over the next 2 years?
  – Efficient and accurate simulation of arbitrary chemical reactions in condensed phases over multiple time and length scales.
• What do you hope to learn / discover?
  – Fundamental understanding of charge transport processes and multiscale phenomena

[Figure legend: solid lines, Single Program (SP); dashed lines, Multiple Program (MP). Series: exact energies and forces; approximate off-diagonal couplings; all matrix elements approximated.]