
"The BG collaboration, Past, Present, Future. The new available resources". Prof. Dr. Paul Whitford (Rice University)


The Blue Gene Collaboration: Past, Present and Future

April 17th, 2015

Paul Whitford ([email protected])

Liaison to USP for HPC, Rice University
Assistant Professor of Physics, Northeastern University

usp.rice.edu

Discussion Points
•  Blue Gene/P specifications
•  Considerations for running and optimizing calculations
•  Status of operations
•  Additional considerations and resources
•  Newest resource: BG/Q!



IBM Blue Gene/P at Rice University
•  Ranked 377 on the Top500 supercomputer list of June 2012
•  84 teraflops of performance
•  24,576 cores


USP-Rice Blue Gene Supercomputer
•  Low-power processors, but massively parallel
   –  PowerPC 450 processors: 850 MHz
   –  High-bandwidth, low-latency interconnects
•  Was ranked among the top 500 supercomputers in the world
•  30% dedicated to USP

6 racks, 192 node cards, 6,144 nodes, 24,576 cores

Technical Consideration 1: Distribution of Calculations

Each chip is a 4-way symmetric multiprocessor (SMP) with 4 GB of memory, roughly equivalent to one 3-GHz Intel core.

Each node can be run in:
•  SMP mode (shared memory, threading): full memory per MPI process
•  VN (virtual node) mode: 4 independent MPI processes, 1/4 of the memory per MPI process, shared computation and communication loads
•  DUAL (dual node) mode: 2 processes with 2 threads each, 1/2 of the memory per MPI process

Jobs are allocated in sets of 4 node-card units (512 cores); less scalable applications should go elsewhere.

General suggestion: high-I/O workloads use VN, compute-heavy workloads use SMP (use whatever is fastest).

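As a quick sanity check on these modes, a small hybrid MPI/OpenMP "hello" can confirm how many ranks and threads a job actually received. This is a minimal sketch, not part of the standard BG/P software stack: the file name is arbitrary, and the compile hint in the comment (IBM's mpixlc_r wrapper with -qsmp=omp) is an assumption based on typical BG/P installations; the support staff can confirm the exact build procedure on this machine.

```c
/* mode_check.c - print the MPI rank / OpenMP thread layout a job received.
 * Assumed build (typical BG/P): cross-compile on the frontend with the
 * thread-safe IBM wrapper, e.g.  mpixlc_r -qsmp=omp mode_check.c -o mode_check */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        /* SMP mode should show one rank per node with multiple threads;
         * VN mode should show four single-threaded ranks per node. */
        printf("rank %d of %d, thread %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

Running it once in each mode on a single node card and comparing the reported counts against the descriptions above is an easy way to verify that the job launcher is doing what you expect.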

Technical Consideration 2: Cross Compilation
•  Frontend: SUSE Linux
•  Backend: ppc450 compute cores
   –  Minimal OS, lacking many system calls
•  Code is compiled on the frontend and executed on the backend.
•  BG/Q also requires cross compilation.
•  Can be very difficult for users new to BG
   –  Support staff are available to help with compiling.


Technical Consideration 3: Scalability
•  You should always start with scalability tests.
   –  They frequently reveal computational bottlenecks.
•  User feedback is incredibly valuable!
•  Please share your performance stats with us.
   –  They can help users apply for time on other resources (e.g., XSEDE, INCITE, PRACE).
   –  They will facilitate more efficient calculations.
   –  They will help us optimize builds.


Examples of scaling data (figure captions):
•  100k-atom explicit-solvent simulation using NAMD on BG/P (performed by Filipe Lima)
•  200k-2.5M-atom simulations using an implicit-solvent model in Gromacs on Stampede

Many considerations impact scalability
•  Is your specific calculation well suited for parallelization?
   –  For example, molecular dynamics simulations of large systems tend to scale to higher numbers of cores (weak vs. strong scaling).
•  Have you enabled all appropriate flags at runtime?
   –  Software packages often guess... you need to test various settings.
   –  For example, in MD simulations, you must tell the code how to distribute atoms across nodes.
•  Is MPI, OpenMP, or hybrid MPI+OpenMP enabled in the code?
   –  MPI launches many processes, each with its own memory and instructions.
   –  OpenMP launches threads that share memory.
   –  Hybrid: each MPI process has its own memory, which is then shared among its own OpenMP threads.
•  Are you trying the most appropriate combination of calculation, parallelization, and flags for the machine you are using?
•  Does your calculation take long enough to complete that scalability is noticeable?
   –  Sometimes, initializing parallelization may take minutes, or longer.
•  Is the performance reproducible?
•  How well is the code written?

With so many factors impacting performance, it is essential that you perform scaling tests before doing production calculations.
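The sketch below shows the basic measurement pattern behind such scaling tests; it is not any package's built-in benchmark, and the file name and placeholder workload are purely illustrative. The idea is simply to time a fixed amount of work with MPI_Wtime and report the slowest rank.

```c
/* scaling_probe.c - minimal pattern for a strong-scaling test: time a fixed
 * amount of work and report the slowest rank. Repeat the run at several core
 * counts and compare the reported times. Substitute your real calculation
 * for the placeholder workload. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    MPI_Barrier(MPI_COMM_WORLD);          /* start everyone together */
    double t0 = MPI_Wtime();

    /* ---- placeholder workload: a fixed global loop divided across ranks ---- */
    const long global_iters = 400000000L;
    double local_sum = 0.0;
    for (long i = rank; i < global_iters; i += nranks)
        local_sum += 1.0 / (double)(i + 1);

    double elapsed = MPI_Wtime() - t0, max_elapsed, global_sum;
    /* checksum keeps the compiler honest and lets you verify correctness */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Reduce(&elapsed, &max_elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d  max wall time=%.3f s  (checksum %.6f)\n",
               nranks, max_elapsed, global_sum);

    MPI_Finalize();
    return 0;
}
```

Running the same binary at several core counts (e.g., 512, 1024, 2048) and comparing the reported times gives the speedup relative to the smallest run and the parallel efficiency (speedup divided by the ratio of core counts); the point where the time stops dropping is where bottleneck analysis, for example with HPCToolkit described below, should begin.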

For when the code needs improvement: high-performance tuning with HPCToolkit

HPCToolkit functionality
•  Accurate performance measurement
•  Effective performance analysis
•  Pinpointing scalability bottlenecks
•  Scalability bottlenecks on large-scale parallel systems
•  Scaling on multicore processors
•  Assessing process variability
•  Understanding temporal behavior

HPCToolkit was developed at Rice (Mellor-Crummey) and is available on the BG/P.

Experts are available to help you profile your code, even if you think it works well!

Software on BG/P

Programs: Quantum Espresso, Abinit, CP-PAW, DL_POLY, FHI-aims, Gromacs, Repast, Siesta, Yambo, LAMMPS, GAMESS, mpiBLAST, NAMD, NWChem, SPECFEM3D-GLOBE, VASP, WRF, CP2K

Installed libraries: Boost, FFTW, LAPACK, ScaLAPACK, ESSL, NetCDF, zlib, szip, HDF5, PnetCDF, Jasper, Java, libpng, Tcl


Current Activities
•  First phase (May-August 2012, pre-contract)
   –  Materials science and quantum chemistry
      •  15 users from 5 research groups
      •  Installed 10 packages for QC/MS calculations so far
   –  Weekly teleconference meetings with Rice-HPC and USP users and faculty
•  Second phase (August-December 2012)
   –  USP-Rice contract went into effect
   –  Assessed the demands of additional scientific communities
   –  Collaboration webpage established (usp.rice.edu)
   –  First HPC workshop at USP: December 2012


Current Activities
•  Continued growth (2013-2014)
   –  Hired additional technical staff: Xiaoqin Huang
   –  Regular teleconference meetings
•  Open for users
   –  Opened the DAVinCI resource for testing
   –  2WHPC workshop: May 2014
•  Current/ongoing activities
   –  BG/Q went online (March 2015)
   –  Possible competitive allocation process (scaling data important!)
   –  Student exchanges, to Rice and to USP
   –  Build scientific collaborations


Utilization of BG/P by USP Users
>100 users from USP, USP São Carlos, USP Ribeirão Preto, and FMUSP, across many departments

The newest resource: BG/Q
•  1-rack Blue Gene/Q
•  Software already installed
   –  Applications: abinit, cp2k, quantum espresso, fhi-aims, gamess, gromacs, lammps, mpiBLAST, namd, OpenFOAM, repast, siesta, vasp, wps, wrf
   –  Libraries/tools: blacs, blas, cmake, fftw, flex, hdf5, jasper, lapack, libint, libpng, libxc, pnetcdf, scalapack, szip, tcl, zlib
•  Qualifies for Top500 ranking (~293)
•  Accepting USP users
   –  Get a BG/P account
   –  Email a request to LCCA for BG/Q account activation
   –  Tech support available
   –  Details will be provided at usp.rice.edu

Differences between BG/P and BG/Q

Attribute                   BG/P                        BG/Q
Maximum performance         84 TFLOPS                   209 TFLOPS
Power efficiency            371 MFLOPS/W                2.1 GFLOPS/W
Total number of cores       24,576                      16,384
Number of nodes             6,144                       1,024
Cores per node              4                           16 + 1 spare + 1 for O/S
Memory per node             4 GB                        16 GB
Possible threads per core   1                           4
CPU clock speed             850 MHz                     1.6 GHz
Minimum nodes per job       128 (512 physical cores)    32 (512 physical cores)
Scratch disk space          245 TB                      120 TB

BG/Q has 4x the cores and memory per node. Multi-level parallelization can lead to large performance increases on BG/Q, and it supports larger-memory applications.
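As an illustrative layout (an example based on the table above, not a recommendation from the slides): a BG/Q node exposes 16 compute cores with 4 hardware threads each, i.e. up to 64 threads, and 16 GB of memory. A pure-MPI run at one rank per hardware thread would leave only 16 GB / 64 = 256 MB per rank, whereas a hybrid layout of 4 MPI ranks with 16 OpenMP threads each keeps 16 GB / 4 = 4 GB per rank; this is why multi-level parallelization and larger-memory applications benefit most on BG/Q.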


Additional Resource: DAVinCI
•  25 TeraFLOPS x86 IBM iDataPlex
•  ~20M CPU-hours/year
•  2.8 GHz Intel Westmere (hex-core)
•  192 nodes (>2,300 cores & >4,600 threads)
•  192 MPI nodes
•  16 nodes with 32 NVIDIA Fermi GPGPUs
•  9.2 TB memory (48 GB per node)
•  ~450 TB storage
•  4x QDR InfiniBand interconnect (compute/storage)
•  Fully operational: 7/2011
•  Available to USP users on a testing basis

Additional Resource: PowerOmics
•  10 TeraFLOPS IBM POWER8 S822
•  3.0 GHz POWER8 12-core processors
•  8 nodes (192 cores, 1,536 threads)
•  3.5 TiB memory (mix of 256 GiB and 1 TiB per node)
•  361 TiB storage
•  FDR InfiniBand interconnect
•  To be commissioned in 2015
•  For applications with very high memory demands
•  If interested in testing, contact [email protected]


XSEDE (National Major Supercomputing Resource, formerly known as TeraGrid)
•  Extreme Science and Engineering Discovery Environment
•  The name changed, but The Song (and Mission) Remains the Same
•  Rice Campus Champions:
   –  Roger Moye
   –  Qiyou Jiang
   –  Jan Odegard
•  If you need MILLIONS of CPU hours, XSEDE is for you
•  The Campus Champions are your advocates for getting XSEDE accounts and CPU time, so let us help you with your biggest science
•  XSEDE requires a US researcher to be part of the team, but many people would be enthusiastic about collaborations
•  In Europe, there is PRACE, which provides similar opportunities
•  Worldwide, there are many internationally competitive resources


Getting Help, Organizing Collaborative Efforts
•  Top-caliber support and service for research computing
•  A thriving community of HPC scientists sharing research and building collaborations, with a significant presence in national cyberinfrastructure issues


USP-Rice Blue Gene Collaboration Webpage (usp.rice.edu)
A "one-stop shop" with information on:
•  Getting accounts
•  Logging on
•  Compiling code
•  Running jobs
•  Getting help
•  Updates and activities
•  Transferring data
•  Acknowledgements
•  Will be updated soon with BG/Q information

For more information
•  To request access to BG/P and BG/Q, send a request to LCCA ([email protected]). Include:
   –  Your name
   –  Email
   –  Research group, department, PI
   –  Research area
   –  Code to be run
   –  Scalability information (if available)
•  For additional questions, email me:
   –  Paul Whitford ([email protected])


Any questions?

Feel free to contact any of us; we are here to help.

LCCA: [email protected]
Paul Whitford: [email protected]
Xiaoqin Huang: [email protected]
Franco Bladilo: [email protected]