
"The BG collaboration, Past, Present, Future. The new available resources". Prof. Dr. Paul Whitford (Rice University)


The Blue Gene Collaboration: Past, Present and Future

April 17th, 2015

Paul Whitford ([email protected])

Liaison to USP for HPC, Rice University
Assistant Professor of Physics, Northeastern University

usp.rice.edu

Discussion Points
•  Blue Gene/P specifications
•  Considerations for running and optimizing calculations
•  Status of operations
•  Additional considerations and resources
•  Newest resource: BG/Q!



IBM Blue Gene/P at Rice University
•  Ranked 377 on the Top500 supercomputer list of June 2012
•  84 teraflops of performance
•  24,576 cores


USP-Rice Blue Gene Supercomputer
•  Low-power processors, but massively parallel
   –  PowerPC 450 processors: 850 MHz
   –  High-bandwidth, low-latency interconnects
•  Was ranked among the top 500 supercomputers in the world
•  30% dedicated to USP

6 racks, 192 node cards, 6,144 nodes, 24,576 cores

Technical Consideration 1: Distribution of Calculations

Each chip is a 4-way symmetric multiprocessor (SMP) with 4 GB of memory, roughly equivalent to one 3-GHz Intel core.

Each node can be run in:
•  SMP mode (shared memory, threading): full memory per MPI process
•  VN (virtual node) mode: 4 independent MPI processes, 1/4 of the memory per MPI process, shared computation and communication loads
•  DUAL (dual node) mode: 2 processes with 2 threads each, 1/2 of the memory per MPI process

Jobs are allocated in sets of 4 node-card units (512 cores); less scalable applications should go elsewhere.

General suggestion: high-I/O workloads use VN, compute-heavy workloads use SMP (use whatever is fastest).

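As a quick sanity check on these modes, a small hybrid MPI/OpenMP "hello" can confirm how many ranks and threads a job actually received. This is a minimal sketch, not part of the standard BG/P software stack: the file name is arbitrary, and the compile hint in the comment (IBM's mpixlc_r wrapper with -qsmp=omp) is an assumption based on typical BG/P installations; the support staff can confirm the exact build procedure on this machine.

```c
/* mode_check.c - print the MPI rank / OpenMP thread layout a job received.
 * Assumed build (typical BG/P): cross-compile on the frontend with the
 * thread-safe IBM wrapper, e.g.  mpixlc_r -qsmp=omp mode_check.c -o mode_check */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        /* SMP mode should show one rank per node with multiple threads;
         * VN mode should show four single-threaded ranks per node. */
        printf("rank %d of %d, thread %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

Running it once in each mode on a single node card and comparing the reported counts against the descriptions above is an easy way to verify that the job launcher is doing what you expect.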

Technical Consideration 2: Cross Compilation
•  Frontend: SUSE Linux
•  Backend: ppc450 compute cores
   –  Minimal OS, lacking many system calls
•  Code is compiled on the frontend and executed on the backend.
•  BG/Q also requires cross compilation.
•  Can be very difficult for users new to BG
   –  Support staff are available to help with compiling.


Technical Consideration 3: Scalability
•  You should always start with scalability tests.
   –  They frequently reveal computational bottlenecks.
•  User feedback is incredibly valuable!
•  Please share your performance stats with us.
   –  They can help users apply for time on other resources (e.g., XSEDE, INCITE, PRACE).
   –  They will facilitate more efficient calculations.
   –  They will help us optimize builds.


Examples of scaling data (figure captions):
•  100k-atom explicit-solvent simulation using NAMD on BG/P (performed by Filipe Lima)
•  200k-2.5M-atom simulations using an implicit-solvent model in Gromacs on Stampede

Many considerations impact scalability
•  Is your specific calculation well suited for parallelization?
   –  For example, molecular dynamics simulations of large systems tend to scale to higher numbers of cores (weak vs. strong scaling).
•  Have you enabled all appropriate flags at runtime?
   –  Software packages often guess... you need to test various settings.
   –  For example, in MD simulations, you must tell the code how to distribute atoms across nodes.
•  Is MPI, OpenMP, or hybrid MPI+OpenMP enabled in the code?
   –  MPI launches many processes, each with its own memory and instructions.
   –  OpenMP launches threads that share memory.
   –  Hybrid: each MPI process has its own memory, which is then shared among its own OpenMP threads.
•  Are you trying the most appropriate combination of calculation, parallelization, and flags for the machine you are using?
•  Does your calculation take long enough to complete that scalability is noticeable?
   –  Sometimes, initializing parallelization may take minutes, or longer.
•  Is the performance reproducible?
•  How well is the code written?

With so many factors impacting performance, it is essential that you perform scaling tests before doing production calculations.
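The sketch below shows the basic measurement pattern behind such scaling tests; it is not any package's built-in benchmark, and the file name and placeholder workload are purely illustrative. The idea is simply to time a fixed amount of work with MPI_Wtime and report the slowest rank.

```c
/* scaling_probe.c - minimal pattern for a strong-scaling test: time a fixed
 * amount of work and report the slowest rank. Repeat the run at several core
 * counts and compare the reported times. Substitute your real calculation
 * for the placeholder workload. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    MPI_Barrier(MPI_COMM_WORLD);          /* start everyone together */
    double t0 = MPI_Wtime();

    /* ---- placeholder workload: a fixed global loop divided across ranks ---- */
    const long global_iters = 400000000L;
    double local_sum = 0.0;
    for (long i = rank; i < global_iters; i += nranks)
        local_sum += 1.0 / (double)(i + 1);

    double elapsed = MPI_Wtime() - t0, max_elapsed, global_sum;
    /* checksum keeps the compiler honest and lets you verify correctness */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Reduce(&elapsed, &max_elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d  max wall time=%.3f s  (checksum %.6f)\n",
               nranks, max_elapsed, global_sum);

    MPI_Finalize();
    return 0;
}
```

Running the same binary at several core counts (e.g., 512, 1024, 2048) and comparing the reported times gives the speedup relative to the smallest run and the parallel efficiency (speedup divided by the ratio of core counts); the point where the time stops dropping is where bottleneck analysis, for example with HPCToolkit described below, should begin.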

For when the code needs improvement: high-performance tuning with HPCToolkit

HPCToolkit functionality
•  Accurate performance measurement
•  Effective performance analysis
•  Pinpointing scalability bottlenecks
•  Scalability bottlenecks on large-scale parallel systems
•  Scaling on multicore processors
•  Assessing process variability
•  Understanding temporal behavior

HPCToolkit was developed at Rice (Mellor-Crummey) and is available on the BG/P.

Experts are available to help you profile your code, even if you think it works well!

Software on BG/P

Programs: Quantum Espresso, Abinit, CP-PAW, DL_POLY, FHI-aims, Gromacs, Repast, Siesta, Yambo, LAMMPS, GAMESS, mpiBLAST, NAMD, NWChem, SPECFEM3D-GLOBE, VASP, WRF, CP2K

Installed libraries: Boost, FFTW, LAPACK, ScaLAPACK, ESSL, NetCDF, zlib, szip, HDF5, PnetCDF, Jasper, Java, libpng, Tcl


Current Activities
•  First phase (May-August 2012, pre-contract)
   –  Materials science and quantum chemistry
      •  15 users from 5 research groups
      •  Installed 10 packages for QC/MS calculations so far
   –  Weekly teleconference meetings with Rice-HPC and USP users and faculty
•  Second phase (August-December 2012)
   –  USP-Rice contract went into effect
   –  Assessed the demands of additional scientific communities
   –  Collaboration webpage established (usp.rice.edu)
   –  First HPC workshop at USP: December 2012


Current Activities
•  Continued growth (2013-2014)
   –  Hired additional technical staff: Xiaoqin Huang
   –  Regular teleconference meetings
•  Open for users
   –  Opened the DAVinCI resource for testing
   –  2WHPC workshop: May 2014
•  Current/ongoing activities
   –  BG/Q went online (March 2015)
   –  Possible competitive allocation process (scaling data important!)
   –  Student exchanges, to Rice and to USP
   –  Build scientific collaborations


Utilization of BG/P by USP Users
>100 users from USP, USP São Carlos, USP Ribeirão Preto, and FMUSP, across many departments

The newest resource: BG/Q
•  1-rack Blue Gene/Q
•  Software already installed
   –  Applications: abinit, cp2k, quantum espresso, fhi-aims, gamess, gromacs, lammps, mpiBLAST, namd, OpenFOAM, repast, siesta, vasp, wps, wrf
   –  Libraries/tools: blacs, blas, cmake, fftw, flex, hdf5, jasper, lapack, libint, libpng, libxc, pnetcdf, scalapack, szip, tcl, zlib
•  Qualifies for Top500 ranking (~293)
•  Accepting USP users
   –  Get a BG/P account
   –  Email a request to LCCA for BG/Q account activation
   –  Tech support available
   –  Details will be provided at usp.rice.edu

Differences between BG/P and BG/Q

Attribute                   BG/P                        BG/Q
Maximum performance         84 TFLOPS                   209 TFLOPS
Power efficiency            371 MFLOPS/W                2.1 GFLOPS/W
Total number of cores       24,576                      16,384
Number of nodes             6,144                       1,024
Cores per node              4                           16 + 1 spare + 1 for O/S
Memory per node             4 GB                        16 GB
Possible threads per core   1                           4
CPU clock speed             850 MHz                     1.6 GHz
Minimum nodes per job       128 (512 physical cores)    32 (512 physical cores)
Scratch disk space          245 TB                      120 TB

BG/Q has 4x the cores and memory per node. Multi-level parallelization can lead to large performance increases on BG/Q, and it supports larger-memory applications.
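As an illustrative layout (an example based on the table above, not a recommendation from the slides): a BG/Q node exposes 16 compute cores with 4 hardware threads each, i.e. up to 64 threads, and 16 GB of memory. A pure-MPI run at one rank per hardware thread would leave only 16 GB / 64 = 256 MB per rank, whereas a hybrid layout of 4 MPI ranks with 16 OpenMP threads each keeps 16 GB / 4 = 4 GB per rank; this is why multi-level parallelization and larger-memory applications benefit most on BG/Q.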


Additional Resource: DAVinCI
•  25 TeraFLOPS x86 IBM iDataPlex
•  ~20M CPU-hours/year
•  2.8 GHz Intel Westmere (hex-core)
•  192 nodes (>2,300 cores & >4,600 threads)
•  192 MPI nodes
•  16 nodes with 32 NVIDIA Fermi GPGPUs
•  9.2 TB memory (48 GB per node)
•  ~450 TB storage
•  4x QDR InfiniBand interconnect (compute/storage)
•  Fully operational: 7/2011
•  Available to USP users on a testing basis

Additional Resource: PowerOmics
•  10 TeraFLOPS IBM POWER8 S822
•  3.0 GHz POWER8 12-core processors
•  8 nodes (192 cores, 1,536 threads)
•  3.5 TiB memory (mix of 256 GiB and 1 TiB per node)
•  361 TiB storage
•  FDR InfiniBand interconnect
•  To be commissioned in 2015
•  For applications with very high memory demands
•  If interested in testing, contact [email protected]


XSEDE (National Major Supercomputing Resource, formerly known as TeraGrid)
•  Extreme Science and Engineering Discovery Environment
•  The name changed, but The Song (and Mission) Remains the Same
•  Rice Campus Champions:
   –  Roger Moye
   –  Qiyou Jiang
   –  Jan Odegard
•  If you need MILLIONS of CPU hours, XSEDE is for you
•  The Campus Champions are your advocates for getting XSEDE accounts and CPU time, so let us help you with your biggest science
•  XSEDE requires a US researcher to be part of the team, but many people would be enthusiastic about collaborations
•  In Europe, there is PRACE, which provides similar opportunities
•  Worldwide, there are many internationally competitive resources


Getting Help, Organizing Collaborative Efforts
•  Top-caliber support and service for research computing
•  A thriving community of HPC scientists sharing research and building collaborations, with a significant presence in national cyberinfrastructure issues


USP-Rice Blue Gene Collaboration Webpage (usp.rice.edu)
A "one-stop shop" with information on:
•  Getting accounts
•  Logging on
•  Compiling code
•  Running jobs
•  Getting help
•  Updates and activities
•  Transferring data
•  Acknowledgements
•  Will be updated soon with BG/Q information

For more information
•  To request access to BG/P and BG/Q, send a request to LCCA ([email protected]). Include:
   –  Your name
   –  Email
   –  Research group, department, PI
   –  Research area
   –  Code to be run
   –  Scalability information (if available)
•  For additional questions, email me:
   –  Paul Whitford ([email protected])


Any questions?

Feel free to contact any of us; we are here to help.

LCCA: [email protected]
Paul Whitford: [email protected]
Xiaoqin Huang: [email protected]
Franco Bladilo: [email protected]