Scott McMillan
September 2018
MAKING CONTAINERS EASIER WITH HPC CONTAINER MAKER
NVIDIA GPU CLOUD (NGC)
Discover 35 GPU-Accelerated Containers: deep learning, HPC applications, NVIDIA HPC visualization tools, and partner applications
Innovate in Minutes, Not Weeks: get up and running quickly and reduce complexity
Access from Anywhere: containers run on supported cloud providers, NVIDIA DGX Systems, and PCs with NVIDIA Volta or Pascal™ architecture GPUs
Simple Access to Ready-to-Run, GPU-Accelerated Software
HPC APP CONTAINERS ON NGC
Rapid user adoption and rapid container addition. HPC application containers available on NGC include: GAMESS, CHROMA, CANDLE, GROMACS, LAMMPS, NAMD, RELION, Lattice Microbes, MILC, PIConGPU, BigDFT, and PGI.
GET STARTED TODAY WITH NGC
To learn more about all of the GPU-accelerated software from NGC, visit:
nvidia.com/cloud
To sign up or explore NGC, visit:
ngc.nvidia.com
Sign Up and Access Containers for Free
WHAT IF A CONTAINER IMAGE IS NOT AVAILABLE FROM NGC?
TYPICAL BARE METAL WORKFLOW
Login to system (e.g., CentOS 7 with Mellanox OFED 3.4)
$ module load PrgEnv/GCC+OpenMPI
$ module load cuda/9.0
$ module load gcc
$ module load openmpi/1.10.7
Steps to build application
What is the equivalent container workflow?
Result: application binary suitable for that particular bare metal system
CONTAINER WORKFLOW
Login to system (e.g., CentOS 7 with Mellanox OFED 3.4)
$ module load PrgEnv/GCC+OpenMPI
$ module load cuda/9.0
$ module load gcc
$ module load openmpi/1.10.7
Steps to build application
FROM nvidia/cuda:9.0-devel-centos7
Result: application binary suitable for that particular bare metal system
OPENMPI DOCKERFILE VARIANTS
Real examples – which one should you use?
A)
RUN OPENMPI_VERSION=3.0.0 && \
    wget -q -O - https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-${OPENMPI_VERSION}.tar.gz | tar -xzf - && \
    cd openmpi-${OPENMPI_VERSION} && \
    ./configure --enable-orterun-prefix-by-default --with-cuda --with-verbs \
        --prefix=/usr/local/mpi --disable-getpwuid && \
    make -j"$(nproc)" install && \
    cd .. && rm -rf openmpi-${OPENMPI_VERSION} && \
    echo "/usr/local/mpi/lib" >> /etc/ld.so.conf.d/openmpi.conf && \
    ldconfig
ENV PATH /usr/local/mpi/bin:$PATH

B)
WORKDIR /tmp
ADD http://www.open-mpi.org//software/ompi/v1.10/downloads/openmpi-1.10.7.tar.gz /tmp
RUN tar -xzf openmpi-1.10.7.tar.gz && \
    cd openmpi-* && ./configure --with-cuda=/usr/local/cuda \
        --enable-mpi-cxx --prefix=/usr && \
    make -j 32 && make install && cd /tmp && rm -rf openmpi-*

C)
RUN mkdir /logs
RUN wget -nv https://www.open-mpi.org/software/ompi/v1.10/downloads/openmpi-1.10.7.tar.gz && \
    tar -xzf openmpi-1.10.7.tar.gz && \
    cd openmpi-* && ./configure --with-cuda=/usr/local/cuda \
        --enable-mpi-cxx --prefix=/usr 2>&1 | tee /logs/openmpi_config && \
    make -j 32 2>&1 | tee /logs/openmpi_make && \
    make install 2>&1 | tee /logs/openmpi_install && cd /tmp && rm -rf openmpi-*

D)
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libopenmpi-dev \
        openmpi-bin \
        openmpi-common \
    && rm -rf /var/lib/apt/lists/*
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib

E)
RUN wget -q -O - https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.bz2 | tar -xjf - && \
    cd openmpi-3.0.0 && \
    CXX=pgc++ CC=pgcc FC=pgfortran F77=pgfortran ./configure \
        --prefix=/usr/local/openmpi --with-cuda=/usr/local/cuda --with-verbs --disable-getpwuid && \
    make -j4 install && \
    rm -rf /openmpi-3.0.0

F)
COPY openmpi /usr/local/openmpi
WORKDIR /usr/local/openmpi
RUN /bin/bash -c "source /opt/pgi/LICENSE.txt && CC=pgcc CXX=pgc++ F77=pgf77 FC=pgf90 ./configure --with-cuda --prefix=/usr/local/openmpi"
RUN /bin/bash -c "source /opt/pgi/LICENSE.txt && make all install"
HPC CONTAINER MAKER
• Tool for creating HPC application Dockerfiles and Singularity recipe files
• You will still need to build the container images
• Makes it easier to create HPC application containers by encapsulating best practices into building blocks
• Open source (Apache 2.0): https://github.com/NVIDIA/hpc-container-maker
• pip install hpccm
https://devblogs.nvidia.com/making-containers-easier-with-hpc-container-maker/
BUILDING BLOCKS TO CONTAINER RECIPES
hpccm generates the corresponding Dockerfile instructions for the HPCCM building block:

Stage0 += openmpi()

# OpenMPI version 3.0.0
RUN yum install -y \
        bzip2 file hwloc make openssh-clients perl tar wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-3.0.0.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-3.0.0 && CC=gcc CXX=g++ F77=gfortran F90=gfortran FC=gfortran ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda=/usr/local/cuda --with-verbs && \
    make -j4 && \
    make -j4 install && \
    rm -rf /var/tmp/openmpi-3.0.0.tar.bz2 /var/tmp/openmpi-3.0.0
ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/openmpi/bin:$PATH
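The building-block expansion above can be sketched in a few lines of plain Python. This is an illustrative toy, not hpccm's implementation: the class name `OpenMPIBlock` and the method `to_docker` are invented, and the install steps are abbreviated.

```python
# Toy sketch of the building-block idea: a Python object that expands into
# Dockerfile instructions. OpenMPIBlock and to_docker are invented names,
# NOT hpccm's real API.

class OpenMPIBlock:
    def __init__(self, version='3.0.0', prefix='/usr/local/openmpi'):
        self.version = version
        self.prefix = prefix

    def to_docker(self):
        """Render the block as Dockerfile RUN/ENV instructions."""
        tarball = 'openmpi-{}.tar.bz2'.format(self.version)
        url = ('https://www.open-mpi.org/software/ompi/v3.0/downloads/'
               + tarball)
        return '\n'.join([
            '# OpenMPI version {}'.format(self.version),
            'RUN wget -q -nc -P /var/tmp {} && \\'.format(url),
            '    tar -x -f /var/tmp/{} -C /var/tmp -j && \\'.format(tarball),
            '    cd /var/tmp/openmpi-{} && ./configure --prefix={} && \\'.format(
                self.version, self.prefix),
            '    make -j4 install',
            'ENV PATH={}/bin:$PATH'.format(self.prefix),
        ])

block = OpenMPIBlock(version='3.0.0')
print(block.to_docker())
```

Because the block is an ordinary object, a recipe can parameterize it (version, prefix, toolchain) while the rendering logic carries the best practices once.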
EQUIVALENT HPC CONTAINER MAKER WORKFLOW
Login to system (e.g., CentOS 7 with Mellanox OFED 3.4)
$ module load PrgEnv/GCC+OpenMPI
$ module load cuda/9.0
$ module load gcc
$ module load openmpi/1.10.7
Steps to build application
Stage0 += baseimage(image='nvidia/cuda:9.0-devel-centos7')
Stage0 += mlnx_ofed(version='3.4-1.0.0.0')
Stage0 += gnu()
Stage0 += openmpi(version='1.10.7')
Steps to build application
Result: application binary suitable for that particular bare metal system
Result: portable application container capable of running on any system
HIGHER LEVEL ABSTRACTION
openmpi(check=False,                          # run "make check"?
        configure_opts=['--disable-getpwuid', …],  # configure command line options
        cuda=True,                            # enable CUDA?
        directory='',                         # path to source in build context
        infiniband=True,                      # enable InfiniBand?
        ospackages=['bzip2', 'file', 'hwloc', …],  # Linux distribution prerequisites
        prefix='/usr/local/openmpi',          # install location
        toolchain=toolchain(),                # compiler to use
        version='3.0.0')                      # version to download
Building blocks encapsulate best practices, avoid duplication, and separate concerns
Full building block documentation can be found in RECIPES.md
Examples:
openmpi(prefix='/opt/openmpi', version='1.10.7')
openmpi(infiniband=False, toolchain=pgi.toolchain)
INCLUDED BUILDING BLOCKS
• Compilers
• GNU
• PGI
• Intel (BYOL)
• HPC libraries
• Charm++
• FFTW, MKL, OpenBLAS
• CGNS, HDF5, NetCDF, PnetCDF
• Miscellaneous
• Boost
• CMake
• InfiniBand
• Mellanox OFED
• OFED (upstream)
• MPI
• Intel MPI
• MVAPICH2, MVAPICH2-GDR
• OpenMPI
• Package management
• packages (Linux distro aware), or
• apt_get
• yum
As of version 18.8
CUDA is included via the base image, see https://hub.docker.com/r/nvidia/cuda/
RECIPES INCLUDED WITH CONTAINER MAKER
GNU compilers
PGI compilers
OpenMPI 3.0.0
MVAPICH2 2.3rc2
CUDA 9.0
FFTW 3.3.7
HDF5 1.10.1
Mellanox OFED 3.4-1.0.0.0
Python 2 and 3
Ubuntu 16.04
CentOS 7
HPC Base Recipes:
BUILDING APPLICATION CONTAINER IMAGES WITH HPCCM
Container template generator
1. Generate the desired container template starting from one of the HPC base recipes included with HPCCM
$ hpccm --recipe recipes/hpcbase-gnu-openmpi.py --single-stage --format docker > Dockerfile
$ hpccm --recipe recipes/hpcbase-gnu-openmpi.py --single-stage --format singularity > Singularity

2. Edit the Dockerfile or Singularity recipe file to add application build instructions

3. Build the container image
$ docker build -t myapp -f Dockerfile .
$ sudo singularity build myapp.simg Singularity
BUILDING APPLICATION CONTAINER IMAGES WITH HPCCM
HPC Base Images
1. Generate the desired container base image starting from one of the HPC base recipes included with HPCCM
$ hpccm --recipe recipes/hpcbase-gnu-openmpi.py --single-stage --format docker > Dockerfile
$ docker build -t hpcbase-gnu-openmpi -f Dockerfile .

2. Create a Dockerfile or Singularity recipe file referencing the HPC base image and containing the application build instructions
Dockerfile:
FROM hpcbase-gnu-openmpi
…
Singularity:
BootStrap: docker
From: hpcbase-gnu-openmpi
…

3. Build the container image
$ docker build -t myapp -f Dockerfile .
$ sudo singularity build myapp.simg Singularity
BUILDING APPLICATION CONTAINER IMAGES WITH HPCCM
$ cat mpi-bandwidth.py
# Setup GNU compilers, Mellanox OFED, and OpenMPI
Stage0 += baseimage(image='nvidia/cuda:9.0-devel-centos7')
Stage0 += gnu()
Stage0 += mlnx_ofed(version='3.4-1.0.0.0')
Stage0 += openmpi(version='3.0.0')

# Application build steps below
# Using "MPI Bandwidth" from Lawrence Livermore National Laboratory (LLNL) as an example
# 1. Copy source code into the container
Stage0 += copy(src='mpi_bandwidth.c', dest='/tmp/mpi_bandwidth.c')
# 2. Build the application
Stage0 += shell(commands=['mkdir -p /workspace',
                          'mpicc -o /workspace/mpi_bandwidth /tmp/mpi_bandwidth.c'])
$ hpccm --recipe mpi-bandwidth.py --format …
Application recipe
BUILDING APPLICATION CONTAINER IMAGES WITH HPCCM
Application recipes
BootStrap: docker
From: nvidia/cuda:9.0-devel-centos7
%post
    . /.singularity.d/env/10-docker.sh

# GNU compiler
%post
    yum install -y \
        gcc \
        gcc-c++ \
        gcc-gfortran
    rm -rf /var/cache/yum/*

# Mellanox OFED version 3.4-1.0.0.0
%post
    yum install -y \
        libnl \
        libnl3 \
        numactl-libs \
        wget
    rm -rf /var/cache/yum/*
%post
    mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-3.4-1.0.0.0/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz -C /var/tmp -z
    rpm --install \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-utils-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibmad-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibmad-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibumad-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibumad-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libmlx4-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libmlx5-*.x86_64.rpm
    rm -rf /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64

# OpenMPI version 3.0.0
%post
    yum install -y \
        bzip2 \
        file \
        hwloc \
        make \
        openssh-clients \
        perl \
        tar \
        wget
    rm -rf /var/cache/yum/*
%post
    mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.bz2
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-3.0.0.tar.bz2 -C /var/tmp -j
    cd /var/tmp/openmpi-3.0.0 && ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --with-verbs
    make -j4
    make -j4 install
    rm -rf /var/tmp/openmpi-3.0.0.tar.bz2 /var/tmp/openmpi-3.0.0
%environment
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/usr/local/openmpi/bin:$PATH
%post
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/usr/local/openmpi/bin:$PATH

%files
    mpi_bandwidth.c /tmp/mpi_bandwidth.c

%post
    mkdir -p /workspace
    mpicc -o /workspace/mpi_bandwidth /tmp/mpi_bandwidth.c
FROM nvidia/cuda:9.0-devel-centos7

# GNU compiler
RUN yum install -y \
        gcc \
        gcc-c++ \
        gcc-gfortran && \
    rm -rf /var/cache/yum/*

# Mellanox OFED version 3.4-1.0.0.0
RUN yum install -y \
        libnl \
        libnl3 \
        numactl-libs \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-3.4-1.0.0.0/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz -C /var/tmp -z && \
    rpm --install \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibverbs-utils-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibmad-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibmad-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibumad-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libibumad-devel-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libmlx4-*.x86_64.rpm \
        /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64/RPMS/libmlx5-*.x86_64.rpm && \
    rm -rf /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64

# OpenMPI version 3.0.0
RUN yum install -y \
        bzip2 \
        file \
        hwloc \
        make \
        openssh-clients \
        perl \
        tar \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-3.0.0.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-3.0.0 && ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --with-verbs && \
    make -j4 && \
    make -j4 install && \
    rm -rf /var/tmp/openmpi-3.0.0.tar.bz2 /var/tmp/openmpi-3.0.0
ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/openmpi/bin:$PATH

COPY mpi_bandwidth.c /tmp/mpi_bandwidth.c

RUN mkdir -p /workspace && \
    mpicc -o /workspace/mpi_bandwidth /tmp/mpi_bandwidth.c
The generated Singularity recipe file and Dockerfile contain the same sections: CUDA CentOS base image, GNU compiler, Mellanox OFED, OpenMPI, and MPI Bandwidth.
RECIPES INCLUDED WITH CONTAINER MAKER
GNU compilers
PGI compilers
OpenMPI 3.0.0
MVAPICH2 2.3rc2
CUDA 9.0
FFTW 3.3.7
HDF5 1.10.1
Mellanox OFED 3.4-1.0.0.0
Python 2 and 3
Ubuntu 16.04
CentOS 7
HPC Base Recipes:
Reference Recipes:
GROMACS, MILC, MPI Bandwidth
CONTAINER SPECIFICATION FORMAT ABSTRACTION
HPC Container Maker primitives and their Dockerfile and Singularity equivalents:

baseimage(image='ubuntu:16.04')
    Dockerfile:   FROM ubuntu:16.04
    Singularity:  Bootstrap: docker
                  From: ubuntu:16.04

shell(commands=['a', 'b', 'c'])
    Dockerfile:   RUN a && \
                      b && \
                      c
    Singularity:  %post
                  a
                  b
                  c

copy(src='a', dest='b')
    Dockerfile:   COPY a b
    Singularity:  %files
                  a b
Single source to either Docker or Singularity
Recommendation: avoid using primitives when a building block is available
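The single-source abstraction can be sketched in plain Python. This is a toy, not hpccm's internals: the `fmt` parameter and this `shell` function signature are invented for illustration.

```python
# Toy sketch: one primitive rendered to either Dockerfile or Singularity
# syntax. The fmt argument is invented; hpccm selects the format globally
# via the --format command line option.

def shell(commands, fmt='docker'):
    """Render a list of shell commands as a RUN instruction or a %post section."""
    if fmt == 'docker':
        # Dockerfile: one RUN, commands chained with && and line continuations
        return 'RUN ' + ' && \\\n    '.join(commands)
    elif fmt == 'singularity':
        # Singularity: a %post section, one command per line
        return '%post\n' + '\n'.join('    ' + c for c in commands)
    raise ValueError('unknown format: {}'.format(fmt))

print(shell(['a', 'b', 'c'], fmt='docker'))
# RUN a && \
#     b && \
#     c
print(shell(['a', 'b', 'c'], fmt='singularity'))
```

The recipe author writes `shell(commands=[...])` once; the output format is a rendering decision made at generation time.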
CONTAINER SPECIFICATION FORMAT ABSTRACTION
HPC Container Maker building blocks and the instructions they generate for an Ubuntu base image (baseimage(image='ubuntu:16.04')) versus a CentOS base image (baseimage(image='centos:7')):

packages(ospackages=['gcc'])
    Ubuntu: RUN apt-get update -y && \
                apt-get install -y --no-install-recommends \
                    gcc && \
                rm -rf /var/lib/apt/lists/*
    CentOS: RUN yum install -y \
                    gcc && \
                rm -rf /var/cache/yum/*

apt_get(ospackages=['g++'])
    Ubuntu: RUN apt-get update -y && \
                apt-get install -y --no-install-recommends \
                    g++ && \
                rm -rf /var/lib/apt/lists/*
    CentOS: !!! error: apt-get is not available on CentOS

yum(ospackages=['gcc-c++'])
    Ubuntu: !!! error: yum is not available on Ubuntu
    CentOS: RUN yum install -y \
                    gcc-c++ && \
                rm -rf /var/cache/yum/*

packages(apt=['g++'], yum=['gcc-c++'])
    Ubuntu: RUN apt-get update -y && \
                apt-get install -y --no-install-recommends \
                    g++ && \
                rm -rf /var/lib/apt/lists/*
    CentOS: RUN yum install -y \
                    gcc-c++ && \
                rm -rf /var/cache/yum/*
Single source supporting either Ubuntu or CentOS/RHEL base images
Recommendation: always use the packages building block to install distro packages
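A distro-aware packages block can be sketched as follows. This is a toy illustration: hpccm's real building block infers the distribution from the base image, whereas this sketch takes an explicit `distro` argument.

```python
# Toy sketch of a distro-aware "packages" building block: the same recipe
# line emits apt-get on Ubuntu and yum on CentOS. The distro parameter is
# invented for illustration; hpccm detects the distro from the base image.

def packages(apt=None, yum=None, distro='ubuntu'):
    """Render a package-install instruction for the given distribution."""
    if distro == 'ubuntu':
        if not apt:
            raise ValueError('no apt packages specified')
        return ('RUN apt-get update -y && \\\n'
                '    apt-get install -y --no-install-recommends \\\n'
                '        {} && \\\n'
                '    rm -rf /var/lib/apt/lists/*').format(' '.join(apt))
    elif distro == 'centos':
        if not yum:
            raise ValueError('no yum packages specified')
        return ('RUN yum install -y \\\n'
                '        {} && \\\n'
                '    rm -rf /var/cache/yum/*').format(' '.join(yum))
    raise ValueError('unsupported distro: {}'.format(distro))

# Same source line, two outputs depending on the base image's distro
print(packages(apt=['g++'], yum=['gcc-c++'], distro='ubuntu'))
print(packages(apt=['g++'], yum=['gcc-c++'], distro='centos'))
```

Note that the cleanup step (removing the package cache) is baked in, which is one of the best practices the building blocks carry for you.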
USER ARGUMENTS
$ cat recipes/examples/userargs.py
…
# Set the image tag based on the specified version (default to 9.1)
cuda_version = USERARG.get('cuda', '9.1')
image = 'nvidia/cuda:{}-devel-ubuntu16.04'.format(cuda_version)
Stage0.baseimage(image)

# Set the OpenMPI version based on the specified version (default to 3.0.0)
ompi_version = USERARG.get('ompi', '3.0.0')
ompi = openmpi(infiniband=False, version=ompi_version)
Stage0 += ompi

$ hpccm --recipe recipes/examples/userargs.py --userarg cuda=9.0 ompi=2.1.2
…
$ hpccm --recipe recipes/examples/userargs.py --userarg cuda=9.2
Easy way to enable configurable recipes
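The USERARG mechanism amounts to parsing --userarg key=value pairs into a dictionary that the recipe queries with defaults. A minimal sketch (the `parse_userargs` helper is invented for illustration; hpccm does this internally):

```python
# Toy sketch of how --userarg key=value pairs become the USERARG dictionary
# a recipe reads with .get() and a default. parse_userargs is an invented
# helper, not part of hpccm's public API.

def parse_userargs(args):
    """Turn ['cuda=9.0', 'ompi=2.1.2'] into {'cuda': '9.0', 'ompi': '2.1.2'}."""
    userarg = {}
    for arg in args:
        key, sep, value = arg.partition('=')
        if not sep:
            raise ValueError('expected key=value, got: {}'.format(arg))
        userarg[key] = value
    return userarg

USERARG = parse_userargs(['cuda=9.0', 'ompi=2.1.2'])
cuda_version = USERARG.get('cuda', '9.1')    # '9.0' (given on the command line)
ompi_version = USERARG.get('ompi', '3.0.0')  # '2.1.2'
image = 'nvidia/cuda:{}-devel-ubuntu16.04'.format(cuda_version)
print(image)  # nvidia/cuda:9.0-devel-ubuntu16.04
```

Because `.get()` supplies a default, the recipe works with no user arguments at all and becomes configurable when they are supplied.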
MULTISTAGE RECIPES
$ cat recipes/examples/multistage.py
# Devel stage base image
Stage0.name = 'devel'
Stage0.baseimage('nvidia/cuda:9.0-devel-ubuntu16.04')

# Install compilers (upstream)
g = gnu()
Stage0 += g

# Build FFTW using all default options
f = fftw()
Stage0 += f

# Runtime stage base image
Stage1.baseimage('nvidia/cuda:9.0-runtime-ubuntu16.04')

# Compiler runtime (upstream)
Stage1 += g.runtime()

# Copy the FFTW build from the devel stage and setup the runtime environment.
# The _from option is not necessary, but increases clarity.
Stage1 += f.runtime(_from=Stage0.name)
Multi-stage builds are only supported by Docker
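The devel/runtime split can be sketched as plain Dockerfile-text generation. This is a toy: `multistage_dockerfile` and the placeholder build step are invented, not hpccm's API.

```python
# Toy sketch of the multi-stage idea: a devel stage compiles, a runtime
# stage copies only the installed artifacts, keeping the final image small.
# multistage_dockerfile and the placeholder build step are invented names.

def multistage_dockerfile(devel_base, runtime_base, install_dir):
    """Emit Dockerfile text with a named devel stage and a runtime stage."""
    devel = [
        'FROM {} AS devel'.format(devel_base),
        'RUN build-the-library   # placeholder for real build steps',
    ]
    runtime = [
        'FROM {}'.format(runtime_base),
        '# copy only the installed artifacts, not the build tools',
        'COPY --from=devel {0} {0}'.format(install_dir),
    ]
    return '\n'.join(devel + [''] + runtime)

print(multistage_dockerfile('nvidia/cuda:9.0-devel-ubuntu16.04',
                            'nvidia/cuda:9.0-runtime-ubuntu16.04',
                            '/usr/local/fftw'))
```

The key mechanism is `COPY --from=<stage>`: the runtime image inherits the built library without carrying compilers or source trees, which is what the `runtime()` methods of the hpccm building blocks generate for you.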
SUMMARY
• HPC Container Maker simplifies creating a container specification file
• Best practices used by default
• Building blocks included for many popular HPC components
• Flexibility and power of Python
• Supports Docker (and other frameworks that use Dockerfiles) and Singularity
• Open source: https://github.com/NVIDIA/hpc-container-maker
• pip install hpccm
BACKUP
BIG PICTURE
BUILDING AN HPC APPLICATION IMAGE
1. Use the HPC base image as your starting point
2. Generate a Dockerfile from the HPC base recipe Dockerfile and manually edit it to add the steps to build your application
3. Copy the HPC base recipe file and add your application build steps to the recipe
Analogous workflows for Singularity
1. Base recipe → Dockerfile → Base image → App Dockerfile
2. Base recipe → Dockerfile → App Dockerfile
3. Base recipe → App recipe
See https://devblogs.nvidia.com/making-containers-easier-with-hpc-container-maker/
FULL PROGRAMMING LANGUAGE AVAILABLE
# get and validate precision
VALID_PRECISION = ['single', 'double', 'mixed']
precision = USERARG.get('LAMMPS_PRECISION', 'single')
if precision not in VALID_PRECISION:
    raise ValueError('Invalid precision')

…

Stage0 += shell(commands=['make -f Makefile.linux.{}'.format(precision), …])
…
Conditional branching, validation, etc.
Courtesy of Logan Herche
$ hpccm … --userarg LAMMPS_PRECISION=single