HTPC: High Throughput Parallel Computing
Steve Cox, RENCI Engagement
Steven Cox: http://osglog.wordpress.com
Overview
The Open Science Grid (OSG)
High Throughput Parallel Computing (HTPC)
Message Passing Interface (MPI)
Producing Binaries with Static Linking
Globus Resource Specification Language (RSL)
HTPC on GlideinWMS
GPGPUs and Next Steps
Conclusions
Open Science Grid
A framework for large-scale distributed resource sharing, addressing the technology, policy, and social requirements of sharing
OSG is a consortium of software, service, and resource providers and researchers from universities, national laboratories, and computing centers across the U.S. who together build and operate the OSG project. The project is funded by the NSF and DOE and provides staff for managing various aspects of the OSG.
Brings petascale computing and storage resources into a uniform grid computing environment
Integrates computing and storage resources from over 80 sites in the U.S. and beyond
Open Science Grid
Virtual Organizations (VO) represent scientific user communities.
High Energy Physics, Structural Biology and other user communities consume hundreds of thousands of computing hours each day.
Open Science Grid
Using OSG Today
Astrophysics
Biochemistry
Bioinformatics
Earthquake Engineering
Genetics
Gravitational-wave physics
Mathematics
Nanotechnology
Nuclear and particle physics
Text mining
And more…
Objectives
Exploit parallel processing OSG resources
Simplify submission to hide targeting details
Integrate with existing submission models
Explore MPI delivery and execution
Status
8-way slots are common; 16-way is happening now
About a half dozen sites are HTPC-enabled
Implementing discoverable GIP configuration
High Throughput Parallel Computing (HTPC)
Operational Overview
Reserve an entire compute node
Use shared memory, not node interconnects
Bundle all job dependencies for portability
This requires
Submitting jobs with cluster-specific RSL
Statically linked executable and MPI tools
Launching with mpiexec on the worker node
High Throughput Parallel Computing (HTPC)
Widely used for distributed memory computation
MPI is an application programmer interface (API)
…with several versions
…and many implementations
Implementations optimize for high-speed node interconnects
But this is non-portable
GOAL: Make a predictable MPI environment on every host
Message Passing Interface (MPI)
Some of the most common include:
MPICH
MPICH2
MVAPICH
OpenMPI
Each provides a launch program (mpiexec, mpirun, etc.):
mpiexec -n 8 executable arguments
For greater portability:
Count processors using /proc/cpuinfo
Use that number in launching mpiexec
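As a concrete sketch (the executable name and input file are placeholders, not part of any HTPC framework), a portable launch wrapper might look like:
# Count the cores on this worker node from /proc/cpuinfo
NPROCS=$(grep -c '^processor' /proc/cpuinfo)
# Launch the statically linked MPI binary across all local cores
mpiexec -n "$NPROCS" ./my_mpi_app input.dat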
Message Passing Interface (MPI)
Binaries with Static Linking (MPI and App)
Binaries must contain all dependencies
True of the application and MPI tools
Choose an MPI library that’s
Easy to build
Supported by the target application
Compile and link it with -static:
export LDFLAGS=-static
make
Statically link the executable also
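As a rough sketch (the install prefix is illustrative, and configure options can differ between MPICH versions), a static MPICH build looks like:
cd mpich-1.2.7p1
# Ask the linker for static binaries throughout the build
export LDFLAGS=-static
./configure --prefix=$HOME/mpich-1.2.7-static
make
make install
# Then build the application against this MPICH and link it with -static as well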
Binaries with Static Linking (MPI and App)
Example of building MPICH 1.2.7 statically
Static MPICH / MPICH2 binaries are available
Globus Resource Specification Language (RSL)
Used to specify job parameters
Each batch scheduler is different
Condor: (condorsubmit=('+RequiresWholeMachine' TRUE)
('Requirements' 'CAN_RUN_WHOLE_MACHINE=?=TRUE'))
PBS: (jobtype=single)(queue=tg_workq)(xcount=8)
(host_xcount=1)(maxWallTime=2800)
Each cluster may have multiple queues:
(jobtype=single)(queue=tg_workq)(xcount=8)…
(jobtype=single)(queue=gpgpu)(xcount=8)…
Globus Resource Specification Language (RSL)
Each new HTPC site publishes its RSL
Condor-G job submissions include that RSL (see the sketch below)
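A minimal Condor-G submit file sketch, reusing the whole-node RSL shown earlier; the gatekeeper hostname, executable, and file names are placeholders, not a real site:
universe      = grid
grid_resource = gt2 gatekeeper.example.edu/jobmanager-pbs
globusrsl     = (jobtype=single)(queue=tg_workq)(xcount=8)(host_xcount=1)(maxWallTime=2800)
executable    = run_htpc_job.sh
transfer_input_files = pmemd.static,inputs.tar.gz
output = job.out
error  = job.err
log    = job.log
queue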
Diagram: the OSG PMEMD application stack, consisting of the Amber PMEMD 9 executable, MPI libraries (mpich-1.2.7p1 and mpich2-1.1.1p1), OSG adapter scripts, and job control scripts with common functions.
Example HTPC Application
Diagram: the job module runs on an OSG worker node (VDT, Globus, …); input and output files are staged in and out with globus-url-copy.
All files are staged in and out for the user
The framework provides static executables, runs the specified experiment and tracks and reports exit status
The framework provides an API to run PMEMD via MPI
Example HTPC Application
Diagram: the API call chain; the cpmemd_execute_experiment() and cpmemd_exec() API functions invoke cpmemd_mpi_exec(), which launches pmemd.mpich2 via mpiexec.
Researchers focus on the experiment - implement a standard entry point.
Execute PMEMD with a template driven input file; inputs and outputs from and to standard locations
Execute PMEMD with complete control over all parameters while still allowing the framework to manage the MPI launch (see the sketch below)
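As an illustration only (the entry-point name and the arguments to cpmemd_exec are assumptions, not documented framework behavior), a researcher’s job script might look like:
# Standard entry point the framework calls after staging inputs (name assumed)
cpmemd_job () {
    # Simple case: template-driven input file, standard input/output locations
    cpmemd_execute_experiment

    # Or take full control of PMEMD parameters; the framework still manages mpiexec
    # cpmemd_exec -i custom.mdin -o custom.mdout
}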
Example HTPC Application
Pilot-based job submission
Demonstrated scalability
Decentralized front-end / factory architecture
Manages RSL-to-site mapping
HTPC on GlideinWMS
Diagram: the GlideinWMS front end and Condor submit a glidein through the GlideinWMS factory to a site worker node, where the glidein executes the job; the factory maps RSL to each site.
Groups model categories of job requirements
Factory administrators configure sites and RSL
Front end admins configure job routing
Jobs use special tags to match desired groups
HTPC on GlideinWMS
Diagram: an HTPC group defined on both the GlideinWMS front end and the GlideinWMS factory; the factory’s HTPC group carries the RSL for each site.
Example submit files: a GlideinWMS Condor submit file and an OSG MatchMaker (Condor-G) submission (see the sketch below).
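A sketch of the GlideinWMS side, assuming the front end matches HTPC jobs on the whole-machine attributes shown earlier; the executable and file names are placeholders:
universe   = vanilla
executable = run_htpc_job.sh
# Tag the job so it matches whole-machine HTPC glideins
+RequiresWholeMachine = True
requirements = (CAN_RUN_WHOLE_MACHINE =?= True)
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output = pmemd.out
error  = pmemd.err
log    = pmemd.log
queue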
Putting it All Together
What
General Purpose Graphics Processing Unit
Why
Increased parallelism (Fermi => 512 cores)
How
Access typically via specialized queue
Requires compilation against CUDA
Requires custom RSL:
(jobtype=single)(queue=gpgpu)(xcount=2)(host_xcount=1)(maxWallTime=2000)
Hybrid workflows require CPUs and GPUs
GPGPUs and Next Steps
Publish RENCI Blueridge GPGPU RSL
Configure a GlideinWMS GPGPU group
Experiment with CUDA and static linking
Initial indications are this may not be supported
Explore Pegasus as a means of managing hybrid GPU/CPU workflows
Explore CampusFactory as a bridge to GPU resources
GPGPUs and Next Steps
Conclusion
High Throughput Parallel Computing with 8- and 16-way jobs is in production on the Open Science Grid
Enabling an application for HTPC involves:
Compiling and linking it statically
Compiling and linking its MPI library statically
Building a submit file targeting an HTPC group
Submitting it via GlideinWMS
HTPC GPGPU support is emerging
References
• Steve’s OSG Blog: http://osglog.wordpress.com
• HTPC Wiki
• HTPC GlideinWMS Groups
• GlideinWMS
• Message Passing Interface (MPI)