
Cray® CS-Storm™ Accelerated GPU Cluster Supercomputer

Purpose-built for the most demanding HPC, machine learning and deep learning workloads, Cray CS-Storm systems provide customers with powerful GPU-accelerator-optimized solutions.

CS-Storm Accelerated Cluster Supercomputer

Designed for Speed

Powered by NVIDIA, Intel and Nallatech accelerators and processors, each CS-Storm server node and rack system is integrated by Cray to deliver maximum performance across the broadest range of machine learning and deep learning environments.

Designed for Scale

As machine learning and deep learning become core requirements for business and scientific discovery, the need for consistent and timely processing is driving the use of supercomputer-scale systems, which are designed to handle the largest data sets and perform the most complex calculations. Cray has the depth of experience to guide organizations on the journey from AI trials and pilots to production workloads.

Designed for Simplicity

Designing, implementing and using an AI system doesn’t have to be difficult. You can rely on Cray, the expert in production supercomputing, to simplify your environment. Our systems are built on open, industry-standard technologies. We deliver a complete solution for AI including high-performance storage, system management, developer tools and validated deep learning frameworks.

But adding a modern GPU is not as simple as just plugging it in. Factors such as GPU-to-GPU communications, system scaling, file system I/O, storage, power consumption and cooling are all important considerations for organizations looking to move accelerated computing from pilot use to mainstream production.

The purpose-built Cray CS-Storm supercomputer adds to the industry's broadest range of integrated systems, ready to tackle problems that benefit from the processing power and scale available from today's GPU and FPGA accelerators.

In addition to machine and deep learning, applications like signal processing, reservoir simulation, geospatial intelligence, portfolio and trading algorithm optimization, pattern recognition and in-memory databases benefit from accelerated computing. A single high-density rack dedicated to GPU computation can deliver up to 980 TF of double-precision performance on traditional high-performance computing workloads. For machine learning, where precision can be traded for additional performance, the CS-Storm system can deliver more than 15 PF of deep learning FLOPS per rack.

Cray's CS-Storm system is designed specifically for these demanding applications and delivers the compute power to quickly and efficiently convert large streams of raw data into actionable information.
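As a rough sanity check, the two per-rack figures can be cross-checked against per-GPU peaks. The sketch below uses assumed numbers based on NVIDIA's published Tesla V100 SXM2 specifications (roughly 7.8 TF double precision and 125 TF mixed-precision tensor throughput); those per-GPU values are not taken from this datasheet:

```python
# Implied GPUs per rack from the quoted CS-Storm figures.
# Assumed per-GPU peaks (Tesla V100 SXM2, approximate public specs):
FP64_TFLOPS_PER_GPU = 7.8        # double-precision peak, TF
TENSOR_TFLOPS_PER_GPU = 125.0    # mixed-precision tensor peak, TF

RACK_FP64_TF = 980.0             # quoted double-precision per rack
RACK_DL_TF = 15000.0             # quoted ">15 PF" deep learning per rack

# Divide each quoted rack figure by the matching per-GPU peak:
gpus_from_fp64 = RACK_FP64_TF / FP64_TFLOPS_PER_GPU
gpus_from_tensor = RACK_DL_TF / TENSOR_TFLOPS_PER_GPU

print(round(gpus_from_fp64), round(gpus_from_tensor))
```

Both quoted figures imply on the order of 120+ accelerators per rack, so the double-precision and deep learning headline numbers are mutually consistent.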

Cray CS-Storm Server Node Configurations

Designed for the demands of extreme HPC processing and complete AI workflows, from data preparation to model training, classification and inference, the Cray CS-Storm series includes two unique server nodes, each designed to address a range of AI and HPC application requirements.

Cray CS-Storm 500GT System

The Cray CS-Storm 500GT configuration scales up to ten NVIDIA® Tesla® Volta or Pascal architecture GPUs or Nallatech FPGAs by leveraging the flexibility and economics of PCI Express.

Cray CS-Storm 500NX System

The Cray CS-Storm 500NX configuration scales up to eight NVIDIA Tesla Volta or Pascal architecture GPUs using NVIDIA® NVLink™ to reduce latency and increase bandwidth between GPU-to-GPU communications.

Customizable Performance

The CS-Storm supercomputer is a highly configurable cluster, able to support options specific to your individual requirements, including a range of NVIDIA Tesla accelerators tested to run at full power, multiple network and topology options for InfiniBand™ (including fat tree, single/dual rail) and Intel® Omni-Path, a customizable HPC cluster software stack, and Cray® ClusterStor™ scale-out parallel file system storage.
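To illustrate how the fat-tree topology option scales, the host capacity of a non-blocking two-level fat tree follows directly from the switch radix. The 36- and 48-port radices below are common InfiniBand switch sizes used here as assumptions for illustration, not CS-Storm specifications:

```python
# Host capacity of a non-blocking two-level fat tree built from
# identical switches with `radix` ports each.
def two_level_fat_tree_hosts(radix: int) -> int:
    down = radix // 2   # leaf ports facing hosts
    # The other half of each leaf's ports go up, one link to each
    # spine; a spine's own port count caps the leaf count at `radix`.
    leaves = radix
    return leaves * down

print(two_level_fat_tree_hosts(36))  # 648 hosts at full bisection
print(two_level_fat_tree_hosts(48))  # 1152 hosts at full bisection
```

Larger clusters use a third switch tier, at the cost of extra hops and cabling; the single/dual-rail options trade cost against injection bandwidth per node.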

The Rise of Accelerated GPU Computing

The exponential growth of data volumes is bringing new GPU-accelerated use cases to the forefront. Whether it's used for scientific research or new devices and services, artificial intelligence (AI) is a key transformation technology. Powering the rapid adoption of AI are enhanced algorithms and toolsets, designed to make the esoteric domains of data science, machine and deep learning neural networks, accessible to a wider audience. Also driving the promise of AI is the availability of powerful GPU accelerators from NVIDIA, designed to handle its unique processing challenges.

Cray HPC Cluster Software Stack

The Cray CS-Storm accelerated cluster system provides a high-performance software and hardware infrastructure for running parallel tasks with maximum efficiency. The system provides optimized I/O connectivity and flexible user login capabilities. It is available with a comprehensive HPC software stack compatible with open-source and commercial compilers, debuggers, schedulers and libraries, such as the optional Cray Programming Environment on Cluster Systems (Cray PE on CS), which includes the Cray Compiler Environment, Cray Scientific and Math Libraries, and performance measurement and analysis tools. Cray LibSci_ACC provides accelerated BLAS and LAPACK routines that generate and execute auto-tuned kernels on GPUs.

Systems Management

The CS-Storm system can be easily managed. Bright Cluster Manager™ for HPC provides network, server and cluster management capabilities with easy system administration and maintenance for scale-out environments.

Bright Cluster Manager for HPC lets customers deploy complete clusters over bare metal and manage them effectively. It provides single-pane-of-glass management for the hardware, the operating system, the HPC software and users. System administrators can quickly get clusters up and running and keep them running reliably throughout their lifecycle — all with the ease and elegance of a full-featured, enterprise-grade cluster manager.

Additionally, the NVIDIA Data Center GPU Manager (DCGM) can be used in conjunction with the Cray CS-Storm accelerated cluster to manage GPU activity on the CS-Storm system. Designed to work with NVIDIA GPUs, DCGM allows for real-time GPU resource monitoring, system policy implementation and system event diagnostics. It includes the capability to control application power efficiency with dynamic power capping on GPUs, and synchronous clock management to keep GPUs across the system in sync based on target workload and power budgets.

A Complete Deep Learning Environment

The Bright Deep Learning extension is an option that provides a comprehensive collection of machine and deep learning frameworks, libraries and tools that enable complex analytics and AI workloads on CS-Storm accelerated compute systems:

• Machine and Deep Learning Frameworks: Including Caffe, Caffe2, TensorFlow, MXNet, Microsoft Cognitive Toolkit, Torch, Theano and others.

• Machine Learning Libraries: Bright includes a selection of the most popular machine learning libraries to help you access datasets. These include MLPython, NVIDIA CUDA Deep Neural Network library (cuDNN) and Deep Learning GPU Training System (DIGITS).

• Supporting Infrastructure: Over 400 MB of Python modules that support the machine learning packages, plus NVIDIA hardware drivers, CUDA (parallel computing platform API) drivers, CUB (CUDA building blocks) and NCCL (library of standard collective communication routines).
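All of these frameworks automate the same core mechanic: computing gradients of a loss function and updating model weights. As a framework-free illustration only (plain Python, no CS-Storm or Bright specifics; every name below is hypothetical), one gradient-descent step for a single linear neuron looks like:

```python
# One gradient-descent step for a linear model y = w*x + b with
# squared-error loss, written in plain Python for illustration.
def sgd_step(w, b, x, y, lr=0.1):
    y_hat = w * x + b              # forward pass
    grad_w = 2 * (y_hat - y) * x   # dLoss/dw for (y_hat - y)**2
    grad_b = 2 * (y_hat - y)       # dLoss/db
    return w - lr * grad_w, b - lr * grad_b

# Fit y = 2x on a tiny dataset:
w, b = 0.0, 0.0
for _ in range(200):
    for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
        w, b = sgd_step(w, b, x, y, lr=0.05)

print(w, b)  # w approaches 2, b approaches 0
```

Frameworks like TensorFlow compute the gradients automatically and run steps like this across thousands of GPU cores; libraries such as cuDNN and NCCL supply the GPU kernels and multi-GPU communication underneath.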

Cray Cluster Software Stack

HPC Programming Tools

• Development & Performance Tools: Cray Compiler Environment (CCE), CrayPAT™, Cray Apprentice2, Cray® Reveal™; Intel® Parallel Studio XE Cluster Edition; Allinea MAP; PGI Cluster Development Kit; GNU Toolchain; NVIDIA® CUDA®

• Debuggers: Cray CCDB, LGDB; Intel® IDB; Allinea DDT; PGI PGDBG; Rogue Wave TotalView®; GNU GDB

• Scientific and Communication Libraries: Cray® LibSci™, LibSci_ACC, FFTW; Intel® MPI; MVAPICH2, MVAPICH2-GDR; IBM® Spectrum MPI; Open MPI

Schedulers and File Systems

• Resource Management/Job Scheduling: SLURM; Adaptive Computing Moab®, Maui, TORQUE; Altair PBS Professional®; IBM Spectrum LSF; Grid Engine

• File Systems: NFS; GPFS; Panasas PanFS®; Lustre®; local (ext3, ext4, XFS)

Deep Learning Environment (Bright for Deep Learning)

• Frameworks: Caffe, Caffe2, Caffe-MPI, Chainer, Microsoft Cognitive Toolkit, Keras, MXNet, TensorFlow, Theano, Torch

• Libraries: cuDNN, NCCL, cuBLAS

• User Access: NVIDIA® DIGITS

Cluster Management

• Bright Cluster Manager for HPC, including support for NVIDIA® Data Center GPU Manager

Operating Systems and Drivers

• Operating Systems: Linux® (Red Hat® Enterprise Linux®, CentOS)

• Drivers & Network Management: Accelerator software stack and drivers; OFED

© 2018 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray is a registered trademark of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20180326ES

Cray Inc. 901 Fifth Avenue, Suite 1000 Seattle, WA 98164 Tel: 206.701.2000 Fax: 206.701.2500 www.cray.com

Global Leader in Supercomputing

As the global leader in supercomputing, Cray's ability to turn pioneering hardware and software technologies into world-renowned supercomputing solutions is the result of decades of dedication to HPC. From technical enterprise- to petaflops-sized solutions, our systems enable tremendous scientific achievement by increasing productivity, reducing risk and decreasing time to solution.

For More Information

To learn more about the Cray CS-Storm cluster, visit www.cray.com/cs-storm or contact your Cray representative.

Cray® CS-Storm™ Specifications

Processors
• CS-Storm 500GT: Two Intel® Xeon® Scalable processors
• CS-Storm 500NX: 4U: Intel® Xeon® E5-2600 v4 family processors; 1U: dual Intel Xeon Scalable processors

Memory Capacity
• CS-Storm 500GT: Up to 2 TB DDR4 (16 x 128 GB DIMMs)
• CS-Storm 500NX: 4U: up to 3 TB DDR4 (24 x 128 GB DIMMs); 1U: up to 1.5 TB DDR4 (12 x 128 GB DIMMs), ECC, 2,666 MHz

Accelerators
• CS-Storm 500GT: NVIDIA® Tesla® V100, P40 or P100 PCIe GPU accelerators, or Nallatech FPGA accelerators; supports up to ten 400 W parts or eight 450 W parts
• CS-Storm 500NX: 4U: up to 8 NVIDIA Tesla V100 or P100 SXM2 GPU accelerators, supporting up to 300 W parts; 1U: up to 4 NVIDIA Tesla V100 GPU accelerators, optimized for GPUDirect RDMA

Drive Bays
• CS-Storm 500GT: Multiple local storage configuration options; 12 hot-swappable 2.5" drives (up to 4 NVMe); some configurations require an additional add-in storage controller; total number of drives varies by configuration
• CS-Storm 500NX: 4U: 16 hot-swappable 2.5" drives (up to 8 NVMe); 1U: 2 hot-swappable 2.5" NVMe drive bays, 4 total 2.5" HDD bays

Expansion Slots
• CS-Storm 500GT: 12 PCIe 3.0 x16 slots supporting multiple PCIe topologies and configuration options
• CS-Storm 500NX: 4U: 4 x16 low-profile (GPU tray) and 2 x8 (motherboard tray) PCIe 3.0; 1U: 2 x16 (FHFL/LP) from PLX and 2 x16 (FHFL/LP) from CPU

Power Supply
• CS-Storm 500GT: Four 2200 W AC power supplies; N+1 and N+N redundancy (limited to specific configurations)
• CS-Storm 500NX: 4U: four 2200 W AC power supplies, 2+2 redundancy, titanium-level efficiency; 1U: 2000 W titanium-level redundant power supply

Power Input
• Both systems: 200-277 VAC, 10 A max

Weight
• CS-Storm 500GT: Up to 76 lbs. (without PCIe cards)
• CS-Storm 500NX: Up to 135 lbs.

Dimensions
• CS-Storm 500GT: 5.25" H x 17.6" W x 36.4" D (3U)
• CS-Storm 500NX: 4U: 7.0" H x 17.6" W x 29" D; 1U: 1.57" H x 17.6" W x 29" D

Temperature
• Both systems: Operating: 10°C-35°C, ASHRAE class 2
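The power figures above can be combined into a simple redundancy budget check. In the sketch below, the node-level draw is an assumption for illustration only, derived from ten 400 W accelerators plus a notional 1,000 W for CPUs, memory, drives and fans; it shows why four 2200 W supplies can support N+1 redundancy on a 500GT-style node:

```python
# N+1 power-budget check for a 500GT-style node (illustrative numbers).
SUPPLY_W = 2200      # rating of each AC power supply
NUM_SUPPLIES = 4
GPU_W = 400          # assumed per-accelerator draw
NUM_GPUS = 10
HOST_W = 1000        # assumed CPUs + memory + drives + fans

node_draw = NUM_GPUS * GPU_W + HOST_W              # total node draw
n_plus_1_capacity = (NUM_SUPPLIES - 1) * SUPPLY_W  # capacity with one supply failed

print(node_draw, n_plus_1_capacity, node_draw <= n_plus_1_capacity)
```

Even with one supply failed, the three remaining supplies deliver 6,600 W against an assumed 5,000 W node draw, leaving headroom for transients; stricter N+N (2+2) redundancy is why it is limited to specific configurations.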