Cray Inc. • 901 Fifth Avenue, Suite 1000 • Seattle, WA 98164 • Tel: 206.701.2000 • Fax: 206.701.2500 • www.cray.com
© 2017 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray and the Cray logo are registered trademarks, and Cray XC is a trademark of Cray Inc. Intel Xeon, Xeon Phi and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. All other trademarks mentioned herein are the properties of their respective owners. 20171109ES
Cray® XC50™ Supercomputer
Adaptive Computing Reaches New Heights
The Cray® XC50™ supercomputer is a massively parallel processing (MPP) architecture designed for environments where engineering, research, scientific, analytics and deep learning applications can leverage the processing power of today's advanced Intel® Xeon®, Intel® Xeon Phi™ and Cavium processors and NVIDIA® Tesla® GPU accelerators. Building on Cray's adaptive supercomputing vision, the XC50 system integrates extreme-performance HPC interconnect capabilities with best-in-class processing technologies to produce a single, scalable architecture.
Cray XC50 Compute Blade
For production supercomputing and user productivity, the Cray XC50 compute blade implements two Intel Xeon Scalable processors per compute node and four compute nodes per blade. Compute blades stack 16 to a chassis, and each cabinet can be populated with up to three chassis, resulting in 384 sockets per cabinet and delivering more than 619 TF of performance per cabinet. Cray XC50 supercomputers can be configured with up to hundreds of cabinets and scaled to nearly 300 PF per system with CPU blades, and to over 500 PF per system with a combination of CPU and GPU blades.
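The cabinet arithmetic above can be verified with a quick sketch. The per-socket peak at the end is an inference from the 619 TF figure, not a published Cray number:

```python
# Sanity check of the XC50 CPU-cabinet figures quoted above.
nodes_per_blade = 4
blades_per_chassis = 16
chassis_per_cabinet = 3
sockets_per_node = 2

nodes_per_cabinet = nodes_per_blade * blades_per_chassis * chassis_per_cabinet
sockets_per_cabinet = nodes_per_cabinet * sockets_per_node

cabinet_peak_tf = 619  # datasheet figure ("more than 619 TF")
implied_tf_per_socket = cabinet_peak_tf / sockets_per_cabinet  # inferred, not published

print(nodes_per_cabinet, sockets_per_cabinet, round(implied_tf_per_socket, 2))
# 192 384 1.61
```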
CPU/GPU Compute Blades
For GPU-accelerated compute requirements, the XC50 CPU/GPU compute blade features four compute nodes, each including one Intel Xeon CPU and one NVIDIA Tesla P100 PCIe GPU. A full cabinet of 192 nodes provides up to 1 PF of peak performance.
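The 1 PF cabinet figure is consistent with the P100's published double-precision rating; this back-of-the-envelope check assumes the PCIe form factor's 4.7 TF FP64 peak:

```python
# Rough check of the CPU/GPU cabinet peak.
# Assumption: NVIDIA Tesla P100 PCIe FP64 peak of 4.7 TF per GPU.
gpu_nodes_per_cabinet = 192
p100_fp64_tf = 4.7

gpu_peak_tf = gpu_nodes_per_cabinet * p100_fp64_tf
print(round(gpu_peak_tf, 1))  # 902.4
```

The result matches the "902 TF GPU" line in the specifications table; adding the 141 TF host contribution brings the total to roughly 1.04 PF per cabinet.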
Network Topology and Interconnect Designed for System Scale and Performance
The Cray XC50 supercomputer utilizes the performance-optimized Cray Aries interconnect technology. This innovative intercommunication technology, implemented with a high-bandwidth, low-diameter network topology called Dragonfly, provides substantial improvements on all the network performance metrics for HPC: bandwidth, latency, message rate and more. Delivering unprecedented global bandwidth scalability at reasonable cost across a distributed memory system, this network gives programmers global access to all the memory of parallel applications and supports the most demanding global communication patterns.
Performance and Insight — Comprehensive Programming Environment
The Cray Programming Environment is designed to drive maximum computing performance and enable productive programmability. This feature-rich, flexible software environment facilitates the development of massively scalable applications, targeting optimization for intra-node performance as well as system scaling across a multitude of nodes and compute cabinets.
Scalable Software Advantages
The Cray XC50 system delivers the latest Cray Linux Environment (CLE), a suite of high-performance software including a Linux-based operating system designed to scale efficiently and run large, sophisticated HPC applications. When running highly scalable applications, Cray's Extreme Scalability Mode (ESM) ensures operating system services do not interfere with application performance. Real-world applications have proven this optimized design scales to hundreds of thousands of cores, and it is capable of scaling to more than one million cores.
To maximize workload throughput on your supercomputing system, Cray provides a robust system management stack with a broad variety of utilities and support.
Cray® XC50™ Specifications
Processor
Intel® Xeon®, Intel® Xeon® Scalable, Intel® Xeon Phi™ and Cavium ThunderX2™ processors
NVIDIA® Tesla® K40 and P100 GPU accelerators
Memory
CPU blades: 96-384 GB per node; CPU/GPU blades: 32-128 GB DDR4 host + up to 16 GB HBM2 accelerator
Memory bandwidth: CPU blades up to 256 GB/s per node; CPU/GPU blades up to 76.8 GB/s per node, HBM2 up to 720 GB/s per node
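The "up to 256 GB/s per node" CPU-blade figure lines up with a dual-socket Xeon Scalable node populated with DDR4-2666. The channel count and DRAM speed in this sketch are assumptions, not datasheet values:

```python
# Hypothetical derivation of the CPU-blade memory bandwidth figure.
# Assumptions: 6 DDR4-2666 channels per socket, 2 sockets per node.
transfers_per_s = 2666e6   # DDR4-2666: 2666 MT/s
bytes_per_transfer = 8     # 64-bit memory channel
channels_per_socket = 6
sockets_per_node = 2

node_bw_gb = (transfers_per_s * bytes_per_transfer
              * channels_per_socket * sockets_per_node) / 1e9
print(round(node_bw_gb))  # 256
```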
Compute Cabinet
Up to 192 compute nodes per cabinet
Peak performance: CPU blades > 600 TF per cabinet; CPU/GPU blades up to 902 TF GPU plus 141 TF host processor, up to 1.04 PF total per cabinet
Interconnect
1 Aries™ routing and communications ASIC per 4 compute nodes
48 switch ports per Aries chip (500 GB/s switching capacity per chip)
Dragonfly interconnect: low-latency, high-bandwidth topology
System Administration
Cray System Management Workstation (SMW)
Single-system view for system administration
System software rollback capability
Reliability Features (Hardware)
Integrated Cray Hardware Supervisory System (HSS)
Independent, out-of-band management network
Full ECC protection of all packet traffic in the Aries network
Redundant power supplies; redundant voltage regulator modules
Redundant paths to all system RAID
Hot swap blowers, power supplies and compute blades
Integrated pressure and temperature sensors
Reliability Features (Software)
HSS system monitors operation of all operating system kernels
Lustre® file system object storage target failover; Lustre metadata server failover
Software failover for critical system services including system database, system logger and batch subsystems
NodeKARE (Node Knowledge and Reconfiguration)
Operating System
Cray Linux® Environment (includes SUSE Linux SLES12, HSS and SMW software)
Extreme Scalability Mode (ESM) and Cluster Compatibility Mode (CCM)
Compilers, Libraries & Tools
Cray Compiler Environment, PGI compiler, GNU compiler
Support for the ISO Fortran standard (2008) including parallel programming using coarrays, C/C++ and UPC
MPI 3.0, Cray SHMEM; other standard MPI libraries supported via CCM
Cray Apprentice and CrayPAT™ performance tools
Intel Parallel Studio Development Suite (optional)
Job Management
PBS Professional job management system
Moab Adaptive Computing Suite job management system
SLURM – Simple Linux Unified Resource Manager
External I/O Interface
InfiniBand, 40 and 10 Gigabit Ethernet, and Fibre Channel (FC)
Disk Storage
Full line of FC-, SAS- and IB-based disk arrays with support for FC and SATA
Parallel File System
Lustre; Data Virtualization Service (DVS) allows support for NFS, external Lustre and other file systems
Power
103 kW per compute cabinet, maximum configuration
Support for 480 VAC and 400 VAC computer rooms
6 kW per blower cabinet; 20 A at 480 VAC or 16 A at 400 VAC (three-phase, ground)
Cooling
Water-cooled with forced transverse airflow: 6,900 cfm intake
Dimensions (Cabinets)
H 80.25” x W 35.56” x D 76.50” (compute cabinet)
H 80.25” x W 18.06” x D 59.00” (blower cabinet)
Weight (Operational)
4,500 lbs. maximum per compute cabinet (liquid-cooled); 254 lbs./square foot floor loading
900 lbs. maximum per blower cabinet
Regulatory Compliance
EMC: FCC Part 15 Subpart B, CE Mark, CISPR 22 & 24, ICES-003, C-tick, VCCI
Safety: IEC 60950-1, TUV SUD America CB Report
Acoustic: ISO 7779, ISO 9296