HPC: On the rise, how to deploy it, and what's the future?
11th March 2020
David Barker
Founder and CTO
What is HPC?
Where did it come from?
Definitions
• Supercomputer – the class of fastest computers currently available – typically a national research resource or a multi-national company asset, ~£1m+
• MFLOPS – 10^6 FLoating-point Operations Per Second (TFLOPS = 10^12 / PFLOPS = 10^15 / etc.)
• HPC – “At least 10 times more compute power than is available on one’s desktop” – the JISC definition
• CUDA - "Compute Unified Device Architecture" - NVIDIA's parallel computing platform and programming model, which lets general-purpose calculations run on GPUs alongside the host CPU (a minimal sketch follows below)
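To put those units in context with a machine that appears later in this deck, Summit's quoted 187.6 PFLOP/s works out as:

187.6 PFLOP/s = 187.6 × 10^15 ≈ 1.9 × 10^17 floating-point operations per second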
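And for anyone who hasn't seen CUDA in practice, here is a minimal sketch of that model (standard CUDA runtime API; the names are purely illustrative): the host CPU allocates memory, copies data to the GPU, launches a kernel that runs one lightweight thread per array element, and copies the result back.

#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each GPU thread handles one element of the arrays.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                  // 1M elements
    const size_t bytes = n * sizeof(float);

    // Host-side buffers.
    float *ha = (float *)malloc(bytes);
    float *hb = (float *)malloc(bytes);
    float *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Device-side buffers, plus copies of the inputs.
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hc[0]);           // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}

Compiled with nvcc and run on any CUDA-capable card, this is the whole host-orchestrates / GPU-computes round trip that the mixed CPU/GPU machines later in this deck execute at massive scale.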
A brief history...
Colossus - 1943
• The first programmable, digital, electronic computer
• Top secret – Only broke codes
• Not overtaken at code-breaking by general-purpose CPUs until the mid-90s
A brief history...
CDC STAR-100 - 1974
• The first computer to use a vector processor
• Basic scalar instruction performance was sacrificed for vector performance
• Generally considered a failure at the time
• Most modern CPUs now include vectorised instruction sets (e.g. Intel AVX)
• GPUs' arrays of shader pipelines handle vectorised instructions natively
A brief history...
ASCI RED - 1996
• The first supercomputer to use commodity components at scale
• Massively parallel processing – used Pentium II Xeons – 1.2 TB RAM and 12.5 TB storage
• 1.068 TFLOP/s - 850 kW to run and 850 kW to cool
• Equivalent to a PUE of at least 2 by modern metrics (worked out below)!
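As a quick check on that claim: PUE is total facility power divided by IT power, so taking the quoted 850 kW to run and 850 kW to cool:

PUE = P_total / P_IT = (850 kW + 850 kW) / 850 kW = 2.0

Even before counting power-distribution losses, the facility drew double the IT load – far above the figures efficient modern data centres target.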
A brief history...
Tianhe – 1A - 2010
• 2.57 PFLOP/s (~2,406x more powerful than ASCI RED)
• First to use a mixture of CPU and GPU processors
• 14,336 Intel Xeon X5670 CPUs and 7,168 NVIDIA Tesla M2050 GPUs
• $88m cost to build, 4MW to operate
A brief history...
Summit - 2019
• 187.6 PFLOP/s (~73x Tianhe-1A)
• 2.2m cores across 4,608 nodes – each node: 2x IBM POWER9 CPUs and 6x NVIDIA GPUs
• $162m cost to build, 8.8MW to operate
• 13,889 GFLOPS/kW
HPC for the everyday
What we're seeing
Everyday HPC workloads?
• Previously the realm of national Governments, universities and research organisations
• The last 18-24 months - Increased demand for HPC at our 4D Gatwick data centre
• 4D Gatwick - Higher average power loads and deployment of rack-level cooling solutions for HPC workloads
• HPC is increasingly becoming an 'everyday' thing for some sectors but not all (Geophysics, Medical, "AI", Rapid Prototyping)
• Not everyone needs HPC – but most businesses will have a supplier that utilises it (e.g. Microsoft Insights in Office 365)
HPC – Data Centre Deployments
• 10–40 kW densities currently per rack footprint (1200mm x 800mm ≈ 0.96 sqm – see the worked per-sqm figures after this list)
• The average workload currently idles at 10 kW and draws 15–18 kW under load
• Mixture of CPU and GPU workloads
• Almost exclusively Intel and NVIDIA at the moment
• Whitebox DGX-style builds as well as "full" NVIDIA DGX deployments
• Mixtures of NVIDIA Tesla and Titan-class (Titan X / Titan Black) cards
• Intel Xeon W-series and Xeon Platinum 9282 CPUs
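Putting the quoted footprint and power figures together gives a feel for the heat each rack position must reject:

15–18 kW / 0.96 sqm ≈ 15.6–18.8 kW per sqm for a typically loaded rack
40 kW / 0.96 sqm ≈ 41.7 kW per sqm at the top end

– which is why the cooling options on the following slides become the deciding factor.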
Cooling HPC?
Options for the data centre
How to deploy – Rackmount Servers
• Majority of deployments are currently rackmount and air-cooled
• Starting to see a move to liquid-cooled CPUs and GPUs
• Rear-door coolers hooked into the main chilled-water cooling loops
How to deploy - Immersion?
• Lots of interest in immersion [Hot new topic...]
• May allow for higher operating temperatures
• 22 kW – 40 kW (some systems will go higher)
• Currently high capex for immersion, and an awkward footprint
• Harder to work with and service equipment in immersion tanks
How to deploy – Rackmount Immersion?
• Probably the future
• The most backwards-compatible of the immersion solutions
• Fits into standard 19" rack
• Plate heat-exchanger at rear of server removes waste heat
• Rapidly deployed into a data centre (such as 4D Gatwick...)
Any questions?