View
218
Download
3
Category
Preview:
Citation preview
• Robert C. Bell – CSIRO and Bureau of Meteorology
From CSIRAC to Cray (again) and onwards
CSIRO INFORMATION MANAGEMENT & TECHNOLOGY
SCIENTIFIC COMPUTING
•The past: •history of computing support for science in
Australia •The present: • the Bureau’s Cray system •The future: •computing for science
Robert C. Bell
Summary
2 |
Robert C. Bell
Dinosaurs
3 |
Robert C. Bell
4 |
History of Compute performance in CSIRO
0.0000001
0.000001
0.00001
0.0001
0.001
0.01
0.1
1
10
100
1000
10000
100000
1000000
10000000
1940 1960 1980 2000 2020
Gfl
op
/s
CSIRO Peak Computing Systems
Peak Speed(Gflop/s)
CSIRAC
CDC
CDC Cyber
CDC Cyber
Cray
NEC vector
raijin & magnus
APAC SGI
NCI Sun
Robert C. Bell 5 |
History of Computing in the Bureau
0.0001
0.001
0.01
0.1
1
10
100
1000
10000
100000
1000000
10000000
1965 1970 1975 1980 1985 1990 1995 2000 2005 2010 2015 2020
Peak P
erfo
rm
an
ce (
flo
ps/
sec)
Years
Peak Performance
IBM 360/65
FACOM/Fujistu M200 CDC/ETA 10P
Cray X-MP, Y-MP
NEC SX-4
NEC SX-5
NEC SX-6
Sun Constellation
Oracle
Cray XC40
Cray XC+
1 MegaFlop
1 PetaFlop
1 GigaFlop
1 TeraFlop
10 MF
100 MF
10 GF
100 GF
10 TF
100 TF
10 PF
Robert C. Bell 6 |
Early computing
Robert C. Bell
• CSIRAC in CSIR Radiophysics from 1949 •Moved to University of Melbourne in 1956 •Dick Jensen ran first numerical forecasts in
Australia on CSIRAC • CSIRO Computing Research – CDC 3600 (used by
Bureau until IBM 360/65 systems in 1968) • CSIRO: Csironet: CDC Cybers, then post-Csironet
Crays • Bureau: FACOMs, ETA, Crays •HPCCC: NEC
Robert C. Bell
Early history
8 |
9 | Robert C. Bell
CSIRO HPC Systems
Robert C. Bell 10 |
Robert C. Bell
11 |
•
Provision of: •Systems – compute •Storage – HSM •Software – Open Source treasury •Services – Help Desk, backups, etc •Support – people with knowledge
Robert C. Bell
History
12 |
• New data centre • Cray XC40 – widely adopted in the community • Features: • Intel Xeon Haswell processors: • FMA3: d=round(a x b + c) ; Variable clock speed
• Dense packaging: 384 processors per cabinet: 90 kW • Special compute-node Linux version: low jitter • Water cooled • Two partitions: for resiliency: compute and storage • Aries interconnect: • 48-port routers, 500 Gbyte/s; dragonfly topology
Robert C. Bell
The present: Bureau’s Cray system
13 |
Cray XC Rank1 Network
o Chassis with 16 compute blades o 128 Sockets o Inter-Aries communication over backplane o Per-Packet adaptive Routing
Robert C. Bell 14
Australis HPC
Parameter Oracle HPC
System 2015 System 2018 System
Relative Increase
Processor
Intel Xeon Sandy Bridge
6-core 2.5GHz
Intel Xeon Haswell E5-2690v3
12-core 2.6GHz
Intel Xeon
Haswell + Skylake
Increase relative to Oracle HPC
System Ngamai
Nodes 576 2,160 4,112 2015: 3.8x 2018: 7.1x
Cores 6,912 51,840 129,920 2015: 7.5x
2018: 18.8x
Aggregate
Memory 36.9 TB 276 TB 651 TB
2015: 7.5x
2018: 17.7x
Global Filesystem Technology
Oracle Lustre 1.8.8
Sonexion Lustre 2.5.1+
Sonexion Lustre 2.x
Usable Storage 214 TB 4,320 TB 8,640 TB 2015: 20.3x 2018: 60.5x
Storage
Bandwidth 16 GB/s 135 GB/s 306 GB/s
Compute
Interconnect
Mellanox Infiniband
QDR 40Gb/s
Cray Aries
93 – 157Gb/s
Cray Aries
93 – 157Gb/s
Typical Power
Use (kiloWatts) 200 kW 865 kW 1,648 kW
2015: 4.3x
2018: 8.2x
Sustain system
performance (SSP)
16 253 618 2015: 15.6x
2018: 38.1x
Top 500 Rmax Linpack (TF)
104 1663 5000+ 2015: 16.0x 2018: 50.0x
Robert C. Bell 15 |
Operational Capacity (Half of total capacity)
1.E+04
1.E+05
1.E+06
1.E+07
1.E+08
2010 2011 2012 2013 2014 2015 2016 2017
Syste
m P
erfo
rm
an
ce (
Top
50
0 H
PL R
max)
NWP Computing Capacity of National Meteorological Centres
100T
PetaFlops
10P
100P
10T
ECMWF (1.8PF)
UK Met Office
Korea Met Agency
Environment Canada
Australian Bureau of Meteorology
#1 Machine 54 Pflops China
Bureau of Meteorology (N/A)
100T
PetaFlops
10P
100P
10T
US-NWS
#500 Trend Line
#1 Trend Line US-NWS (2.5PF)
UK Met Office (2.8PF)
ECMWF
Robert C. Bell 16 |
•Systems – no more easy gains? •Storage – cloud? •Software – narrowing of options because of
architectures •Services – computer science, collaboration,
visualisation •Support – people with knowledge
Robert C. Bell
Future possibilities
17 |
•
Robert C. Bell
•
Stream2 - 2004
Robert C. Bell
•Systems – more cores, not much faster? •Limited bandwidth gains •Accelerators: for some applications •CSIRO accelerated computing program –
GPU systems – some success •Commodity market – ARM processors?
Robert C. Bell
Future systems
20 |
• Ten years ago, I wrote: • Users typically want every file kept and backed-up, and would be
happy to use only one file system, globally visible across all the systems they use, with high-performance everywhere, and infinite capacity!
• A user added that they want all of the above, at zero cost!
• Storage – cloud? Regulatory, cost and protection issues • Storage for HPC – Lustre, fast, parallel, but not very
reliable. • Object stores rather than filesystems? • Tape persists, even in cloud • CSIRO Cloud storage
Robert C. Bell
Future Storage
21 |
CSIRO area
22 | Robert C. Bell
23 | Robert C. Bell
•Explosion in consumer software but, in HPC, narrowing of options because of architectures
•Massive investment in major models •Collaboration essential •CSIRO eResearch model
Robert C. Bell
Future: Software
24 |
E-enablement
equipping the organisation to perform activities more
efficiently through electronic means. The e-enablement
layer includes general computing and storage,
networks and office productivity tools both on-
premise and remote
Consolidate infrastructure
Increase reliability/availability
Achieve economies of scale
Virtualise
eResearch
the rapid evolution of research methodologies enabled by
information technologies and tools. The eResearch layer
includes scientific computing and storage, research data
management, visualisation, advanced collaboration
technologies and productivity tools.
Expand scientific computing
Embed enterprise-level research data management
Extend collaboration/e-tools
Develop visualisation services
Robert C. Bell
IM&T eResearch Supports the Entire Science Data Workflow
Workflow Services
Research Data Services
Information Services Research
Question
Research Design
Data Collection
Processing & Analysis
Data & Workflow Archiving
Disseminate & Publish
Data Re-Use
Measure Impact
IM&T eEnablement Services: High Speed Networks, Application Development, Tele-presence, Office
Productivity, Collaboration Tools
Scientific Computing & Visualisation
Outreach & eResearch Planning
Advanced Collaboration
Robert C. Bell 26 |
• Submission History
eResearch Project Services
Robert C. Bell
0
10
20
30
40
50
60
70
80
90
100
RFP Responses by Type
Workflow
Sustainability
Data
Vizualisation
Compute
Unique
•File protection (backup) •Computer science •Collaboration •Visualisation •Buzz words: e.g. Big Data
Robert C. Bell
Future: services
28 |
•People with knowledge •Critical mass •Training – little HPC knowledge from
undergraduates (Fortran?) •Staff with commitment to science support •Bureau: Scientific Computing Services
Robert C. Bell
Future: support
29 |
• We’ve come a long way in computing for atmospheric and ocean sciences
• Major acquisition by Bureau for operational computing
• Hard to see future, but we know some dead-ends
• Forecast models running on phones?
•
Conclusion: From CSIRAC to Cray (again) and onwards
CSIRO IM&T Scientific Computing Services Robert C. Bell
t +61 3 9669 8102 | +61 3 9545 2368
e Robert.Bell@csiro.au
w www.hpsc.csiro.au
CSIRO IM&T SCIENTIFIC COMPUTING SERVICES
Thank you
Recommended