DDN Confidential. Do NOT reproduce or distribute

Paving the way for Exascale: Lessons learned from I/O accelerators

Extreme Scale Demonstrator, Prague, May 2016

Jean-Thomas Acquaviva, DDN

ddn.com © 2016 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others.

Any statements or representations around future events are subject to change.

Corporate Status: DDN Advanced Technical Center

R&D centered on emerging technology programs

Paris, France

25+ R&D engineers

I/O Bandwidth Requirements

As seen from checkpoint/restart needs

Bandwidth needs of next-generation pre-Exascale systems

Rules of thumb:

1/ Checkpointing should take less than 6 minutes per hour

2/ Checkpointing means draining half of system memory

Pre-Exascale system:

4 PB of memory → bandwidth requirement of 5.6 TB/s

Oak Ridge lab., Teng Wang, Weikuan Yu et al., “An Efficient Distributed Burst Buffer for Linux”, LUG 2014
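
The 5.6 TB/s figure follows from the two rules of thumb: half of 4 PB must be drained in a 6-minute window. A minimal Python sketch of that arithmetic, assuming decimal units (1 PB = 1000 TB); the function name and its parameters are illustrative and do not come from the cited talk:

```python
def checkpoint_bandwidth_tb_s(memory_pb, drained_fraction=0.5, window_minutes=6.0):
    """Bandwidth in TB/s needed to drain a fraction of memory within the window."""
    data_tb = memory_pb * 1000.0 * drained_fraction  # PB -> TB, decimal units (assumption)
    window_s = window_minutes * 60.0                 # rule 1: 6 minutes per hour
    return data_tb / window_s


if __name__ == "__main__":
    # 4 PB system: 2000 TB to drain in 360 s
    print(f"{checkpoint_bandwidth_tb_s(4.0):.2f} TB/s")
```

Changing `window_minutes` or `drained_fraction` shows how sensitive the requirement is to each rule of thumb.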

Irregular I/O Bandwidth Pressure

99% of the time the I/O sub-system is stressed below 30% of its peak bandwidth

70% of the time the system is stressed under 5% of its peak bandwidth

Argonne lab., P. Carns, K. Harms et al., “Understanding and Improving Computational Science Storage Access through Continuous Characterization”, 2011
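
Statistics of this kind can be derived from a bandwidth trace by counting how often the observed bandwidth falls under a share of the peak. The sketch below assumes equally spaced samples; the trace values and the function name are hypothetical and are not data from the cited study:

```python
def fraction_of_time_below(samples_gb_s, peak_gb_s, threshold):
    """Share of equally spaced samples whose bandwidth is below threshold * peak."""
    if not samples_gb_s:
        raise ValueError("need at least one sample")
    limit = threshold * peak_gb_s
    below = sum(1 for s in samples_gb_s if s < limit)
    return below / len(samples_gb_s)


# Hypothetical trace in GB/s: mostly idle, with one short burst.
trace = [1.0] * 7 + [10.0] * 2 + [90.0]
peak = 100.0
print(fraction_of_time_below(trace, peak, 0.05))  # share of time under 5% of peak
print(fraction_of_time_below(trace, peak, 0.30))  # share of time under 30% of peak
```

A storage system sized for the rare burst sits largely idle the rest of the time, which is the irregular pressure this slide describes.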

What is IME?

Distributed Virtually Shared Coherent Array of SSDs

SSDs reshuffle the parameters (HDD → SSD):

Latency / 40: 4 ms → 0.1 ms

Bandwidth x 3: 150 → 450 MB/s

Capacity / 8: 8 → 1 TB

Cost x 10: $0.05/Gbit → $0.04

What can we do with a costly, high-bandwidth, low-latency technology?

Dealing with System Complexity

Sporadic I/O traffic leads to difficult routing

Courtesy Philip Brighten, U. Illinois

State of the Art Storage Architecture

Courtesy Philip Brighten, U. Illinois

1 PF compute cluster

Burst buffer: DDN Infinite Memory Engine (IME) at 500 GB/s performance

Scratch: DDN EXAScaler (Lustre), 4 PB at 200 GB/s performance

EDR InfiniBand network

GS-WOS Bridge

PFS stats collection & monitoring

Home & nearline: DDN GRIDScaler, 3.5 PB at 100 GB/s performance

WOS over 10GbE

NAS gateways & data transfer nodes

Archive: 5 PB DDN WOS object storage

MetroX link to remote login nodes at NUS

MetroX link to remote login nodes at NTU
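
To get a feel for the bandwidth figures quoted above for IME, EXAScaler and GRIDScaler, the following Python sketch computes the ideal time to write a checkpoint to each tier. The 100 TB checkpoint size is a hypothetical value chosen for illustration, and the model ignores contention and protocol overhead:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    bandwidth_gb_s: float  # aggregate bandwidth quoted on the slide


TIERS = [
    Tier("IME burst buffer", 500.0),
    Tier("EXAScaler (Lustre) scratch", 200.0),
    Tier("GRIDScaler home & nearline", 100.0),
]


def seconds_to_write(size_tb, tier):
    """Ideal time to write size_tb terabytes at the tier's full bandwidth."""
    return size_tb * 1000.0 / tier.bandwidth_gb_s  # TB -> GB, decimal units


checkpoint_tb = 100.0  # hypothetical checkpoint size, not from the slides
for tier in TIERS:
    print(f"{tier.name}: {seconds_to_write(checkpoint_tb, tier):.0f} s")
```

Under these assumptions the burst buffer absorbs the same checkpoint 2.5 times faster than the scratch file system, which can then be filled in the background.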

Exascale as a System of Systems

I/O proxies act as traffic aggregators: routing is easier

Figure labels: I/O proxy, Storage, Multicore, Manycore, GPU
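
The aggregation idea can be illustrated with a toy proxy that buffers many small client writes and forwards them to storage as a few large requests. This is a simplified sketch, not DDN's implementation: the `IOProxy` class, its `flush_bytes` threshold and the list used as a stand-in backend are all hypothetical.

```python
class IOProxy:
    """Buffers small client writes and forwards them to storage in large batches."""

    def __init__(self, backend_write, flush_bytes=1 << 20):
        self._backend_write = backend_write  # callable taking one bytes object
        self._flush_bytes = flush_bytes      # batch size that triggers a flush
        self._pending = []
        self._pending_size = 0

    def write(self, client_id, payload):
        # client_id mirrors a multi-client interface; this sketch does not use it.
        self._pending.append(payload)
        self._pending_size += len(payload)
        if self._pending_size >= self._flush_bytes:
            self.flush()

    def flush(self):
        # Send everything buffered so far as a single backend request.
        if self._pending:
            self._backend_write(b"".join(self._pending))
            self._pending = []
            self._pending_size = 0


backend_requests = []  # stand-in for the storage system
proxy = IOProxy(backend_requests.append, flush_bytes=4096)
for client in range(64):              # 64 hypothetical compute nodes
    proxy.write(client, b"x" * 256)   # each issues one small 256-byte write
proxy.flush()                         # drain whatever is left
print(len(backend_requests), "backend requests for 64 client writes")
```

The storage side sees a handful of large requests from one endpoint instead of many small ones from every compute node, which is what simplifies routing.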

From Monitoring to Orchestration

DDN DIO-pro 2016

Temporal and Spatial Patterns...

Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal, 10 (3): 168-192 (1971)

[Figure; horizontal axis: Time]

Temporal and Spatial Patterns... are here to stay

Source: Storage Models: Past, Present, and Future. Dries Kimpe and Robert Ross, Argonne National Laboratory
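
One simple way to make temporal and spatial patterns concrete is to scan an access trace and count how often an access follows the previous block (spatial) and how often it revisits a block seen earlier (temporal). The sketch below is an illustration only; the metric definitions, the function name and the trace are assumptions, not taken from the cited works:

```python
def locality_profile(offsets, block_size):
    """Return (sequential_fraction, reuse_fraction) for a list of byte offsets."""
    if len(offsets) < 2:
        raise ValueError("need at least two accesses")
    sequential = 0
    reused = 0
    seen = {offsets[0]}
    for prev, cur in zip(offsets, offsets[1:]):
        if cur == prev + block_size:  # directly follows the previous block: spatial pattern
            sequential += 1
        if cur in seen:               # block was accessed earlier: temporal pattern
            reused += 1
        seen.add(cur)
    transitions = len(offsets) - 1
    return sequential / transitions, reused / transitions


# Hypothetical trace: the same four 4 KiB blocks are read twice in order.
trace = [0, 4096, 8192, 12288, 0, 4096, 8192, 12288]
print(locality_profile(trace, 4096))
```

A cache or burst buffer benefits from a high reuse fraction, while prefetching benefits from a high sequential fraction.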

Conclusion: Storage Evolution

Storage getting closer to the CPU

Mechanically, the same needs will arise

Tools convergence

Access latency puts pressure on the software design

→ window of opportunity for a drastic redesign

Early IME I/O Accelerator feedback

Harnessing distributed HW resources

→ From Fault tolerance to QoS

Hierarchical storage

→ Narrowing the gap between Storage and Processing

→ Inter-Operable

→ Software-only solutions are versatile

→ System wide profiling

→ Data policies

→ Orchestration by job scheduler

System of Systems: Think Out of the Box!

Merci !

Thank you ! Grazie

Gracias

спасибо

ありがとう

谢谢