Advanced Storage Technologies for High Performance Computing
Sorin Faibish, EMC NAS Senior Technologist
IDC HPC User Forum, April 14-16, Norfolk, VA
IDC HPC User Forum 2008
New HPC Storage Intensive Applications
Storage Challenges*
– New algorithms that can scale to search and process massive datasets;
– New metadata management of distributed data sources;
– New platforms that provide uniform high-speed memory access to multi-terabyte data structures;
– Hybrid interconnect architectures to process and filter multi-gigabyte data streams from scientific instruments;
– High-performance, high-reliability, petascale distributed file systems;
– New approaches to software mobility, so that algorithms can execute on nodes where the data resides;
– Flexible and high-performance software integration technologies running on diverse computing platforms;
– Data signature generation techniques for data reduction and rapid processing.
*Computer Magazine: http://www.computer.org/portal/cms_docs_computer/computer/homepage/0408/R4gei.pdf
New Storage Technologies for HPC
Storage Technologies
– Virtualization to address the multi-core problem
– CDP and memory snapshots to address storage failures during computation
– DR and distributed cache appliances to address computation across geographies
– SSD disk technology to address Data Intensive Super Computing tasks as well as decrease power consumption of storage
– pNFS and RDMA technologies to increase the I/O speeds and reduce computation cycles
Storage at Previous HPC User Forum
New Concept – Better Utilization of multi-cores
Current Implementation
– Application split on multiple single-core SMP HW
– Use middleware SW (Platform)
New Concept – Better Utilization of multi-cores
Dual-core support added
– Application modified to support SMP dual core
– CPU used: 4x 100% (100%)
– Licenses paid: 4
– Licenses used: 4
New Concept – Better Utilization of multi-cores
Quad-core chips appear
– CPU used: 4x 100% (4/8=50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or
New Concept – Better Utilization of multi-cores
Quad-core chips appear
– CPU used: 4x 100% (50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or
– Use VM with CPU affinity
– CPU used: 8x80% (80%)
– Licenses used: 8
New Concept – Better Utilization of multi-cores
N-core chips are coming
– Use VM with VT support
– CPU used: 2xNx90% (90%)
– Licenses paid=used: 2xN
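The utilization percentages on the multi-core slides follow from one ratio: busy cores times per-core load, divided by the total cores paid for. The sketch below reproduces the slide figures. The function name and the choice of N = 16 are illustrative assumptions, not part of the presentation.

```python
def overall_utilization(busy_cores: int, per_core_load: float, total_cores: int) -> float:
    """Fraction of the total CPU capacity actually in use."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return busy_cores * per_core_load / total_cores


N = 16  # hypothetical core count per chip for the "N-core" case

# (busy cores, load per busy core, total cores) taken from the slides
scenarios = {
    "dual-core, SMP-aware application": (4, 1.00, 4),
    "quad-core, unmodified application": (4, 1.00, 8),
    "quad-core, VMs with CPU affinity": (8, 0.80, 8),
    "N-core, VMs with VT support": (2 * N, 0.90, 2 * N),
}

for name, (busy, load, total) in scenarios.items():
    print(f"{name}: {overall_utilization(busy, load, total):.0%}")
```

The quad-core case without virtualization shows the licensing gap the slides describe: 8 cores are paid for, but only 4 do work.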
New Concept – Better Utilization of multi-cores
Core agnostic Middleware will work with as many cores as available
– Enabled by pNFS access to shared storage
CDP + Memory Snapshots in HPC applications
[Diagram labels: HPC Application platform support; SAN; storage from Sun, IBM, HP, HDS, EMC; CDP Appliance; CDP Journal + Memory Snapshots]
CDP Technology will work with Real and Virtual Infrastructures
– VM snapshots on central storage repository
– VM and HW hosts memory snapshots
– Any SAN or NAS storage
– Recover HPC job at any point in time (last minute failure after 2 weeks run)
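The idea of recovering a job "at any point in time" can be illustrated with a toy journal that timestamps every write and replays them up to a chosen moment. This is only a sketch of the general CDP concept. The class `CdpJournal`, its methods, and the sample keys are hypothetical and do not describe any EMC product interface.

```python
from dataclasses import dataclass, field


@dataclass
class CdpJournal:
    """Toy continuous-data-protection journal: every write is logged with a timestamp."""
    entries: list = field(default_factory=list)  # (timestamp, key, value), in time order

    def record(self, timestamp: float, key: str, value: bytes) -> None:
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self.entries.append((timestamp, key, value))

    def state_at(self, timestamp: float) -> dict:
        """Rebuild the state by replaying all writes up to and including `timestamp`."""
        state = {}
        for ts, key, value in self.entries:
            if ts > timestamp:
                break
            state[key] = value
        return state


journal = CdpJournal()
journal.record(10.0, "iteration", b"100")
journal.record(20.0, "iteration", b"200")
journal.record(30.0, "iteration", b"corrupt")  # hypothetical late failure

restored = journal.state_at(29.0)  # roll back to just before the failure
print(restored)
```

Because every write is kept, the recovery point is not limited to scheduled snapshots; any timestamp before the failure is a valid restart point.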
Continuous Remote Replication in HPC
[Diagram labels: Site A and Site B, each with a SAN, heterogeneous storage (Sun, IBM, HP, HDS, EMC) and a Cache Appliance; HPC Application platform support; HPC Application remote platform; heterogeneous blades (VM+HW)]
Distributed cache engines allow distributed access to shared storage
– Remote Compute Nodes accessing the shared storage
SSD Disks in HPC applications
Solid State Disks will replace Disk Drives
– Today HPC workloads are mostly compute intensive
– Data intensive Super Computing (DISC) applications start to appear (see: IEEE Computer Magazine, April 2008)
– SSD will balance performance between DISC and compute intensive HPC applications
– EMC DMX has SSD today (25 SSD = 800K iops or 5 GB/sec)
[Diagram labels: HPC Application platform support; SAN; EMC DMX + SSD]
[Chart: Performance Normalized to Cost. Y-axis: Price/Performance (0 to 2); X-axis: ext2, ext3, Reiserfs, DualFS]
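The aggregate DMX figure quoted on this slide (25 SSD = 800K iops or 5 GB/sec) can be broken down per drive with simple division. The sketch below assumes decimal units (1 GB = 1000 MB) and treats the two numbers as separate peak figures rather than simultaneous ones; the constant names are illustrative.

```python
SSD_COUNT = 25
TOTAL_IOPS = 800_000          # aggregate figure from the slide
TOTAL_BANDWIDTH_GB_S = 5.0    # aggregate figure from the slide

# Assumption: decimal units, 1 GB = 1000 MB
iops_per_ssd = TOTAL_IOPS / SSD_COUNT
mb_s_per_ssd = TOTAL_BANDWIDTH_GB_S * 1000 / SSD_COUNT

print(f"Per SSD: {iops_per_ssd:,.0f} IOPS or {mb_s_per_ssd:.0f} MB/s")
```

Under these assumptions each drive contributes 32,000 IOPS or 200 MB/s.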
pNFS addresses the storage access issues
– Remove servers layer between CE and shared storage
– Separates MD traffic from Data Traffic
– Asymmetric storage architectures increase scalability
– SSD increase I/O speed
[Diagram: HPC Architecture. Labels: HPC Jobs; Middleware; Compute Engines; Connectivity; NFS Servers; pNFS; Connectivity; SSD Storage. Callout: Storage must be Networked]
pNFS will deliver very high I/O speeds to HPC
MD is directed to the single MD server
Data is served by storage servers or storage arrays directly from host to storage
Storage access controlled by iSCSI
I/O to native IB or 10G storage redirected via RDMA in HW
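The split between metadata and data paths described above can be sketched with a toy in-memory model: the client makes one request to the metadata server for a layout, then moves stripes directly to and from the data servers in parallel. This is not the NFSv4.1 wire protocol. All class names, function names, and the 4-byte stripe size are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny on purpose, for illustration only


class DataServer:
    """Stores stripe data only; knows nothing about file layouts."""
    def __init__(self):
        self.stripes = {}  # (file name, stripe index) -> bytes

    def read(self, name, index):
        return self.stripes[(name, index)]


class MetadataServer:
    """Holds only layouts: which data server owns which stripe. Never touches data."""
    def __init__(self, num_data_servers):
        self.num_data_servers = num_data_servers
        self.layouts = {}  # file name -> list of data server ids, one per stripe

    def allocate_layout(self, name, size):
        stripe_count = -(-size // STRIPE_SIZE)  # ceiling division
        layout = [i % self.num_data_servers for i in range(stripe_count)]  # round-robin
        self.layouts[name] = layout
        return list(layout)

    def get_layout(self, name):
        return list(self.layouts[name])


def client_write(mds, data_servers, name, payload):
    layout = mds.allocate_layout(name, len(payload))  # control path
    for index, server_id in enumerate(layout):        # data path, direct to storage
        chunk = payload[index * STRIPE_SIZE:(index + 1) * STRIPE_SIZE]
        data_servers[server_id].stripes[(name, index)] = chunk


def client_read(mds, data_servers, name):
    layout = mds.get_layout(name)  # control path: a single metadata request
    with ThreadPoolExecutor(max_workers=len(data_servers)) as pool:
        # data path: stripes are fetched in parallel, results come back in order
        chunks = pool.map(
            lambda item: data_servers[item[1]].read(name, item[0]),
            enumerate(layout),
        )
        return b"".join(chunks)


servers = [DataServer() for _ in range(3)]
mds = MetadataServer(num_data_servers=len(servers))
client_write(mds, servers, "results.dat", b"petascale file systems")
data = client_read(mds, servers, "results.dat")
print(data)
```

In this model, adding data servers increases the number of stripes that can be read at once, while the metadata server handles only one small request per file.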
[Diagram labels: iSCSI (iSER); NFS (pNFS); NFS/pNFS file systems; Storage array; Data path; Control path; Native IB Storage Array Cache; MetaData Cache; CE Cache; RDMA]
pNFS with Infiniband RDMA value added to HPC