One Optimized I/O Configuration per HPC Application

Leveraging I/O Configurability of Amazon EC2 Cloud

Mingliang Liu, Jidong Zhai, Yan Zhai, Tsinghua University

Xiaosong Ma, North Carolina State University, Oak Ridge National Laboratory

Wenguang Chen, Tsinghua University

APSys 2011, July 12

Slides: hpc.cs.tsinghua.edu.cn/ACIC/files/ACIC_APSys2011_Slides.pdf

Introduction

Background

I/O becomes the bottleneck for many HPC applications

  Intensive I/O operations and concurrency

  One-size-fits-all I/O configuration

There is a trend to migrate HPC applications from traditional platforms to the cloud

The cloud provides tremendous flexibility in configuring the I/O system

  Fully controlled virtual machines

  Easily deployed user scripts

  Multiple types of low-level devices

  Online device acquisition and migration

Motivation

The Problem

Can we employ I/O configurability of cloud for HPC apps?

Configurability lies in:

  Setting up a specific file system at startup

  Exploring different types of low-level devices

  Tuning the file system's inherent parameters

Challenges:

  Feasibility is highly workload-dependent

  Tradeoff between efficiency and cost-effectiveness

Amazon EC2 CCI

CCI: Cluster Computing Instance

Amazon’s solution for HPC in the cloud

Quad-core Intel Xeon X5570 CPU, with 23 GB of memory

Interconnected by 10 Gigabit Ethernet

Amazon Linux AMI, a Red Hat family OS, with Intel MPI

Local block storage (ephemeral), 2 × 800 GB

Elastic Block Store (EBS), attached as block storage devices

Simple Storage Service (S3), key-value based object storage

Storage System Options in the Amazon EC2 CCI

File System Selection

Different I/O access patterns and concurrency levels require different kinds of file system: shared vs. parallel

NFS is enough for low I/O demands and is simple to deploy

A parallel file system (e.g. PVFS, Lustre) can be employed to

  support large, shared file writes

  scale up well

Easy to choose and set up

  118 LOC of bash for NFS, and 173 LOC of bash for PVFS

Device Selection Considerations

Devices differ in levels of abstraction and access interfaces.

Storage     Pros                                                     Cons
S3          Off-the-shelf, designed for Internet and database apps   No POSIX
Ephemeral   Free                                                     Non-persistent, only two disks
EBS         Persistent beyond instances; more disks available        Charged

The choice depends on the needs of individual applications

File System Internal Parameters

Option                  Description
Sync mode               NFS sync vs. async write modes
Device number           Combining multiple disks into a software RAID0
I/O server number       NFS single-server bottleneck; PVFS can employ many I/O servers
Metadata distribution   PVFS can distribute metadata to multiple servers
I/O server placement    Part-time vs. dedicated I/O servers
Data striping           Striping factor, unit size

Again

The choice depends on the needs of individual applications
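
To give a feel for how quickly these options multiply, here is a minimal sketch that enumerates a small configuration space. The class name `IOConfig`, the function `enumerate_configs`, the specific option values, and the choice to treat sync mode as an NFS-only setting are all assumptions made for illustration; disk count and striping parameters are left out to keep the example short.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class IOConfig:
    """One candidate I/O configuration (hypothetical structure, for illustration)."""
    file_system: str   # "NFS" or "PVFS"
    device: str        # "ephemeral" or "EBS"
    sync_mode: str     # "sync" / "async" for NFS, "n/a" for PVFS (assumption)
    num_servers: int   # number of I/O servers
    placement: str     # "dedicated" or "part-time"


def enumerate_configs(pvfs_server_counts=(1, 2, 4)):
    """List every combination of the options considered in this sketch."""
    configs = []
    devices = ("ephemeral", "EBS")
    placements = ("dedicated", "part-time")
    for device, placement in product(devices, placements):
        # NFS is limited to a single I/O server, but has two write modes.
        for sync_mode in ("sync", "async"):
            configs.append(IOConfig("NFS", device, sync_mode, 1, placement))
        # PVFS can spread I/O over several servers.
        for n in pvfs_server_counts:
            configs.append(IOConfig("PVFS", device, "n/a", n, placement))
    return configs


if __name__ == "__main__":
    for cfg in enumerate_configs():
        print(cfg)
```

Even this reduced space has 20 candidates, which suggests why trying every configuration by hand for each application becomes impractical.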

Dedicated NFS, Ephemeral Disks

[Diagram: computing instances connected over 10 Gigabit Ethernet to one NFS server with ephemeral disks]

There is 1 NFS I/O server deployed in one dedicated instance, mounting two ephemeral disks.

Part-time NFS Mounting Ephemeral Disks

[Diagram: computing instances connected over 10 Gigabit Ethernet; one of them also runs the NFS server with ephemeral disks]

There is 1 NFS I/O server deployed part-time in one of the computing instances, mounting two ephemeral disks.

Dedicated NFS Mounting EBS Disks

[Diagram: computing instances connected over 10 Gigabit Ethernet to one NFS server with EBS disks]

There is 1 NFS I/O server deployed in one dedicated instance, with 8 EBS disks combined into a RAID0.

1 Dedicated PVFS I/O Server

[Diagram: computing instances connected over 10 Gigabit Ethernet to one PVFS server with ephemeral disks]

There is 1 PVFS I/O server deployed in one dedicated instance.

2 Dedicated PVFS I/O Servers

[Diagram: computing instances connected over 10 Gigabit Ethernet to two PVFS servers]

There are 2 PVFS I/O servers deployed in two dedicated instances.

4 Dedicated PVFS I/O Servers

[Diagram: computing instances connected over 10 Gigabit Ethernet to four PVFS servers]

There are 4 PVFS I/O servers deployed in dedicated instances, which leaves those instances underutilized.

4 Part-time PVFS I/O servers

[Diagram: computing instances connected over 10 Gigabit Ethernet; four of them also run a PVFS server on their ephemeral disks]

There are 4 PVFS I/O servers deployed in the computing instances, working part-time.

Options: sync mode and device

[Plot: write bandwidth (MB/s, 0 to 350) vs. block size (16 MB to 4 GB) for EBS-sync, Ephemeral-sync, EBS-async, and Ephemeral-async]

Tested with the IOR benchmark using 16 processes on 2 nodes.

Observations:

  No obvious difference between EBS and ephemeral

  Async mode favors small block sizes

NFS Write Bandwidth

[Plot: write bandwidth (MB/s, 0 to 200) vs. number of processes (8 to 256) for 1, 2, 4, and 8 disks]

Combining 1, 2, 4, and 8 EBS disks into a software RAID0

RAID0 does not scale well, possibly because of the virtualization layer
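
As background for why more disks would be expected to help, the sketch below shows the round-robin address mapping that RAID0 striping is based on: consecutive chunks land on consecutive disks, so a large sequential write is spread over all of them. The function name `raid0_locate` and the 64 KiB chunk size are assumptions for illustration; this is not the Linux software RAID implementation.

```python
def raid0_locate(logical_offset, num_disks, chunk_size):
    """Map a byte offset in the striped volume to (disk index, offset on that disk)."""
    chunk_index = logical_offset // chunk_size   # which chunk of the volume
    disk = chunk_index % num_disks               # chunks rotate over the disks
    stripe = chunk_index // num_disks            # full rows completed before this chunk
    disk_offset = stripe * chunk_size + logical_offset % chunk_size
    return disk, disk_offset


if __name__ == "__main__":
    chunk = 64 * 1024  # assumed chunk size
    for offset in (0, chunk, 2 * chunk, 8 * chunk + 10):
        print(offset, raid0_locate(offset, num_disks=8, chunk_size=chunk))
```

In an ideal setting each disk serves every eighth chunk independently, which is the scaling the measurements above fail to show.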

Preliminary Applications Results

BTIO: CLASS=C, SUBTYPE=full

[Plot: total execution time (s, 0 to 170) vs. number of processes (16, 36, 64, 81, 100, 121) for NFS-Dedicated, NFS-Parttime, PVFS-1-Server, PVFS-2-Server, PVFS-4-Server, and PVFS-4-Server-Part]

PVFS outperforms the NFS configurations in all cases

PVFS scales up by adding more I/O servers

Part-time I/O servers provide fairly good performance

BTIO: Total Cost Analysis

Cost calculation

$\mathrm{Cost}(\$) = \mathrm{Num}_{\text{computing instances}} \times \text{Run time} \times 1.6 / 3600$

[Plot: total cost ($, 0.0 to 0.9) vs. number of processes (16, 36, 64, 81, 100, 121) for NFS-Dedicated, NFS-Parttime, PVFS-1-Server, PVFS-2-Server, PVFS-4-Server, and PVFS-4-Server-Part]
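
The cost formula on this slide can be written as a small function. This is a sketch that assumes the constant 1.6 is a per-instance hourly price in dollars and that the run time is measured in seconds, which is what the division by 3600 suggests; the function name `run_cost_usd` and the example values are made up for illustration.

```python
HOURLY_RATE_USD = 1.6    # constant from the slide, assumed to be dollars per instance-hour
SECONDS_PER_HOUR = 3600


def run_cost_usd(num_instances, run_time_s, hourly_rate=HOURLY_RATE_USD):
    """Cost of one run: instances x run time (s) x hourly rate / 3600."""
    return num_instances * run_time_s * hourly_rate / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical run: 8 instances for 90 seconds, about $0.32.
    print(round(run_cost_usd(8, 90), 2))
```

Because the instance count multiplies the run time, a configuration that finishes sooner but needs extra instances can still cost more, which is the tradeoff the cost plot explores.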

POP: Parallel Ocean Program

Process 0 carries out all I/O tasks via the POSIX interface, which is very different from BTIO

POP does not scale on EC2, due to its heavy communication with small messages

[Plot: total execution time (s, 0 to 1400) vs. number of processes (16, 32, 64, 128, 196) for NFS-parttime, PVFS-2-Server, PVFS-4-Server, and PVFS-4-Server-part]

Conclusion

Future Work

The Problem

Configure the I/O system for an HPC app automatically

[Diagram: HPC apps are analyzed for their I/O patterns to obtain I/O demands; a mapping model takes the I/O demands and the I/O options (file system, devices, writing mode, number of servers, placement, ...) and yields optimized configurations]

Conclusion

Cloud enables users to build per-application I/O systems

Our preliminary results hint that

  An HPC app behaves differently with different I/O system configurations in the cloud

  The best configuration per app depends on its I/O access pattern and concurrency

Tradeoff: cost vs. efficiency

Thank you!
