
I/O Performance of Virtualized Cloud Environments

Devarshi Ghoshal, Richard S. Canon, Lavanya Ramakrishnan

Data Volumes are Increasing

• Data is a critical component of the scientific process
  – Joint Genome Institute projects about 2 PB/year
  – Large Hadron Collider (LHC) projects 16 PB/year
  – Large Synoptic Survey Telescope (LSST) projects 6 PB/year of raw data

• Similar problem: Internet data
  – Terabytes to petabytes of data per day
  – The cloud is the resource platform for these applications
    • MapReduce/Hadoop
    • On-demand virtual machines for processing

What is a Cloud?

• Cloud Infrastructure as a Service (IaaS)
  – Provisions bare-metal or virtual machine resources
  – Pay-as-you-go model
    • Different quality-of-service levels may be available
  – User is responsible for the complete software stack
  – Storage options: Elastic Block Store (EBS), ephemeral local disk, S3

Magellan

Determine the appropriate role for commercial and/or private cloud computing for DOE/SC midrange workloads.

Approach
• Deployed a distributed testbed at Argonne and NERSC to explore the use of clouds for mid-range scientific computing.
• Evaluated the effectiveness of cloud models for a wide spectrum of DOE/SC applications.

Goal

• Benchmark I/O performance across different cloud and HPC platforms to identify the major bottlenecks
  – Network- and I/O-related
  – Identified through prior Magellan research and other related efforts

• I/O-intensive workloads require
  – Low disk latency
  – High bandwidth
  – A high-speed interconnection network
  – Scalability

• Supercomputing centers provide specialized infrastructure, compared to the cloud's commodity infrastructure

Related Work

• Disk I/O and throughput on Amazon EC2
  – I/O benchmarking for the different storage options available on Amazon's EC2 cloud infrastructure
  – Not sufficient data to understand the impact on HPC applications

• Performance analysis of HPC applications on Amazon EC2
  – Focused on determining the performance of HPC applications in the cloud
  – No work on the effect of virtualization on I/O

NERSC Configuration

• NERSC machine description
  – 720-node IBM iDataPlex cluster
    • 40 nodes used for I/O benchmarking
  – Each node has two quad-core Intel Nehalem processors running at 2.67 GHz and 24 GB RAM
  – 4X Quad Data Rate (QDR) InfiniBand interconnect
  – Batch queue system
  – GPFS shared file system, with a peak performance of 15 GB/s
  – VM instance type: c1.xlarge

Amazon EC2 Configuration

Instance type     API name      CPU Family              EC2 Compute Units   Memory (GB)   Local Storage (GB)   Expected I/O Performance
Small             m1.small      Intel Xeon E5430         1                  1.7            160                 Moderate
Large             c1.xlarge     Intel Xeon E5410        20                  7             1690                 High
Cluster-compute   cc1.4xlarge   2 x Intel Xeon X5570    33                 23             1690                 Very High

Benchmarks

• IOR (Interleaved Or Random) benchmark
  – Tests the performance of parallel file systems
  – Measures read and write performance
  – I/O types: direct, buffered
  – Low benchmarking overhead

• Timed I/O benchmark (sketched below)
  – Performance measurements over a fixed duration
  – Results reported at specific time intervals
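To make the timed benchmark concrete, here is a minimal C sketch of the idea — not the authors' actual harness. It writes fixed-size blocks for a fixed duration and reports throughput at regular intervals; the file name, block size, duration, and interval are illustrative assumptions.

    /* timed_write.c: minimal timed write benchmark (illustrative sketch).
       Writes 4 MiB blocks for DURATION seconds and prints throughput
       every INTERVAL seconds. Build: cc -O2 -o timed_write timed_write.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <time.h>

    #define BLOCK_SIZE (4 * 1024 * 1024)  /* 4 MiB per write() */
    #define DURATION   60.0               /* total run time, seconds */
    #define INTERVAL   5.0                /* reporting interval, seconds */

    static double now(void) {             /* monotonic wall-clock time */
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void) {
        char *buf = malloc(BLOCK_SIZE);
        memset(buf, 'x', BLOCK_SIZE);

        int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        double start = now(), last = start;
        long long bytes = 0;

        while (now() - start < DURATION) {
            if (write(fd, buf, BLOCK_SIZE) != BLOCK_SIZE) { perror("write"); break; }
            bytes += BLOCK_SIZE;
            double t = now();
            if (t - last >= INTERVAL) {   /* report per-interval throughput */
                printf("%6.1f s: %8.1f MB/s\n", t - start, bytes / (t - last) / 1e6);
                last = t;
                bytes = 0;
            }
        }
        fsync(fd);  /* flush the page cache so data actually reaches disk */
        close(fd);
        free(buf);
        return 0;
    }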

Benchmark Parameters

• Types of I/O
  – Direct I/O vs. buffered I/O (see the sketch after this list)
• Virtual machine instance types
  – Small, large, and cluster-compute instances
• Storage devices
  – Local (ephemeral) stores, EBS volumes
• Location of instances
  – US-East and US-West regions
• Shared file system
  – GPFS vs. EBS volumes
• Time of run
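The direct-vs-buffered distinction comes down to whether writes pass through the kernel page cache. Here is a minimal Linux sketch of the difference, assuming a 4096-byte device block size (O_DIRECT requires the buffer address, offset, and length to be aligned to it):

    /* open_modes.c: buffered vs. direct I/O on Linux (illustrative sketch). */
    #define _GNU_SOURCE                 /* exposes O_DIRECT */
    #include <fcntl.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define ALIGN 4096                  /* assumed device block size */

    int main(void) {
        /* Buffered I/O: writes land in the page cache, flushed to disk later. */
        int fd_buf = open("buffered.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);

        /* Direct I/O: bypasses the page cache entirely. */
        int fd_dir = open("direct.dat",
                          O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);

        void *buf;
        posix_memalign(&buf, ALIGN, ALIGN);  /* O_DIRECT needs aligned memory */
        memset(buf, 'x', ALIGN);

        write(fd_buf, buf, ALIGN);      /* cached: fast until the cache fills */
        write(fd_dir, buf, ALIGN);      /* uncached: pays the device latency */

        close(fd_buf);
        close(fd_dir);
        free(buf);
        return 0;
    }

IOR can exercise both modes from its POSIX backend; classic IOR versions expose O_DIRECT through the -B option.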

Evaluation Summary

• Buffer caching (cache-flush sketch below)
• Comparison across all platforms
• Effect of regions on Amazon
• Multi-node shared file system
• 24-hour tests
• Large-scale tests
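The slides do not say how buffer-cache effects were isolated; a common approach, assumed here purely for illustration, is to flush the Linux page cache between runs:

    /* drop_caches.c: flush the Linux page cache between benchmark runs
       (illustrative; requires root). Shell equivalent:
           sync && echo 3 > /proc/sys/vm/drop_caches                    */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        sync();                          /* write dirty pages to disk first */
        FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
        if (!f) { perror("fopen"); return 1; }
        fputs("3\n", f);                 /* 3 = page cache + dentries + inodes */
        return fclose(f) == 0 ? 0 : 1;
    }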

I/O Performance: Summary

• Network bandwidth plays an important role in determining I/O performance.
• EBS performance is better on cluster-compute instances, due to their 10 Gigabit Ethernet network.

Amazon EC2 (Write Performance)

West-zone instances outperform East-zone instances in most cases.

Shared File System

High resource contention severely degrades EBS performance.

Timed Benchmark (Write)

Occasional performance drops may be attributed to the underlying shared resources.

Performance Distribution

West-zone performance varies considerably more than East-zone performance, but the average is lower.

Impact on Applications

• Performance impact on I/O
• Suitable trade-offs between EBS and local disk
  – Persistence vs. performance
  – Cost is another factor
• Application design needs to consider
  – Performance variation
  – The lack of a high-performance shared file system

Conclusions & Future Work

• Performance in VMs is lower than on physical machines
  – Clouds are not yet ready for data-intensive applications with high-performance requirements
• I/O performance on local disks is better than on EBS volumes
  – Local disks are ephemeral devices
  – Local disks are also not suitable for MPI, the model most commonly used in HPC applications
• Timed results show substantial variation
  – Further investigation required

Acknowledgements

• US Department of Energy, contract DE-AC02-05CH11231
• Krishna Muriki, Iwona Sakrejda, David Skinner

Questions?

Other related events at Supercomputing:

• At the Lawrence Berkeley booth
  – Science in the Cloud? Busting Common Myths about Clouds and Science, Tue Nov 15 at 10:30 am
  – What do Clouds mean for Science? Experiences from the Magellan Project, Tue Nov 15 at 11:15 am
• Papers
  – Evaluating Interconnect and Virtualization Performance for High Performance Computing, PMBS Workshop, Sun Nov 13 at 9 am
  – Riding the Elephant: Managing Ensembles with Hadoop, MTAGS Workshop, Mon Nov 14 at 4:30 pm
