An MPI-IO Cloud Cluster Bioinformatics Summer Project (BDT205) | AWS re:Invent 2013


DESCRIPTION

Researchers at Clemson University assigned a student summer intern to explore bioinformatics cloud solutions that leverage MPI, the OrangeFS parallel file system, AWS CloudFormation templates, and a Cluster Scheduler. The result was an AWS cluster that runs bioinformatics code optimized using MPI-IO. We give an overview of the process and show how easy it is to create clusters in AWS.


© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

An MPI-IO Cloud Cluster Bioinformatics Summer Project

Brandon Posey, Dougal Ballantyne, Boyd Wilson

November 13, 2013

Filesystems on AWS

What filesystem *MUST* you use on AWS?

The one that meets the needs of your unique application!

Some things to consider:
• Total amount of storage required?
• Resilience required?
• Expected number of clients?
• Locality of servers and clients?
• Average file sizes? (KB, MB, GB, TB)
• Block sizes used by applications?
• IO profile? Read/Write%?
• Typical IO use case?

Filesystems on AWS are all about building blocks!

Building Blocks
• Amazon Elastic Compute Cloud (Amazon EC2)
  – 1ECU to 88ECU of compute power
  – 613MB to 240GB of memory
  – Shared network, EBS optimized, dedicated 10Gb
• Amazon Simple Storage Service (Amazon S3)
  – Unlimited capacity
  – Web-scale
  – Lifecycle management


Building Blocks
• Local storage (ephemeral)
  – 150GB to 3360GB per instance
  – HDD and SSD
  – FREE! (part of instance cost)
• Amazon Elastic Block Store (Amazon EBS)
  – 1GB to 1000GB per volume
  – Standard and Provisioned IOPS
  – Multiple volumes per instance
  – Supports snapshot to Amazon S3


Storage-optimized EC2 instances
http://aws.amazon.com/ec2/instance-types/
"This family includes the HI1 and HS1 instance types, and provides you with Intel Xeon processors and direct-attached storage options optimized for applications with specific disk I/O and storage capacity requirements."
• HI1 instances feature SSD storage
• HS1 instances feature direct-attached HDD storage

Amazon EBS optimized instances
http://aws.amazon.com/ebs/
"To enable your Amazon EC2 instances to fully utilize the IOPS provisioned on an EBS volume, you can launch selected Amazon EC2 instance types as “EBS-Optimized” instances."

What Are Your Needs?
• Temporary or long-term storage?
• Shared or per instance?
• How much?
• How fast?

Long-term storage
• Use Amazon S3
• Pull datasets when needed
• Easy to access using AWS CLI or API

$ aws s3 cp s3://mybucket/dataset/input /ephemeral/input

• Lifecycle to Amazon Glacier

Temporary Storage
• Local ephemeral for scratch
• Distributed filesystem for high-performance scratch
  – OrangeFS
  – Lustre
  – Ceph
• Pull data from Amazon S3

How much?
• With Amazon S3, you pay for what you use
• With Amazon EBS, you pay for what you provision
• Keeping data in Amazon S3 and only pulling what is needed helps manage cost

How fast?
• Ephemeral storage can deliver up to 2.2GB/sec
  – more instances == more throughput
• Amazon EBS volumes support up to 4000 IOPS
  – more volumes == more IOPS
• Amazon S3 scales horizontally
  – more clients == more throughput
  – more connections == more throughput
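The "more volumes == more IOPS" and "more instances == more throughput" rules can be read as simple multiplication. The following is a minimal sketch that assumes perfectly linear scaling, which real striping and network overhead will not achieve; the function names are hypothetical, and the per-unit figures are the ones quoted on this slide.

```python
# Upper-bound estimates only: this assumes perfectly linear scaling.
EBS_MAX_IOPS_PER_VOLUME = 4000    # per-volume figure from the slide
EPHEMERAL_MAX_GB_PER_SEC = 2.2    # best-case per-instance figure from the slide


def aggregate_ebs_iops(volumes, iops_per_volume=EBS_MAX_IOPS_PER_VOLUME):
    """Ideal combined IOPS of several EBS volumes attached to one instance."""
    if volumes < 0:
        raise ValueError("volumes must be non-negative")
    return volumes * iops_per_volume


def aggregate_ephemeral_throughput(instances, gb_per_sec=EPHEMERAL_MAX_GB_PER_SEC):
    """Ideal combined ephemeral-disk throughput (GB/sec) across instances."""
    if instances < 0:
        raise ValueError("instances must be non-negative")
    return instances * gb_per_sec


def volumes_needed(target_iops, iops_per_volume=EBS_MAX_IOPS_PER_VOLUME):
    """Smallest number of volumes whose ideal combined IOPS reaches the target."""
    return -(-target_iops // iops_per_volume)  # ceiling division


if __name__ == "__main__":
    print(aggregate_ebs_iops(4))
    print(volumes_needed(10000))
```

Treat the results as ceilings for capacity planning rather than as expected performance.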

Making filesystems persist
• Use Amazon EBS for block storage
• Use Amazon EBS snapshots for recovery
• Use a replicated distributed filesystem

Automating deployments
• AWS CloudFormation
• Drive storage through parameters
• Easy to set up and tear down
• Track template changes in SCM
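One way to picture "drive storage through parameters" is a template whose volume size and IOPS are parameter references rather than literals. The sketch below generates such a template from Python. The parameter names (VolumeSizeGB, VolumeIops, AvailabilityZone), the resource name ScratchVolume, and the default values are all hypothetical and are not taken from the presenters' template.

```python
import json


def build_storage_template(default_size_gb=100, default_iops=1000):
    """Return a minimal CloudFormation template (as a dict) with one
    Provisioned IOPS EBS volume whose size and IOPS come from parameters."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical example: EBS volume sized by parameters",
        "Parameters": {
            # Hypothetical parameter names, chosen for illustration only.
            "VolumeSizeGB": {"Type": "Number", "Default": default_size_gb,
                             "MinValue": 1, "MaxValue": 1000},
            "VolumeIops": {"Type": "Number", "Default": default_iops,
                           "MaxValue": 4000},
            "AvailabilityZone": {"Type": "String"},
        },
        "Resources": {
            "ScratchVolume": {
                "Type": "AWS::EC2::Volume",
                "Properties": {
                    "AvailabilityZone": {"Ref": "AvailabilityZone"},
                    "Size": {"Ref": "VolumeSizeGB"},
                    "VolumeType": "io1",
                    "Iops": {"Ref": "VolumeIops"},
                },
            }
        },
    }


if __name__ == "__main__":
    # The JSON output is what would be checked into source control.
    print(json.dumps(build_storage_template(), indent=2, sort_keys=True))
```

Because the storage settings live in the Parameters section, the same template can be launched with different sizes without editing the resource definitions.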

Solutions on AWS
• OrangeFS from Omnibond
• Red Hat Storage 2.0
• Intel Cloud Edition Lustre - Private Beta

Customer presentation

RNA-Seq Differential Gene Expression Workflow

Clemson University professor Dr. Alex Feltus had been discussing ways to optimize the gene expression workflow with Eddie Duffy and Dr. Barr Von Oehsen. As a result, a summer project with Brandon Posey was started to carry out this optimization in the AWS cloud. The FastQ steps took the longest to process, so the optimization started there.

*Workflow chart provided with permission from Allele Systems (www.allelesystems.com)

OrangeFS – Scalable Parallel File System on AWS

Available on the AWS Marketplace and brought to you by Omnibond

[Diagram labels: OrangeFS Instance; Unified High Performance File System; Amazon DynamoDB; Amazon EBS volumes]

Cloud Cluster Built using AWS, Torque/Maui, OrangeFS

Optimization Areas
• Data uploaded and retrieved via the OrangeFS WebDAV interface
• MPI jobs are submitted via the Torque & Maui scheduler
• All built with an AWS CloudFormation template

[Diagram labels: OrangeFS WebDAV; Torque / Maui; MPI-IO Clients; OrangeFS Servers; Amazon DynamoDB]

AWS CloudFormation Prompts

"KeyName" : {

"VpcId" : {

"VpcPublicSubnetId" : {

"NAT & OrangeFS… AccessFrom" : {

"FSConfigDDB" : {…

"WorkerConfigDDB" : {… "Type" : "AWS::DynamoDB::Table",

"CfnUser" : {… "Type" : "AWS::IAM::User",…

AWS CloudFormation – Amazon DynamoDB

"FSConfigDDB" : {

"Type" : "AWS::DynamoDB::Table",

"WorkerConfigDDB" : {

"Type" : "AWS::DynamoDB::Table",

AWS CloudFormation – IAM & Network

"instanceRootRole" : {

"instanceRootProfile" : {

"HostKeys" : {

"PrivateSubnet" : {

"PrivateRouteTable" : {

"PrivateSubnetRouteTableAssociation" : {

"PrivateNetworkAcl" : {

"NATIPAddress" : {… "Type" : "AWS::EC2::EIP",

AWS CloudFormation – Instances

"NATDevice" : {…

"Type" : "AWS::EC2::Instance",

"MasterCoordinator" : {… "Type" : "AWS::EC2::Instance",

"OrangeFSFleet" : {… "Type" : "AWS::AutoScaling::AutoScalingGroup",

"WorkerFleet" : {… "Type" : "AWS::AutoScaling::AutoScalingGroup",

"WebDavDevice" : {… "Type" : "AWS::EC2::Instance",

AWS CloudFormation – Cloud Init (python & Boto)

"sudo /usr/bin/python2.7 /home/ec2-user/TorqueMasterConfigure.py -l DEBUG -f /home/ec2-user/MasterConfig.log",
" -n ", {"Ref" : "WorkerConfigDDB"},
" -o ", {"Ref" : "FSConfigDDB"},
" -s ", {"Fn::FindInMap" : [ "ConfigParameters", "OrangeFSFleetSize", "item"]},
" -z ", {"Fn::FindInMap" : [ "ConfigParameters", "WorkerFleetSize", "item"]},
" -m ", {"Fn::FindInMap" : [ "ConfigParameters", "WorkerMaxFleetSize", "item"]},
" -p ", {"Fn::FindInMap" : [ "ConfigParameters", "OrangeFSPort", "item"]},
" -a ", {"Fn::FindInMap" : [ "ConfigParameters", "FSName", "item"]},
" -d ", {"Fn::FindInMap" : [ "ConfigParameters", "FSID", "item"]},
"\n",

Demo
• Spin up a cluster on AWS live
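The cloud-init command line shown earlier passes template values to TorqueMasterConfigure.py through single-letter flags. The source of that script is not part of the slides, so the following is only a guess at how such flags might be parsed with argparse. The flag letters come from the slide; the destination names, the types, and every value in the usage example are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the argument handling of TorqueMasterConfigure.py.

    Flag letters are from the slide; their meanings are inferred from the
    template references that follow each flag, not confirmed by the authors.
    """
    parser = argparse.ArgumentParser(description="Configure the Torque master (sketch)")
    parser.add_argument("-l", dest="log_level", default="INFO")            # e.g. DEBUG
    parser.add_argument("-f", dest="log_file")                             # log file path
    parser.add_argument("-n", dest="worker_table")                         # WorkerConfigDDB
    parser.add_argument("-o", dest="fs_table")                             # FSConfigDDB
    parser.add_argument("-s", dest="fs_fleet_size", type=int)              # OrangeFSFleetSize
    parser.add_argument("-z", dest="worker_fleet_size", type=int)          # WorkerFleetSize
    parser.add_argument("-m", dest="worker_max_fleet_size", type=int)      # WorkerMaxFleetSize
    parser.add_argument("-p", dest="fs_port", type=int)                    # OrangeFSPort
    parser.add_argument("-a", dest="fs_name")                              # FSName
    parser.add_argument("-d", dest="fs_id")                                # FSID
    return parser


if __name__ == "__main__":
    # Illustrative values only; none of them are taken from the presentation.
    example = ["-l", "DEBUG", "-n", "worker-table", "-o", "fs-table",
               "-s", "4", "-z", "8", "-m", "16", "-p", "3334",
               "-a", "orangefs", "-d", "12345"]
    print(build_parser().parse_args(example))
```

In the template, CloudFormation resolves each Ref and Fn::FindInMap before the command runs, so the script only ever sees plain strings like the ones in the example list.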


RNA-Seq Differential Gene Expression Workflow

Optimization Areas
• FastQ Splitter rewritten in MPI-IO to leverage OrangeFS in AWS
• Merge-FastQ also rewritten in MPI-IO to leverage OrangeFS in AWS
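The slides do not show the splitter's source. As a rough illustration of the idea behind a parallel splitter, the sketch below computes which contiguous block of FASTQ records each rank would own. It is plain Python with no MPI calls, the function names are hypothetical, and it is not the presenters' implementation; in an actual MPI-IO program each rank would then read its own range from the shared file.

```python
LINES_PER_RECORD = 4  # a FASTQ record is four lines: header, sequence, '+', quality


def block_range(total_records, rank, nranks):
    """Return (first_record, record_count) for this rank's contiguous block.

    Records are spread as evenly as possible; the first
    (total_records % nranks) ranks each take one extra record.
    """
    if nranks <= 0 or not 0 <= rank < nranks:
        raise ValueError("rank must satisfy 0 <= rank < nranks")
    base, extra = divmod(total_records, nranks)
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra)
    return first, count


def line_range(total_records, rank, nranks):
    """Return (first_line, line_count) covering this rank's records."""
    first, count = block_range(total_records, rank, nranks)
    return first * LINES_PER_RECORD, count * LINES_PER_RECORD


if __name__ == "__main__":
    for rank in range(4):
        print(rank, block_range(10, rank, 4))
```

Splitting on whole records rather than raw byte counts keeps every four-line record intact within a single rank's block.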

Genomics – Data

@@@FFF=BFHFDHCCDECJHIIIHG@GEEGAGEHFDHDHGIF@FGDEBFGIIGG=CGFGCDCEGHFEEECEBADBB?BCCCC<5:>@CCCA<9>C@A@ACB

@HWI-ST1097:170:C1LBBACXX:6:1101:1379:2208 1:N:0:CGATGT

CCTGTTATTGCCTCAAACTTCCGTGGCCTAAAACGCCAAAGTCCCCCTAAGAAGATAGCTGCGGGGGGGTGGCTCCGCCTAGCTAGTTAGGAAGCTGAGGG

+

CCCFFFFFHHHHHJJJJJJJJJJFAC8A*1?E#####################################################################

@HWI-ST1097:170:C1LBBACXX:6:1101:1582:2059 1:N:0:CGATGT

GTATTGTCATAAGCAGTTAAAGCTGATGTGCGCCTGTCATGTAATGCTGTAGAAACAAGCTCAGCAAGCTGCTGCTTTTGTGTTCTTGCACCGGAGNTCTT
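The excerpt above is FASTQ data: each record is four lines (an "@" header, the sequence, a "+" separator, and a quality string with one character per base). The sketch below parses that layout. The sample record in it is made up for illustration, and the Phred+33 quality offset is an assumption that should be checked against the sequencer's output format.

```python
def parse_fastq(lines):
    """Yield (identifier, sequence, quality) tuples from an iterable of FASTQ lines."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(it, None)
        separator = next(it, None)
        quality = next(it, None)
        if quality is None:
            raise ValueError("truncated FASTQ record: " + header)
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch: " + header)
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Decode a quality string into integer scores (assumes Phred+33 encoding)."""
    return [ord(ch) - offset for ch in quality]


if __name__ == "__main__":
    # Made-up record for illustration; not taken from the dataset above.
    sample = "@read1 1:N:0:CGATGT\nACGTN\n+\nCCFF#\n"
    for name, seq, qual in parse_fastq(sample.splitlines()):
        print(name, seq, phred_scores(qual))
```

Note that "@" can also appear as a quality character, as in the first line of the excerpt, which is why the parser counts lines instead of searching for "@" to find record boundaries.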

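Torque/Maui Job

#!/bin/bash
# Request 4 nodes
#PBS -l nodes=4
# Wall-clock limit of 4 hours
#PBS -l walltime=4:00:00
# Merge stderr into stdout
#PBS -j oe
# Submit to the "batch" queue
#PBS -q batch
# Job name
#PBS -N AWS

# Work from the /mnt/orangefs mount
cd /mnt/orangefs
# Merge the FastQ parts matching the pattern into Combined.fastq; append the tool's output to Results.txt
mpirun /usr/local/bin/concat -p '/mnt/orangefs/Sample_Feltus1_L006_R2.cat.fastq.*' -o Combined.fastq >> /mnt/orangefs/Results.txt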
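The source of the concat tool is not shown in the slides. One common way to merge parts in parallel is to give each part a write offset equal to the combined size of all parts before it, so that every part can be written independently of the others. The sketch below shows that offset planning in plain Python. It runs serially, the function names are hypothetical, and it should not be read as the presenters' implementation.

```python
import os


def plan_offsets(sizes):
    """Return (offsets, total): each offset is the sum of all preceding sizes."""
    offsets, total = [], 0
    for size in sizes:
        offsets.append(total)
        total += size
    return offsets, total


def concat_at_offsets(part_paths, out_path):
    """Write each part into out_path at its planned offset; return total bytes."""
    sizes = [os.path.getsize(path) for path in part_paths]
    offsets, total = plan_offsets(sizes)
    with open(out_path, "wb") as out:
        out.truncate(total)  # size the output file up front
        # Each write depends only on its own offset, so the order is irrelevant.
        # In a parallel version, each rank would perform one of these writes.
        for path, offset in zip(part_paths, offsets):
            with open(path, "rb") as part:
                out.seek(offset)
                out.write(part.read())  # reads the whole part into memory (sketch only)
    return total
```

A caller could pass `sorted(glob.glob(pattern))` as `part_paths` so that the parts are merged in a stable order.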

[Chart: FastQ Splitter Time (seconds). Bars for m1.xlarge, m3.xlarge, and cc2.8xlarge, broken down into Read Input, Transfer, and Write Output, on an axis from 0 to 100 seconds; the Old Method is shown on an axis from 0 to 4000 seconds.]

[Chart: FastQ Merge Time (seconds). Merge Time for m1.xlarge, m3.xlarge, and cc2.8xlarge on an axis from 0 to 120 seconds; the Old Method is shown on an axis from 0 to 2500 seconds.]

Demo
• Torque/Maui job on the cluster that was spun up

More Info
• AWS Marketplace
  – OrangeFS Community Edition
  – OrangeFS Advanced Edition
• Community: orangefs.org
• Pipeline: Allele Systems, allelesystems.com

