Data analytics and Deep Learning on AWS with Jupyter and MXNet
Amazon Web Services

Adrian White, Research & Technical Computing
July 20, 2017

Agenda
• 1:30 Introduction to AWS (20 mins)
• 1:50 Demo: Alces Flight & Landsat-8 (15 mins)
• 2:05 Lab: Jupyter Notebooks on AWS (50 mins)
• 3:00 Finish

Why use AWS for Research?

• Time to Science: Access research infrastructure in minutes
• Low Cost: Pay-as-you-go pricing
• Elastic: Easily add or remove capacity
• Globally Accessible: Easily collaborate with researchers around the world
• Secure: A collection of tools to protect data and privacy
• Scalable: Access to effectively limitless capacity

AWS Global Infrastructure

• 16 Regions
• 43 Availability Zones
• 70 Edge Locations

INFRASTRUCTURE: Regions, Availability Zones, Points of Presence

CORE SERVICES
• Compute: VMs, Auto-scaling, & Load Balancing
• Storage: Object, Blocks, Archival, Import/Export
• Databases: Relational, NoSQL, Caching, Migration
• Networking: VPC, DX, DNS
• CDN

SECURITY & COMPLIANCE: Access Control, Identity Management, Key Management & Storage, Monitoring & Logs, Assessment and reporting, Resource & Usage Auditing, Configuration Compliance, Web application firewall

HYBRID ARCHITECTURE: Data Backups, Integrated App Deployments, Direct Connect, Identity Federation, Integrated Resource Management, Integrated Networking

ANALYTICS: Data Warehousing, Hadoop/Spark, Streaming Data Collection, Machine Learning, Elastic Search, Streaming Data Analysis, Business Intelligence

APP SERVICES: Queuing & Notifications, Workflow, Search, Transcoding, API Gateway

MOBILE SERVICES: One-click App Deployment, Identity, Single Integrated Console, Push Notifications, Mobile Analytics

DEVELOPMENT & OPERATIONS: DevOps Resource Management, Application Lifecycle Management, Containers, Triggers, Resource Templates

IoT: Rules Engine, Device Shadows, Device SDKs, Registry, Device Gateway

ENTERPRISE APPS: Virtual Desktops, Sharing & Collaboration, Corporate Email, Backup

MARKETPLACE: Business Apps, Business Intelligence, Databases, DevOps Tools, Networking, Security, Storage

TECHNICAL & BUSINESS SUPPORT: Account Management, Support, Professional Services, Training & Certification, Security & Pricing Reports, Partner Ecosystem, Solutions Architects

More on AWS Instance Types

Broad Set of Compute Instance Types for HPC and Deep Learning

• General purpose (e.g. M4)
• Compute optimized
• Storage and I/O optimized (e.g. I2, HS1)
• GPU or FPGA enabled
• Memory optimized

Instance Sizes: R4 Example

GPU and FPGA Instances

P2: GPU instance
• Up to 16 NVIDIA GK210 (8 x K80) GPUs in a single instance, with peer-to-peer PCIe GPU interconnect
• Supporting a wide variety of use cases including deep learning, HPC simulations, financial computing, and batch rendering

F1: FPGA instance
• Up to 8 Xilinx Virtex® UltraScale+™ VU9P FPGAs in a single instance, with peer-to-peer PCIe and bidirectional ring interconnects
• Designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and image processing

P2 GPU Instances

• Up to 16 K80 GPUs in a single instance
• Including peer-to-peer PCIe GPU interconnect
• Supporting a wide variety of use cases including deep learning, HPC simulations, and batch rendering

Instance Size   GPUs   GPU Peer to Peer   vCPUs   Memory (GiB)   Network Bandwidth*
p2.xlarge       1      -                  4       61             1.25 Gbps
p2.8xlarge      8      Y                  32      488            10 Gbps
p2.16xlarge     16     Y                  64      732            20 Gbps

*In a placement group
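As a small illustration of how the P2 sizes listed above might be used when planning a job, the sketch below picks the smallest size that satisfies a GPU and memory requirement. The `P2Size` type and the `smallest_p2_for` helper are hypothetical names introduced here; the figures are copied from the slide and may not reflect current offerings.

```python
from typing import NamedTuple, Optional


class P2Size(NamedTuple):
    name: str
    gpus: int
    vcpus: int
    memory_gib: int


# Values copied from the P2 slide; ordered from smallest to largest.
P2_SIZES = [
    P2Size("p2.xlarge", 1, 4, 61),
    P2Size("p2.8xlarge", 8, 32, 488),
    P2Size("p2.16xlarge", 16, 64, 732),
]


def smallest_p2_for(gpus_needed: int, memory_gib_needed: int = 0) -> Optional[P2Size]:
    """Return the smallest listed size meeting both requirements, or None."""
    for size in P2_SIZES:
        if size.gpus >= gpus_needed and size.memory_gib >= memory_gib_needed:
            return size
    return None
```

For example, a job that needs four GPUs would land on `p2.8xlarge`, since no listed size sits between one and eight GPUs.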

F1 FPGA Instances

• Up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance, with four high-speed DDR4 channels per FPGA
• Largest size includes high performance FPGA interconnects via PCIe Gen3 (FPGA Direct), and bidirectional ring (FPGA Link)
• Designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and image processing

Instance Size   FPGAs   FPGA Link   FPGA Direct   vCPUs   Memory (GiB)   NVMe Instance Storage (GB)   Network Bandwidth*
f1.2xlarge      1       -           -             8       122            1 x 480                      5 Gbps
f1.16xlarge     8       Y           Y             64      976            4 x 960                      30 Gbps

*In a placement group

Parallel Processing in GPUs and FPGAs

A GPU is effective at processing the same set of operations in parallel – single instruction, multiple data (SIMD). A GPU has a well-defined instruction-set, and fixed word sizes – for example single, double, or half-precision integer and floating point values.

An FPGA is effective at processing the same or different operations in parallel – multiple instructions, multiple data (MIMD). An FPGA does not have a predefined instruction-set, or a fixed data width.

[Figure: a single CPU core (Control, ALU) with DRAM, compared with a GPU and an FPGA, each with DRAM]
• Each GPU in P2 has 2880 of these cores
• Each FPGA in F1 has more than 2M of these cells
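The SIMD and MIMD descriptions above can be modelled in a few lines of plain Python. This is only a sketch of the two programming models: it runs sequentially on a CPU and says nothing about real GPU or FPGA performance. The helper names (`simd_apply`, `mimd_apply`) are made up for this illustration.

```python
from typing import Callable, List, Sequence


def simd_apply(op: Callable[[float], float], data: Sequence[float]) -> List[float]:
    """SIMD style: one instruction (op) applied to every data element."""
    return [op(x) for x in data]


def mimd_apply(ops: Sequence[Callable[[float], float]], data: Sequence[float]) -> List[float]:
    """MIMD style: each data element is paired with its own instruction."""
    if len(ops) != len(data):
        raise ValueError("need exactly one operation per data element")
    return [op(x) for op, x in zip(ops, data)]


def double(x: float) -> float:
    return 2 * x


def square(x: float) -> float:
    return x * x


def negate(x: float) -> float:
    return -x


values = [1.0, 2.0, 3.0]
simd_result = simd_apply(double, values)                    # same op on every lane
mimd_result = mimd_apply([double, square, negate], values)  # a different op per lane
```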

But it’s not about servers…

Evolving Compute Abstractions

Physical → Virtualisation → Containerization → Serverless

AWS Lambda

AWS Lambda – How it Works

Bring your own code
• Node.JS, Java, Python
• Java = Any JVM based language such as Scala, Clojure, etc.
• Bring your own libraries

Flexible invocation paths
• Event or RequestResponse invoke options
• Existing integrations with various AWS services

Simple resource model
• Select memory from 128MB to 1.5GB in 64MB steps
• CPU & Network allocated proportionately to RAM
• Reports actual usage

Fine grained permissions
• Uses IAM role for Lambda execution permissions
• Uses Resource policy for AWS event sources
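To make the "bring your own code" and resource-model points concrete, here is a minimal sketch of a Python handler in the `(event, context)` form, plus a check of the memory rule quoted on the slide (128MB to 1.5GB in 64MB steps). The `name` field in the event and the `is_valid_memory_size` helper are assumptions for illustration; the limits are those stated in this deck and may have changed since.

```python
import json

MIN_MB = 128
MAX_MB = 1536  # 1.5GB, the upper limit quoted on the slide
STEP_MB = 64


def is_valid_memory_size(memory_mb: int) -> bool:
    """True if memory_mb is within the slide's range and on a 64MB step."""
    return MIN_MB <= memory_mb <= MAX_MB and (memory_mb - MIN_MB) % STEP_MB == 0


def lambda_handler(event, context):
    """Entry point: receives the invoking event and a runtime context object."""
    name = event.get("name", "world")  # "name" is a hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The handler can be exercised locally by calling `lambda_handler({"name": "AWS"}, None)` before it is deployed.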

Lambda in the context of Grid Computing

Source: “Occupy the Cloud: Distributed Computing for the 99%”, https://arxiv.org/pdf/1702.04024.pdf
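The grid-computing idea here is a map of one stateless function over many inputs, with each call running independently. The sketch below is a local stand-in only: it uses a thread pool in place of separate Lambda invocations and is not the API of any particular framework. `stateless_map` and `simulate_trial` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def stateless_map(func: Callable[[T], R], inputs: Iterable[T], max_workers: int = 4) -> List[R]:
    """Run func once per input; calls share no state with each other.

    Threads stand in for independent function invocations.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, inputs))  # results keep the input order


def simulate_trial(seed: int) -> int:
    """Hypothetical unit of work: a pure function of its input."""
    return (seed * seed + 1) % 97


results = stateless_map(simulate_trial, range(5))
```

Because each call depends only on its argument, the same function could in principle be dispatched to any number of remote workers without coordination between them.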

Batch & HPC

On-demand, Auto Scaling Clusters On AWS

• CfnCluster is provided by AWS to quickly provision configurable clusters and grid computing environments.
• AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and EC2 Spot.
• Alces Flight is available in the AWS Marketplace and bundles 1000+ commonly used applications: https://aws.amazon.com/marketplace/
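For readers who want to see what handing a job to AWS Batch can look like from Python, the following sketch assembles the arguments for the Batch `submit_job` call in boto3. The job name, queue, and job definition are hypothetical placeholders, and the queue and definition are assumed to exist already; actually submitting requires boto3 and AWS credentials.

```python
from typing import Any, Dict, List


def build_submit_job_request(job_name: str, job_queue: str, job_definition: str,
                             command: List[str]) -> Dict[str, Any]:
    """Assemble keyword arguments for the Batch submit_job call."""
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {"command": command},
    }


def submit(request: Dict[str, Any]) -> str:
    """Submit the job and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    response = boto3.client("batch").submit_job(**request)
    return response["jobId"]


request = build_submit_job_request(
    job_name="example-job",            # hypothetical job name
    job_queue="research-queue",        # hypothetical, pre-existing queue
    job_definition="example-def:1",    # hypothetical job definition and revision
    command=["python", "process.py"],  # hypothetical container command
)
```

Keeping the request builder separate from the network call makes it easy to inspect or log what will be submitted before any resources are provisioned.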

Alces Flight: Personal on-demand HPC

1000+ popular scientific applications
• Pre-installed
• Multiple versions, complete with libraries and various compiler optimizations, ready to run

Available via the AWS Marketplace (the cloud’s “App Store”).

See http://alces-flight.com/ for more information.

Self-scaling HPC clusters, instantly ready to compute. They are billed by the hour and use the AWS Spot market by default, which keeps costs low.

Flight is accessible

All the traditional command-line tools will be familiar, but you can also create an Alces “session” and immediately launch a desktop view of your cluster to run graphical apps.

[Screenshots: Command Line (ssh) and Graphical Console]

Demo: Alces Flight & Landsat on AWS

Data Analytics

Evolution of Data Analytics

• Batch: Amazon Redshift, AWS Batch, Amazon EMR
• Real time: Amazon Kinesis, AWS IoT, Amazon Kinesis Analytics
• Prediction: Amazon Machine Learning, Amazon Rekognition

Use an optimal combination of highly interoperable services

• Amazon Redshift: Data Warehouse
• Amazon Elastic MapReduce: Semi-structured
• Amazon Simple Storage Service: Data Storage
• Amazon Glacier: Archive
• Amazon DynamoDB: NoSQL
• Amazon Machine Learning: Predictive Models
• Amazon Kinesis: Streaming
• Other Apps

Machine Learning

The Circle of ML

[Figure: a cycle linking Business Problem, ML Data, ML Model, and ML Application, with the teams involved: Front-End team, Data Engineering team, Analysts / DS team, DevOps team]

Heavy Lifting by AWS

Dive Deep as much as you need

• Hardware - Distributed computing, GPU, FPGA, Greengrass
• DL - MXNet, NeMo, TensorFlow, Caffe, Torch, Theano
• Platform - Data Science Environment (Notebooks, Model Hosting and Retraining)
• Simple API Services

Control

Jupyter Notebooks on AWS

Research customers are increasingly doing exploratory data science and analytics work using notebooks.

Jupyter on AWS allows researchers to take advantage of any AWS compute node type:
• Large memory, CPU optimized, IO optimized
• GPU nodes (e.g. multiple K80 GPUs)

Researchers can also access Batch, HPC and Spark/MLlib clusters with Jupyter.

How to:
• Run Jupyter Notebook and JupyterHub on Amazon EMR
• Creating and Using a Jupyter Instance on AWS

Demo: Jupyter on AWS

Distributed Deep Learning on AWS

• Distributed training across GPUs or CPUs using MXNet
• Spin up a cluster in minutes
• Automatically add or remove cluster nodes
• Supports Amazon EFS shared file system
• Available on GitHub: https://github.com/awslabs/deeplearning-cfn
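The idea behind distributed training can be sketched without any framework: split the data across workers, let each compute a gradient on its shard, average the gradients, and apply one update. The toy example below does this in plain Python for a one-parameter model; it is not MXNet code and the workers run one after another here. All names (`shard`, `local_gradient`, `train`) are illustrative.

```python
from typing import List, Tuple

Sample = Tuple[float, float]


def shard(samples: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Split the data set into one strided shard per worker.

    Assumes num_workers <= len(samples) so no shard is empty.
    """
    return [samples[i::num_workers] for i in range(num_workers)]


def local_gradient(w: float, samples: List[Sample]) -> float:
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)


def train(samples: List[Sample], num_workers: int = 4,
          lr: float = 0.01, steps: int = 50) -> float:
    shards = shard(samples, num_workers)
    w = 0.0
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]  # one gradient per worker
        w -= lr * sum(grads) / len(grads)               # average, then update
    return w


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data with true weight 3
w_fit = train(data)
```

With equal-sized shards, averaging the per-worker gradients gives the same update as computing the gradient over the whole data set on one machine, which is why this scheme scales out without changing the result.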

Research Programs at AWS

Global Data Egress Waiver

Why? Researchers need predictable budgets.

Who? Available to degree-granting and research institutions in APAC (and elsewhere).

What? Waives data egress charges from Qualified Accounts (capped at 15% of Total Spend).

How? Contract addendum required. Talk to your Account Team.

All qualifying research customers should use this!
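The cap is easy to work through numerically. The sketch below assumes the waiver covers egress charges up to 15% of total spend, as stated above; exactly how "Total Spend" is defined is set by the contract addendum, so it is treated here simply as an input. The function names are illustrative.

```python
from decimal import Decimal

WAIVER_CAP = Decimal("0.15")  # 15% of total spend, as stated on the slide


def waived_egress(egress_charges: Decimal, total_spend: Decimal) -> Decimal:
    """Amount of egress waived: all of it, up to the cap."""
    return min(egress_charges, WAIVER_CAP * total_spend)


def egress_payable(egress_charges: Decimal, total_spend: Decimal) -> Decimal:
    """Egress still owed after the waiver is applied."""
    return egress_charges - waived_egress(egress_charges, total_spend)
```

For example, with a total spend of 1000 and egress charges of 200, the cap is 150, so 150 is waived and 50 remains payable; with egress charges of 100, the full amount is waived.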

AWS Research Cloud Program

Science first, not servers. Researchers are not professional IT people (nor do they wish to be).

• Simple and easily explained procedures to get set up with cloud access.
• Budget management tools to ensure that over-spends do not happen.
• Large catalog of scientific solutions from partners, including instant clusters from AWS Marketplace.
• Fast track to invoice-backed billing & Egress Waiver.
• Best practices to ensure both data and research budgets are safe and privacy is protected.

IT’S ABOUT SCIENCE, NOT SERVERS.

aws.amazon.com/rcp

We recognise that whilst research is often a compute-intensive activity, most researchers are not IT experts.

We want to simplify research in the cloud with easy-to-use tools for researchers and their students, and share the catalogue of “researcher-obsessed” products and services created by many of our partners.

AWS Researcher’s Handbook

The 150-page “missing manual” for science in the cloud.

Written by Amazon’s Research Computing community for scientists.

• Explains foundational concepts about how AWS can accelerate time-to-science in the cloud.

• Step-by-step best practices for securing your environment to ensure your research data is safe and your privacy is protected.

• Tools for budget management that will help you control your spending, limit costs, and prevent any over-runs.

• Catalogue of scientific solutions from partners chosen for their outstanding work with scientists.

aws.amazon.com/rcp

Lab: Deep Learning on AWS with Jupyter and MXNet
