
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series


Page 1: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Adam Boeglin, HPC Solutions Architect


Choosing the Right EC2 Instance and Applicable Use Cases

Page 2: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

What to Expect from the Session

Understanding the factors that go into choosing an EC2 instance
Defining system performance and how it is characterized for different workloads
How Amazon EC2 instances deliver performance while providing flexibility and agility
How to make the most of your EC2 instance experience through the lens of several instance types

Page 3: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Amazon Elastic Compute Cloud is Big

Instances
API
Networking
EC2
Purchase options

Page 4: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Amazon EC2 Instances

[Diagram: a host server runs a hypervisor, which hosts Guest 1, Guest 2, ... Guest n]

Page 5: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

In the past

First launched in August 2006: the M1 instance

“One size fits all” M1

Page 6: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Amazon EC2 Instances History

[Timeline figure spanning 2006 to 2016; instance types as they appear on the figure:]

m1.small
m1.large, m1.xlarge
c1.medium, c1.xlarge
m2.xlarge, m2.2xlarge, m2.4xlarge
cc1.4xlarge
t1.micro
cg1.4xlarge
cc2.8xlarge
m1.medium
hi1.4xlarge
m3.xlarge, m3.2xlarge
hs1.8xlarge
cr1.8xlarge
c3.large, c3.xlarge, c3.2xlarge, c3.4xlarge, c3.8xlarge
g2.2xlarge
i2.xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge
m3.medium, m3.large
r3.large, r3.xlarge, r3.2xlarge, r3.4xlarge, r3.8xlarge
t2.micro, t2.small, t2.med
c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge, c4.8xlarge
d2.xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge
g2.8xlarge
t2.large
m4.large, m4.xlarge, m4.2xlarge, m4.4xlarge, m4.10xlarge
x1.32xlarge
t2.nano

Page 7: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

c4.large

c: instance family
4: instance generation
large: instance size
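The naming scheme on this slide can be split mechanically. The sketch below is illustrative only: the function name and the regular expression are assumptions, and the pattern covers only the simple family/generation/size names that appear in this deck (for example c4.large or x1.32xlarge).

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str      # e.g. "c"
    generation: int  # e.g. 4
    size: str        # e.g. "large"


# Assumed pattern: letters, then digits, then a dot, then the size.
_NAME = re.compile(r"^([a-z]+)(\d+)\.(\w+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split an instance type name into family, generation and size."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    family, generation, size = match.groups()
    return InstanceType(family, int(generation), size)


print(parse_instance_type("c4.large"))
```

Grouping a fleet by family and generation this way makes it easy to spot previous-generation instances that are candidates for an upgrade.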

Page 8: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Choices and Flexibility

Choice of Processor
Memory
Storage Options
Accelerated Graphics
Burstable Performance

Page 9: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Hiring a Server

Servers are hired to do jobs
Performance is measured differently depending on the job

Page 10: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Performance Factors

Resource | Performance factors | Key indicators
CPU | Sockets, number of cores, clock frequency, bursting capability | CPU utilization, run queue length
Memory | Memory capacity | Free memory, anonymous paging, thread swapping
Network interface | Max bandwidth, packet rate | Receive throughput, transmit throughput over max bandwidth
Disks | Input / output operations per second, throughput | Wait queue length, device utilization, device errors
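As a rough illustration of the CPU indicators in the table, the sketch below parses the text of /proc/loadavg on Linux. The function names and the sample string are assumptions. Note also that the Linux load average counts tasks in uninterruptible sleep as well as runnable ones, so it only approximates run queue length.

```python
def parse_loadavg(text: str) -> dict:
    """Parse the contents of /proc/loadavg into named fields."""
    one, five, fifteen, entities, _last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "load_1m": float(one),
        "load_5m": float(five),
        "load_15m": float(fifteen),
        "runnable": int(runnable),  # scheduling entities runnable right now
        "total": int(total),        # scheduling entities on the system
    }


def cpu_saturated(load_1m: float, cpu_count: int) -> bool:
    """Heuristic: more waiting work than CPUs suggests saturation."""
    return load_1m > cpu_count


# Sample text in the /proc/loadavg format; on a Linux instance, read the file
# itself and pass os.cpu_count() as the CPU count.
sample = "4.50 3.10 1.20 5/180 11206"
stats = parse_loadavg(sample)
print(stats["runnable"], cpu_saturated(stats["load_1m"], cpu_count=4))
```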

Page 11: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Resource Utilization

For given performance, how efficiently are resources being used

Something at 100% utilization can’t accept any more work

Low utilization can indicate more resource is being purchased than needed

Page 12: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Example: Web Application

MediaWiki installed on Apache with 140 pages of content
Load increased in intervals over time

Page 13: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Example: Web Application Memory stats

Page 14: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Example: Web Application Disk stats

Page 15: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Example: Web Application Network stats

Page 16: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Example: Web Application CPU stats

Page 17: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

“Launching new instances and running tests in parallel is easy…[when choosing an instance] there is no substitute for measuring the performance of your full application.”

- EC2 Documentation

Page 18: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

How not to choose an EC2 instance

Brute force testing
Ignoring metrics
Favoring old generation instances
Guessing based on what you already have

Page 19: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

EC2 Instance Families

General purpose: M4
Compute optimized: C3, C4
Storage and IO optimized: I2, D2
GPU enabled: G2
Memory optimized: R3, X1

Page 20: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Instance Selection = Performance Tuning

Give back instances as easily as you can acquire new ones
Find an ideal instance type and workload combination
EC2 Instance Pages provide “Use Case” guidance
With EBS, storage and instance size don’t need to be coupled

Page 21: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Instance sizing

Roughly equivalent capacity:
1 x c4.8xlarge
2 x c4.4xlarge
4 x c4.2xlarge
8 x c4.xlarge

Page 22: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Choosing the right size

Understand your unit of work
  Web request
  Database / Table
  Batch Process
What are that unit’s requirements?
  CPU threads
  Memory constraints
  Disk & Network
What are its availability requirements?
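One way to turn a unit of work's requirements into a size is sketched below, assuming the vCPU and memory figures from the C4 review table later in this deck. The function names, the example requirements, and the single spare instance for availability are all illustrative assumptions, not a sizing method from the session.

```python
import math

# (model, vCPU, memory in GiB), copied from the C4 review table in this deck.
C4_SIZES = [
    ("c4.large", 2, 3.75),
    ("c4.xlarge", 4, 7.5),
    ("c4.2xlarge", 8, 15.0),
    ("c4.4xlarge", 16, 30.0),
    ("c4.8xlarge", 36, 60.0),
]


def smallest_fit(cpu_threads: int, memory_gib: float):
    """Return the smallest C4 size that fits one unit of work, or None."""
    for model, vcpu, memory in C4_SIZES:
        if vcpu >= cpu_threads and memory >= memory_gib:
            return model
    return None


def instances_needed(units: int, units_per_instance: int, spare: int = 1) -> int:
    """Instances for the load, plus spares to cover availability (assumed)."""
    return math.ceil(units / units_per_instance) + spare


print(smallest_fit(cpu_threads=4, memory_gib=10))
print(instances_needed(units=10, units_per_instance=4))
```

The result is only a starting point; as the deck notes elsewhere, there is no substitute for measuring the full application.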

Page 23: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

CPU Instructions and Protection Levels

CPU has at least two protection levels.
Privileged instructions can’t be executed in user mode, to protect the system.
Applications leverage system calls to the kernel.

[Diagram: Kernel, Application]

Page 24: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

x86 CPU Virtualization: Prior to Intel VT-x

Binary translation for privileged instructions
Para-virtualization (PV)
PV requires going through the VMM, adding latency
Applications that are system call bound are most affected

[Diagram (PV): Application, Kernel, VMM]

Page 25: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

x86 CPU Virtualization: After Intel VT-x

Hardware assisted virtualization (HVM)
PV-HVM uses PV drivers opportunistically for operations that are slow when emulated, e.g., network and block I/O

[Diagram (PV-HVM): Application, Kernel, VMM]

Page 26: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: Use HVM AMIs with EBS

Page 27: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Time Keeping Explained

Time keeping in an instance is deceptively hard
gettimeofday(), clock_gettime(), QueryPerformanceCounter()
The TSC
  CPU counter, accessible from userspace
  Requires calibration, vDSO
  Invariant on Sandy Bridge+ processors
Xen pvclock; does not support vDSO
On current generation instances, use TSC as clocksource

Page 28: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: Use TSC as clocksource

# cat /sys/devices/system/cl*/cl*/available_clocksource
xen tsc hpet acpi_pm
# cat /sys/devices/system/cl*/cl*/current_clocksource
tsc

Change with:

# vi /boot/grub/menu.lst && reboot
# cat /proc/cmdline
root=/dev/xvda1 ro clocksource=tsc
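The same check can be scripted. The sketch below reads the two sysfs files that the cl*/cl* glob above normally expands to (clocksource/clocksource0). The function names are assumptions, and the script only reports; it does not change the clocksource.

```python
from pathlib import Path

# Usual expansion of /sys/devices/system/cl*/cl* on Linux.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    available = (base / "available_clocksource").read_text().split()
    current = (base / "current_clocksource").read_text().strip()
    return current, available


def should_switch_to_tsc(current: str, available: list) -> bool:
    """True when TSC is offered by the kernel but not currently selected."""
    return current != "tsc" and "tsc" in available


if __name__ == "__main__" and CLOCKSOURCE_DIR.exists():
    current, available = read_clocksources()
    print(f"current={current} available={available}")
    if should_switch_to_tsc(current, available):
        print("consider booting with clocksource=tsc")
```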

Page 29: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Review: C4 Instances

Custom Intel E5-2666 v3 at 2.9 GHz
P-state and C-state controls

Model | vCPU | Memory (GiB) | EBS (Mbps)
c4.large | 2 | 3.75 | 500
c4.xlarge | 4 | 7.5 | 750
c4.2xlarge | 8 | 15 | 1,000
c4.4xlarge | 16 | 30 | 2,000
c4.8xlarge | 36 | 60 | 4,000

Batch & HPC workloads, Game Servers, Ad Serving, & High Traffic Web Servers

Page 30: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

What’s new in C4: P-state and C-state control

Intel Turbo Boost up to 3.5 GHz
By entering deeper idle states, non-idle cores can achieve up to 300 MHz higher clock frequencies
But deeper idle states require more time to exit, so they may not be appropriate for latency-sensitive workloads

Page 31: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: P-state control for AVX2

If an application makes heavy use of AVX2 on all cores, the processor may attempt to draw more power than it should

Processor will transparently reduce frequency
Frequent changes of CPU frequency can slow an application

Page 32: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Review: T2 Instances

Lowest cost EC2 instance at $0.0065 per hour
Burstable performance
Fixed allocation enforced with CPU credits

Model | vCPU | Baseline | CPU Credits / Hour | Memory (GiB) | Storage
t2.nano | 1 | 5% | 3 | 0.5 | EBS Only
t2.micro | 1 | 10% | 6 | 1 | EBS Only
t2.small | 1 | 20% | 12 | 2 | EBS Only
t2.medium | 2 | 40%** | 24 | 4 | EBS Only
t2.large | 2 | 60%** | 36 | 8 | EBS Only

General Purpose, Web Serving, Developer Environments, Small Databases

Page 33: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

How Credits Work

A CPU credit provides the performance of a full CPU core for one minute

An instance earns CPU credits at a steady rate
An instance consumes credits when active
Credits expire (leak) after 24 hours

[Figure labels: Baseline rate, Credit balance, Burst rate]
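The credit rules above lend themselves to a back-of-the-envelope calculation. The sketch below uses the credits-per-hour column from the T2 review table and the rule that one credit is one full core for one minute. The function names are assumptions, and the model deliberately ignores credit expiry during the burst.

```python
# Credits earned per hour, from the T2 review table in this deck.
CREDITS_PER_HOUR = {
    "t2.nano": 3,
    "t2.micro": 6,
    "t2.small": 12,
    "t2.medium": 24,
    "t2.large": 36,
}


def max_credit_balance(model: str) -> int:
    """Credits expire after 24 hours, so the balance tops out at 24 hours of earning."""
    return CREDITS_PER_HOUR[model] * 24


def burst_minutes(model: str, balance: float, utilization: float = 1.0,
                  vcpus: int = 1) -> float:
    """Minutes a credit balance lasts at a per-vCPU utilization of 0.0 to 1.0."""
    spend_per_minute = utilization * vcpus      # one credit = one core-minute
    earn_per_minute = CREDITS_PER_HOUR[model] / 60
    net = spend_per_minute - earn_per_minute
    if net <= 0:
        return float("inf")                     # at or below baseline
    return balance / net


full = max_credit_balance("t2.micro")
print(full, round(burst_minutes("t2.micro", full)))
```

Under these assumptions, a t2.micro with a full balance can hold one vCPU at 100% for a little under three hours before falling back to its baseline.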

Page 34: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: Monitor CPU credit balance
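A minimal sketch of pulling the credit balance from CloudWatch, assuming the boto3 SDK is installed and credentials and a region are configured. CPUCreditBalance in the AWS/EC2 namespace is the metric for T2 credit balance. The function names, the instance ID in the usage note, and the six-hour window are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def credit_balance_query(instance_id: str, hours: int = 6, now=None) -> dict:
    """Build the keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,              # seconds per datapoint
        "Statistics": ["Average"],
    }


def fetch_credit_balance(instance_id: str) -> list:
    """Return datapoints sorted by time. Calls AWS, so it needs credentials."""
    import boto3  # third-party AWS SDK, imported lazily
    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**credit_balance_query(instance_id))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Usage would look like fetch_credit_balance("i-0123456789abcdef0"), where the instance ID is a placeholder. A steadily falling balance is an early sign that the workload has outgrown its T2 size.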

Page 35: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: How to Interpret Steal Time

Fixed CPU allocations can be offered through CPU caps
Steal time happens when the CPU cap is enforced
Leverage CloudWatch metrics
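Inside a Linux instance, steal time shows up as the eighth counter on the aggregate "cpu" line of /proc/stat. The sketch below computes the steal share between two samples of that line. The function names and the sample values are assumptions for illustration.

```python
def parse_cpu_line(line: str) -> list:
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before: str, after: str) -> float:
    """Share of CPU time stolen between two samples, as a percentage."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Counters: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user and nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[7] / total


# Hypothetical samples taken a few seconds apart.
first = "cpu 100 0 50 800 10 0 0 40 0 0"
second = "cpu 300 0 150 1390 20 0 0 140 0 0"
print(steal_percent(first, second))
```

On a T2 instance that has run out of credits, a high steal percentage is expected: it is the CPU cap being enforced.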

Page 36: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Announced: X1 Instances

Largest memory instance with 2 TB of DRAM
Quad socket, Intel E7 processors with 128 vCPUs

Model | vCPU | Memory (GiB) | Local Storage
x1.32xlarge | 128 | 1952 | 2 x 1920 GB

In-Memory Databases, Big Data Processing, HPC Workloads

Page 37: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

NUMA

Non-uniform memory access
Each processor in a multi-CPU system has local memory that is accessible through a fast interconnect
Each processor can also access memory from other CPUs, but local memory access is a lot faster than remote memory access
Performance is related to the number of CPU sockets and how they are connected: Intel QuickPath Interconnect (QPI)

Page 38: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

r3.8xlarge

[Diagram: two sockets linked by QPI, each with 16 vCPUs and 122 GB of memory]

Page 39: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

x1.32xlarge

[Diagram: four sockets linked by QPI, each with 32 vCPUs and 488 GB of memory]

Page 40: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: Kernel Support for NUMA Balancing

An application will perform best when the threads of its processes are accessing memory on the same NUMA node.
NUMA balancing moves tasks closer to the memory they are accessing.
This is all done automatically by the Linux kernel when automatic NUMA balancing is active: version 3.8+ of the Linux kernel.
Windows support for NUMA first appeared in the Enterprise and Data Center SKUs of Windows Server 2003.
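To see what the kernel exposes on a given instance, a sketch along these lines can list the NUMA nodes and report whether automatic balancing is switched on. It assumes the standard Linux locations /sys/devices/system/node and /proc/sys/kernel/numa_balancing. The function names are illustrative.

```python
from pathlib import Path


def numa_nodes(sys_root: Path = Path("/sys/devices/system/node")) -> dict:
    """Map each NUMA node name to its CPU list, e.g. {'node0': '0-15'}."""
    node_dirs = sorted(sys_root.glob("node[0-9]*"), key=lambda p: int(p.name[4:]))
    return {d.name: (d / "cpulist").read_text().strip() for d in node_dirs}


def numa_balancing_enabled(path: Path = Path("/proc/sys/kernel/numa_balancing")):
    """True or False when the kernel exposes the setting, None when it does not."""
    try:
        return path.read_text().strip() != "0"
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(numa_nodes())
    print(numa_balancing_enabled())
```

On a single-socket instance the first call returns one node and balancing has nothing to do; the multi-socket sizes shown on the previous slides are where this matters.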

Page 41: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Review: I2 Instances

16 vCPU: 3.2 TB SSD; 32 vCPU: 6.4 TB SSD
365K random read IOPS for 32 vCPU instance

Model | vCPU | Memory (GiB) | Storage (GB) | Read IOPS | Write IOPS
i2.xlarge | 4 | 30.5 | 1 x 800 SSD | 35,000 | 35,000
i2.2xlarge | 8 | 61 | 2 x 800 SSD | 75,000 | 75,000
i2.4xlarge | 16 | 122 | 4 x 800 SSD | 175,000 | 155,000
i2.8xlarge | 32 | 244 | 8 x 800 SSD | 365,000 | 315,000

NoSQL Databases, Clustered Databases, Online Transaction Processing (OLTP)

Page 42: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Split Driver Model

[Diagram: the hardware layer (physical CPU, physical memory, network device) sits under the VMM (virtual CPU, virtual memory, CPU scheduling). Each guest domain runs an application, sockets and a frontend driver; the driver domain runs the backend driver and the device driver. The I/O path is labeled with steps 1 to 5.]

Page 43: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Granting in pre-3.8.0 Kernels

Requires “grant mapping” prior to 3.8.0
Grant mappings are expensive operations due to TLB flushes

Inter domain I/O:
(1) Grant memory
(2) Write to ring buffer
(3) Signal event
(4) Read ring buffer
(5) Map grants
(6) Read or write grants
(7) Unmap grants

[Diagram: read(fd, buffer,…) issued in the instance; I/O domain; SSD]

Page 44: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Granting in 3.8.0+ Kernels, Persistent and Indirect

Grant mappings are set up in a pool once
Data is copied in and out of the grant pool

[Diagram: read(fd, buffer…) issued in the instance; data is copied to and from the grant pool; I/O domain; SSD]

Page 45: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

2009 – Longer ago than you think

Avatar was the top movie in the theaters
Facebook overtook MySpace in active users
President Obama was sworn into office
The 2.6.32 Linux kernel was released

Page 46: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Tip: Use 3.8+ kernel

Amazon Linux 2013.09 or later
Ubuntu 14.04 or later
RHEL/CentOS 7 or later
Etc.
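A quick way to check a running instance against this tip is to compare the kernel release string with 3.8. The sketch below is a minimal version of that check. The function names are assumptions, and it looks only at the major and minor numbers, so distribution kernels with backported features are not accounted for.

```python
import platform
import re


def kernel_at_least(release: str, minimum: tuple = (3, 8)) -> bool:
    """Compare the major.minor part of a kernel release string with a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def running_kernel_ok() -> bool:
    """Check the kernel of the machine this runs on (meaningful on Linux only)."""
    return kernel_at_least(platform.release())


print(kernel_at_least("2.6.32-431.el6.x86_64"))
print(kernel_at_least("3.10.0-327.el7.x86_64"))
```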

Page 47: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Device Pass Through: Enhanced Networking

SR-IOV eliminates need for driver domain
Physical network device exposes virtual function to instance
Requires a specialized driver, which means:
  Your instance OS needs to know about it
  EC2 needs to be told your instance can use it
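From inside a Linux instance, one way to confirm which driver an interface is using is to follow the driver symlink under /sys/class/net. The sketch below assumes that the enhanced networking drivers are ixgbevf (Intel 82599 virtual function) and ena (Elastic Network Adapter). The function names and the default interface eth0 are illustrative.

```python
import os

# Driver names assumed to indicate enhanced networking.
ENHANCED_DRIVERS = {"ixgbevf", "ena"}


def nic_driver(interface: str = "eth0", sys_net: str = "/sys/class/net"):
    """Return the kernel driver bound to an interface, or None if unknown."""
    link = os.path.join(sys_net, interface, "device", "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


def uses_enhanced_networking(driver) -> bool:
    """True when the driver is one of the SR-IOV capable drivers listed above."""
    return driver in ENHANCED_DRIVERS


if __name__ == "__main__":
    driver = nic_driver()
    print(driver, uses_enhanced_networking(driver))
```

This covers only the "instance OS needs to know about it" half; whether EC2 has been told the instance can use it is an attribute set on the instance or AMI.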

Page 48: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

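After Enhanced Networking

[Diagram: the hardware layer (physical CPU, physical memory, SR-IOV network device) sits under the VMM (virtual CPU, virtual memory, CPU scheduling). Each guest domain runs an application, sockets and its own NIC driver; the driver domain is no longer on the network path. The I/O path is labeled with steps 1 to 3.]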

Page 49: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Elastic Network Adapter

Next Generation of Enhanced Networking
Hardware Checksums
Multi-Queue Support
Receive Side Steering
20 Gbps in a Placement Group
New Open Source Amazon Network Driver

Page 50: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

EBS Performance

Instance size matters
Match your volume size and type to your instance
Use EBS Optimization if EBS performance is important
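To make "instance size matters" concrete, the sketch below converts the EBS bandwidth column of the C4 review table from Mbps to MB/s and compares it with what a volume could deliver. The function names and the 160 MB/s volume figure are hypothetical. Mbps is taken as 10^6 bits per second and MB/s as 10^6 bytes per second.

```python
# Dedicated EBS bandwidth in Mbps, from the C4 review table in this deck.
C4_EBS_MBPS = {
    "c4.large": 500,
    "c4.xlarge": 750,
    "c4.2xlarge": 1000,
    "c4.4xlarge": 2000,
    "c4.8xlarge": 4000,
}


def ebs_cap_mb_per_s(model: str) -> float:
    """Instance-level EBS throughput ceiling in MB/s (8 bits per byte)."""
    return C4_EBS_MBPS[model] / 8


def bottleneck(model: str, volume_mb_per_s: float) -> str:
    """Say whether the instance or the volume limits throughput."""
    return "instance" if volume_mb_per_s > ebs_cap_mb_per_s(model) else "volume"


# Hypothetical volume able to sustain 160 MB/s.
for model in ("c4.large", "c4.8xlarge"):
    print(model, ebs_cap_mb_per_s(model), bottleneck(model, 160))
```

With these numbers the c4.large caps throughput at 62.5 MB/s, well below the volume, while the c4.8xlarge leaves the volume as the limit.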

Page 51: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Summary: Getting the Most Out of EC2 Instances

Choose HVM AMIs
Time keeping: use TSC
C-state and P-state controls
Monitor T2 CPU credits
Use a modern Linux kernel
NUMA balancing
Persistent grants for I/O performance
Enhanced networking

Page 52: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series
Page 53: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Virtualization Themes

Bare metal performance goal, and in many scenarios already there
History of eliminating hypervisor intermediation and driver domains
  Hardware assisted virtualization
  Scheduling and granting efficiencies
  Device pass through

Page 54: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Next steps

Visit the Amazon EC2 documentation
Launch an instance and try your app!

Page 55: Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webinar Series

Thank you!