Optimizing for Cost in the Cloud Jinesh Varia @jinman Technology Evangelist

Optimizing for Costs in the Cloud

Page 1: Optimizing for Costs in the Cloud

Optimizing for Cost in the Cloud

Jinesh Varia

@jinman

Technology Evangelist

Page 2: Optimizing for Costs in the Cloud

Multiple dimensions of optimizations

• Cost
• Performance
• Response time
• Time to market
• High-availability
• Scalability
• Security
• Manageability
• …

Page 3: Optimizing for Costs in the Cloud

Optimizing for Cost

Page 4: Optimizing for Costs in the Cloud

When you turn off your cloud resources, you actually stop paying for them

Page 5: Optimizing for Costs in the Cloud

Continuous optimization in your architecture results in recurring savings in your next month’s bill

Page 6: Optimizing for Costs in the Cloud

Elasticity is one of the fundamental properties of the cloud that drives many of its economic benefits

Page 7: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

Optimizing for Cost…

Page 8: Optimizing for Costs in the Cloud

Turn off what you don’t need (automatically)

Page 9: Optimizing for Costs in the Cloud

Optimize by the time of day

[Chart: Daily CPU Load, plotting load (0 to 14) against hour of day (1 to 24)]

25% Savings
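As a rough illustration of why following the daily load curve saves money, the sketch below compares instance-hours for a fleet sized for the daily peak against a fleet resized every hour. The 24-hour profile is made up for the example (the chart's underlying data is not available here), and `scaling_savings` is a hypothetical helper, so the resulting percentage will not match the slide's figure.

```python
import math

def scaling_savings(hourly_load, capacity_per_server=1.0):
    """Compare a fixed fleet sized for peak with a fleet resized every hour.

    hourly_load: load per hour, in the same unit as capacity_per_server.
    Returns (fixed_instance_hours, elastic_instance_hours, savings_fraction).
    """
    # Servers needed in each hour, never fewer than one.
    servers_needed = [max(1, math.ceil(load / capacity_per_server))
                      for load in hourly_load]
    fixed_hours = max(servers_needed) * len(servers_needed)
    elastic_hours = sum(servers_needed)
    return fixed_hours, elastic_hours, 1 - elastic_hours / fixed_hours

# Hypothetical 24-hour load profile (not taken from the chart).
profile = [2] * 6 + [6] * 4 + [12] * 6 + [8] * 4 + [4] * 4
fixed, elastic, saved = scaling_savings(profile)
print(f"fixed={fixed} instance-hours, elastic={elastic}, saved={saved:.0%}")
```

The flatter the real load curve, the smaller the gap between the two numbers, which is why the savings depend on the workload.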

Page 10: Optimizing for Costs in the Cloud

[Architecture diagram: Amazon Route 53 (DNS) serves www.MyWebSite.com (dynamic data) and media.MyWebSite.com (static data). Components shown: Elastic Load Balancer, Auto Scaling group for the Web Tier, Auto Scaling group for the App Tier, Amazon EC2, and Amazon RDS across Availability Zone #1 and Availability Zone #2, plus Amazon S3 and Amazon CloudFront.]

Page 11: Optimizing for Costs in the Cloud

Optimize during a year

[Chart: number of web servers by week of the year (weeks 1 to 49 labeled)]

50% Savings

Page 12: Optimizing for Costs in the Cloud

Auto scaling : Types of Scaling

Scaling by Schedule

• Use Scheduled Actions in Auto Scaling Service
  • Date
  • Time
  • Min and Max of Auto Scaling Group Size
• You can create up to 125 actions, scheduled up to 31 days into the future, for each of your Auto Scaling groups. This gives you the ability to scale up to four times a day for a month.

Scaling by Policy

• Scaling up Policy - Double the group size

• Scaling down Policy - Decrement by 1

Page 13: Optimizing for Costs in the Cloud

Auto scaling Best Practices

Use Auto Scaling Tags

Use Auto scaling Alarms and Email Notifications

Scale up and down symmetrically

Scale up quickly and scale down slowly

Auto Scaling across Availability Zones

Leverage Suspend and Resume Processes

Page 14: Optimizing for Costs in the Cloud

Example: Scale up by 10% if CPU utilization is greater than 60% for 5 minutes; scale down by 10% if CPU utilization is less than 30% for 20 minutes.
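The example rule can be sketched as a small decision function. This is only an illustration of the logic, not the Auto Scaling API: `desired_capacity`, the per-minute sample list, the group size limits, and the choice to round a 10% step up to at least one instance are all assumptions made for the example.

```python
import math

def desired_capacity(current, cpu_samples, min_size=1, max_size=20):
    """Apply the example policy to per-minute average CPU readings.

    cpu_samples: CPU utilization in percent, one value per minute, newest last.
    Scale up by 10% after 5 minutes above 60%; scale down by 10% after
    20 minutes below 30%. A 10% step is rounded up to at least one instance.
    """
    step = max(1, math.ceil(current * 0.10))
    last_5 = cpu_samples[-5:]
    last_20 = cpu_samples[-20:]
    if len(last_5) == 5 and all(cpu > 60 for cpu in last_5):
        return min(max_size, current + step)
    if len(last_20) == 20 and all(cpu < 30 for cpu in last_20):
        return max(min_size, current - step)
    return current

print(desired_capacity(10, [70] * 5))   # sustained high CPU: grow the group
print(desired_capacity(10, [20] * 20))  # sustained low CPU: shrink the group
```

The shorter window for scaling up and the longer one for scaling down reflect the slide's asymmetry: react to load quickly, release capacity cautiously.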

Page 15: Optimizing for Costs in the Cloud

[Chart: aggregate CPU ("Agg. CPU") and number of instances]

Page 16: Optimizing for Costs in the Cloud

Optimize during a month

[Chart: number of RDS DB servers by day of the month (days 1 to 29 labeled)]

75% Savings

Page 17: Optimizing for Costs in the Cloud

End of the month processing

Expand the cluster at the end of the month
• Expand/Shrink feature in Amazon Elastic MapReduce

Vertically scale up at the end of the month
• Modify-DB-Instance (in Amazon RDS) (or a new RDS DB Instance)
• CloudFormation Script (in Amazon EC2)

Page 18: Optimizing for Costs in the Cloud

Tip: Use “Reminder scripts”

Disassociate your unused EIPs

Delete unassociated EBS volumes

Delete older EBS snapshots

Leverage S3 Object Expiration
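A reminder script could look roughly like the sketch below. It works on plain dictionaries rather than calling AWS, so the record fields (`ip`, `instance_id`, `attached_to`, `created`), the function name, and the 90-day snapshot cutoff are all hypothetical; in practice the inventory would come from whatever tooling lists your resources.

```python
from datetime import datetime, timedelta

def find_cleanup_candidates(eips, volumes, snapshots, now,
                            max_snapshot_age_days=90):
    """Return resources worth a reminder: idle EIPs, loose volumes, old snapshots."""
    cutoff = now - timedelta(days=max_snapshot_age_days)
    return {
        # Elastic IPs that are not associated with any instance.
        "unused_eips": [e["ip"] for e in eips if e.get("instance_id") is None],
        # EBS volumes that are not attached to any instance.
        "unattached_volumes": [v["id"] for v in volumes
                               if not v.get("attached_to")],
        # Snapshots older than the (assumed) retention window.
        "old_snapshots": [s["id"] for s in snapshots if s["created"] < cutoff],
    }

# Hypothetical inventory for illustration only.
now = datetime(2013, 6, 1)
report = find_cleanup_candidates(
    eips=[{"ip": "203.0.113.10", "instance_id": None},
          {"ip": "203.0.113.11", "instance_id": "i-aaa"}],
    volumes=[{"id": "vol-1", "attached_to": "i-aaa"},
             {"id": "vol-2", "attached_to": None}],
    snapshots=[{"id": "snap-1", "created": datetime(2012, 1, 1)},
               {"id": "snap-2", "created": datetime(2013, 5, 20)}],
    now=now,
)
print(report)
```

The script only reports candidates; deleting anything is left as a deliberate manual step.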

Page 19: Optimizing for Costs in the Cloud

AWS Support – Trusted Advisor – Your personal cloud assistant

Page 20: Optimizing for Costs in the Cloud

Tip – Instance Optimizer

[Diagram labels: Instance; Custom Metrics (Free Memory, Free CPU, Free HDD) PUT at 1-min intervals; 2 weeks; Amazon CloudWatch Alarm]

“You could save a bunch of money by switching to a small instance, Click on CloudFormation Script to Save”

$$$ in Savings
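The decision behind such an optimizer might be sketched as follows. The slide does not say what rule triggers the recommendation, so the rule here (every metric kept at least 50% free across the whole observation window) is an assumption, as are the function name and the sample field names.

```python
def could_downsize(samples, min_free_pct=50.0):
    """Return True if the instance never dipped below min_free_pct free
    memory, CPU, or disk in any sample of the observation window.

    samples: list of dicts with 'free_memory', 'free_cpu', 'free_hdd'
    as percentages, e.g. one dict per 1-minute custom metric reading.
    """
    if not samples:
        return False  # no evidence, no recommendation
    metrics = ("free_memory", "free_cpu", "free_hdd")
    return all(min(s[m] for s in samples) >= min_free_pct for m in metrics)

# Hypothetical readings: mostly idle, then one busy minute.
idle = [{"free_memory": 70, "free_cpu": 85, "free_hdd": 60}] * 3
busy = idle + [{"free_memory": 65, "free_cpu": 10, "free_hdd": 60}]
print(could_downsize(idle), could_downsize(busy))
```

Using the worst reading rather than the average keeps a single busy period from being hidden by long idle stretches.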

Page 21: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

#2 Invest time in Reserved Pricing analysis (EC2, RDS)

Optimizing for Cost…

Page 22: Optimizing for Costs in the Cloud

Save more when you reserve

On-demand Instances

• Pay as you go

• Starts from $0.02/Hour

Reserved Instances

• One time low upfront fee + Pay as you go

• $23 for 1 year term and $0.01/Hour

1-year and 3-year terms

Heavy Utilization RI

Medium Utilization RI

Light Utilization RI
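Using the prices on this slide ($0.02/hour On-Demand versus $23 upfront plus $0.01/hour Reserved for a 1-year term), the break-even point can be worked out directly. The function names below are illustrative, and the calculation ignores everything except these two prices.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def on_demand_cost(hours, rate=0.02):
    """Pay-as-you-go cost for the given number of running hours."""
    return hours * rate

def reserved_cost(hours, upfront=23.0, rate=0.01):
    """One-time upfront fee plus the lower hourly rate."""
    return upfront + hours * rate

def break_even_hours(upfront=23.0, od_rate=0.02, ri_rate=0.01):
    """Running hours at which Reserved and On-Demand cost the same."""
    return upfront / (od_rate - ri_rate)

hours = break_even_hours()
print(f"break-even at about {hours:.0f} hours "
      f"({hours / HOURS_PER_YEAR:.0%} of a year)")
print(f"full year: on-demand ${on_demand_cost(HOURS_PER_YEAR):.2f}, "
      f"reserved ${reserved_cost(HOURS_PER_YEAR):.2f}")
```

At these prices the instance has to run roughly 2,300 hours, a little over a quarter of the year, before reserving pays off.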

Page 23: Optimizing for Costs in the Cloud

The Total Cost Of (Non) Ownership in the Cloud Whitepaper (New!)

Whitepaper: http://bit.ly/aws-tco-webapps

Page 24: Optimizing for Costs in the Cloud

Steady State Usage Pattern

(Example: Corporate Website)

Web Application Usage Patterns

Spiky Predictable Usage Pattern

(Example: Marketing Promotions Website)

Uncertain unpredictable Usage Pattern

(Example: Social game or Mobile Website)

Page 25: Optimizing for Costs in the Cloud

Example: TCO of a 3-tier Web Application

[Architecture diagram: Amazon Route 53 (DNS) serves www.MyWebSite.com (dynamic data) and media.MyWebSite.com (static data). Components shown: Elastic Load Balancer, Auto Scaling group for the Web Tier, Auto Scaling group for the App Tier, Amazon EC2, and Amazon RDS across Availability Zone #1 and Availability Zone #2, plus Amazon S3 and Amazon CloudFront.]

Page 26: Optimizing for Costs in the Cloud

Utilization Sweet Spot | Pricing Option | Feature | Savings over On-Demand
<10% | On-Demand | No Upfront Commitment |
10% - 40% | Light Utilization RI | Ideal for Disaster Recovery | Up to 56% (3-Year)
40% - 75% | Medium Utilization RI | Standard Reserved Capacity | Up to 66% (3-Year)
>75% | Heavy Utilization RI | Lowest Total Cost, Ideal for Baseline Servers | Up to 71% (3-Year)

[Chart: cost ($0 to $14,000) versus utilization for Heavy Utilization, Medium Utilization, Light Utilization, and On-Demand, with break-even points marked; m2.xlarge running Linux in US-East Region over a 3-year period]
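The utilization sweet spots translate naturally into a lookup. The sketch below is a minimal version; how to treat the shared boundaries (exactly 40% and exactly 75%) is not stated on the slide, so assigning them to the lower tier is an assumption, and `recommend_pricing` is a hypothetical name.

```python
def recommend_pricing(utilization):
    """Map expected utilization (0.0 to 1.0) to the pricing option whose
    sweet spot contains it, following the slide's ranges."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization < 0.10:
        return "On-Demand"
    if utilization <= 0.40:   # boundary handling is an assumption
        return "Light Utilization RI"
    if utilization <= 0.75:   # boundary handling is an assumption
        return "Medium Utilization RI"
    return "Heavy Utilization RI"

for u in (0.05, 0.25, 0.60, 0.95):
    print(f"{u:.0%}: {recommend_pricing(u)}")
```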

Page 27: Optimizing for Costs in the Cloud

Spiky Predictable Usage Pattern

[Chart: traffic measured in servers/instances (0 to 12) over months (0 to 35), with series for Traffic Pattern, EC2 Reserved, EC2 On-Demand, and Physical servers (on-premises)]

Page 28: Optimizing for Costs in the Cloud

TCO Web Application - Spiky Usage Pattern

Amortized monthly costs over 3 years | On-Premises Option | AWS Option 1: All Reserved | AWS Option 2: Mix of On-Demand and Reserved | AWS Option 3: All On-Demand
Compute/Server Costs | | | |
Server Hardware | $510 | $0 | $0 | $0
Network Hardware | $103 | $0 | $0 | $0
Hardware Maintenance | $78 | $0 | $0 | $0
Power and Cooling | $286 | $0 | $0 | $0
Data Center Space | $240 | $0 | $0 | $0
Personnel | $2,000 | $0 | $0 | $0
AWS Instances | $0 | $992 | $881 | $1,940
Total - Per Month | $3,220 | $992 | $881 | $1,940
Total - 3 Years | $115,920 | $35,717 | $31,731 | $69,854
Savings over On-premises Option | | 69% | 72% | 40%

TCO of Spiky Predictable Web Application
• Option 1: All Reserved
• Option 2: Mix of On-Demand and Reserved (Recommended Option, Most Cost-effective)
• Option 3: All On-Demand (Commitment-free and Risk-free Option)
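The savings row can be recomputed from the monthly totals in the table. The sketch below does only that arithmetic; the dictionary keys are shorthand labels chosen for the example. The results land within a percentage point of the slide's 69%, 72%, and 40%, with the small differences presumably down to rounding in the original figures.

```python
# Monthly totals as stated in the table (USD).
monthly_total = {
    "On-Premises": 3220,
    "Option 1: All Reserved": 992,
    "Option 2: Mix of On-Demand and Reserved": 881,
    "Option 3: All On-Demand": 1940,
}

def savings_pct(baseline, option):
    """Percentage saved by paying `option` instead of `baseline`."""
    return (1 - option / baseline) * 100

baseline = monthly_total["On-Premises"]
savings = {name: savings_pct(baseline, cost)
           for name, cost in monthly_total.items() if name != "On-Premises"}
for name, pct in savings.items():
    print(f"{name}: {pct:.1f}% savings over on-premises")
```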

Page 29: Optimizing for Costs in the Cloud

Recommendations

Steady State Usage Pattern
• For 100% utilization
  • 3-Year Heavy RI (for maximum savings over on-demand)

Spiky Predictable Usage Pattern
• Baseline
  • 3-Year Heavy RI (for maximum savings over on-demand)
  • 1-Year Light RI (for lowest upfront commitment) + savings over on-demand
• Peak: On-Demand

Uncertain and unpredictable Usage Pattern
• Start out small with On-Demand Instances (risk-free and commitment-free)
• Switch to some combination of Reserved and On-Demand, if application is successful
• If not successful, you walk away having spent a fraction of what you would pay to buy your own technology infrastructure

Page 30: Optimizing for Costs in the Cloud
Page 31: Optimizing for Costs in the Cloud
Page 32: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

#2 Invest time in Reserved Pricing analysis (EC2, RDS)

#3 Architect for Spot Instances (bidding strategies)

Optimizing for Cost…

Page 33: Optimizing for Costs in the Cloud

Optimize by using Spot Instances

On-demand Instances
• Pay as you go
• Starts from $0.02/Hour

Reserved Instances
• One time low upfront fee + Pay as you go
• $23 for 1 year term and $0.01/Hour
• 1-year and 3-year terms
• Heavy Utilization RI, Medium Utilization RI, Light Utilization RI

Spot Instances
• Requested Bid Price and Pay as you go
• $0.005/Hour as of today at 9 AM

Page 34: Optimizing for Costs in the Cloud

What are Spot Instances?

[Diagram: a Region with two Availability Zones. Six blocks of unused capacity are sold at discounts of 50%, 56%, 66%, 59%, 54%, and 63%.]

Page 35: Optimizing for Costs in the Cloud

What is the tradeoff?

[Diagram: a Region with two Availability Zones showing blocks of capacity marked Unused, with two blocks marked Reclaimed]

Page 36: Optimizing for Costs in the Cloud

Spot Use cases

Use Case | Types of Applications
Batch Processing | Generic background processing (scale out computing)
Hadoop | Hadoop/MapReduce processing type jobs (e.g. Search, Big Data, etc.)
Scientific Computing | Scientific trials/simulations/analysis in chemistry, physics, and biology
Video and Image Processing/Rendering | Transform videos into specific formats
Testing | Provide testing of software, web sites, etc.
Web/Data Crawling | Analyzing data and processing it
Financial | Hedge fund analytics, energy trading, etc.
HPC | Utilize HPC servers to do embarrassingly parallel jobs
Cheap Compute | Backend servers for Facebook games

Page 37: Optimizing for Costs in the Cloud

Save more money by using Spot Instances

Reserved Hourly Price > Spot Price < On-Demand Price

Page 38: Optimizing for Costs in the Cloud

Spot: Example Customers

[Customer examples with savings of 63%, 50%, 57%, 50%, 50%, 66%, 56%, and 50%]

Page 39: Optimizing for Costs in the Cloud

Typical Spot Bidding Strategies

1. Bid near the Reserved Hourly Price

2. Bid above the Spot Price History

3. Bid near On-Demand Price

4. Bid above the On-Demand Price

[Chart: Bid Distribution (for last 3 months), showing percentage of the distribution (0% to 20%) against bid price as a percentage of the On-Demand price]
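To make the four strategies concrete, the sketch below turns each into a bid and counts how many hours of a price history would have exceeded that bid (hours in which the instance would have been at risk of interruption). The 10% margin, the price history, and both function names are invented for illustration; the Reserved and On-Demand rates reuse the example prices from the earlier pricing slide.

```python
def bid_prices(reserved_hourly, on_demand, spot_history, margin=0.10):
    """One candidate bid per strategy. `margin` is an assumed cushion."""
    return {
        "near_reserved": reserved_hourly,
        "above_spot_history": max(spot_history) * (1 + margin),
        "near_on_demand": on_demand,
        "above_on_demand": on_demand * (1 + margin),
    }

def hours_above_bid(bid, spot_history):
    """Hours in which the Spot price exceeded the bid."""
    return sum(1 for price in spot_history if price > bid)

# Hypothetical hourly Spot prices, including one spike above On-Demand.
history = [0.005, 0.006, 0.012, 0.005, 0.025]
bids = bid_prices(reserved_hourly=0.01, on_demand=0.02, spot_history=history)
for name, bid in bids.items():
    print(f"{name}: bid ${bid:.4f}, "
          f"{hours_above_bid(bid, history)} hour(s) above bid")
```

Lower bids cap what you pay but are exceeded more often; higher bids trade a higher price ceiling for fewer interruptions.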

Page 40: Optimizing for Costs in the Cloud


1. Bid Near the Reserved Hourly Price

66% Savings over On-Demand

Page 41: Optimizing for Costs in the Cloud

2. Bid above the Spot Price History

50% Savings over On-Demand

Page 42: Optimizing for Costs in the Cloud

3. Bid near the On-Demand Price

50% Savings over On-Demand

Page 43: Optimizing for Costs in the Cloud

4. Bid above the On-Demand Price

57% Savings over On-Demand

Page 44: Optimizing for Costs in the Cloud

Managing Interruption

Page 45: Optimizing for Costs in the Cloud

Amazon EMR (Hadoop): Run Task Nodes on Spot

[Architecture diagram: a data source uploads large datasets or log files directly to Amazon S3 as input data, alongside code/scripts (HiveQL, Pig Latin, Cascading, Mapper, Reducer). The Amazon Elastic MapReduce service runs multiple JobFlow steps on a Hadoop cluster (Name Node, Core Nodes with HDFS, Task Nodes). Output data goes to Amazon S3 and metadata to Amazon DynamoDB; BI apps query over JDBC/ODBC using HiveQL or Pig Latin.]

Page 46: Optimizing for Costs in the Cloud

Amazon EMR: Reducing Cost with Spot

Scenario #1: Job Flow duration 14 hours
Cost without Spot: 4 instances * 14 hrs * $0.45 = $25.20

Scenario #2: Job Flow duration 7 hours
Cost with Spot: 4 instances * 7 hrs * $0.45 = $12.60, plus 5 instances * 7 hrs * $0.225 = $7.875; total = $20.475

Time Savings: 50%
Cost Savings: ~19%
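The two scenarios can be checked with a few lines of arithmetic. The instance counts, durations, and hourly rates are the ones on the slide; only the helper name `job_cost` is made up.

```python
def job_cost(instances, hours, hourly_rate):
    """Cost of running `instances` machines for `hours` at `hourly_rate`."""
    return instances * hours * hourly_rate

# Scenario #1: 4 On-Demand instances for 14 hours.
without_spot = job_cost(4, 14, 0.45)

# Scenario #2: the same 4 instances plus 5 Spot instances at half the
# rate, finishing in 7 hours.
with_spot = job_cost(4, 7, 0.45) + job_cost(5, 7, 0.225)

time_savings = 1 - 7 / 14
cost_savings = 1 - with_spot / without_spot
print(f"without Spot: ${without_spot:.2f}, with Spot: ${with_spot:.3f}")
print(f"time saved: {time_savings:.0%}, cost saved: {cost_savings:.1%}")
```

Adding Spot capacity more than doubles the cluster for half the time, so the job finishes sooner and still costs less overall.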

Page 47: Optimizing for Costs in the Cloud

Use Case: Web crawling/Search using Hadoop type clusters. Use Reserved Instances for their DB workloads and Spot instances for their indexing clusters. Launch hundreds of instances.

Bidding Strategy: Bid a little above the On-Demand price to prevent interruption.

Interruption Strategy: Restart the cluster if interrupted

Made for each other: MapReduce + Spot

66% Savings over On-Demand

Page 48: Optimizing for Costs in the Cloud

Video Transcoding Application Example (On-demand + Spot)

[Architecture diagram: an intranet website (Job Manager) handles jobs. Components shown: Input Bucket and Output Bucket in Amazon S3, Input Queue and Output Queue in Amazon SQS, Amazon DynamoDB, Amazon EC2 instances (Amazon Elastic Compute Cloud), Amazon CloudWatch, and completed job reports.]

Page 49: Optimizing for Costs in the Cloud

Use of Amazon SQS in Spot Architectures

[Diagram: an Amazon SQS queue with a VisibilityTimeOut feeding an Amazon EC2 Spot Instance]
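The reason a visibility timeout suits Spot workers can be shown with a small in-memory model. This is a simulation of the behavior, not the SQS API: the `VisibilityQueue` class, its method names, and the explicit `now` clock are all invented for the example. A received message is hidden for the timeout period; if the worker is interrupted before deleting it, the message becomes visible again and another worker can pick it up.

```python
class VisibilityQueue:
    """Toy queue that mimics visibility-timeout semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}   # message id -> (body, invisible_until)
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = (body, 0.0)
        return self._next_id

    def receive(self, now):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, (body, invisible_until) in self._messages.items():
            if now >= invisible_until:
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by a worker only after the job has finished."""
        self._messages.pop(msg_id, None)

queue = VisibilityQueue(visibility_timeout=30)
queue.send("transcode video-1")
first = queue.receive(now=0)      # Spot worker A takes the job
hidden = queue.receive(now=10)    # nobody else can see it meanwhile
# Worker A is interrupted before it can delete the message.
retry = queue.receive(now=31)     # the job reappears for worker B
queue.delete(retry[0])            # worker B finishes and deletes it
```

Because the job is only removed on completion, losing a Spot instance costs some repeated work but never loses the job itself.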

Page 50: Optimizing for Costs in the Cloud

Optimizing Video Transcoding Workloads

Free Offering
• Optimize for reducing cost
• Acceptable Delay Limits

Implementation
• Set Persistent Requests
• Use on-demand Instances, if delay

Maximum Bid Price < On-demand Rate
Get your set reduced price for your workload

Premium Offering
• Optimized for Faster response times
• No Delays

Implementation
• Invest in RIs
• Use on-demand for Elasticity

Maximum Bid Price >= On-demand Rate
Get Instant Capacity for higher price

Page 51: Optimizing for Costs in the Cloud

Persistent Requests

Page 52: Optimizing for Costs in the Cloud

Architecting for Spot Instances : Best Practices

Manage interruption

• Split up your work into small increments

• Checkpointing: Save your work frequently and periodically

Test Your Application

Track when Spot Instances Start and Stop

Spot Requests

• Use Persistent Requests for continuous tasks

• Choose maximum price for your requests
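A minimal sketch of the "small increments plus checkpointing" advice follows. It assumes the work is a list of independent items and that a local JSON file is an acceptable checkpoint store; on a real Spot instance the checkpoint would need to live somewhere that survives the instance. The function name and file layout are hypothetical.

```python
import json
import os

def process_with_checkpoints(items, work, checkpoint_path):
    """Process items one at a time, saving progress after each one so a
    restarted worker resumes where the interrupted one stopped."""
    start, results = 0, []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
        start, results = state["next_index"], state["results"]

    for i in range(start, len(items)):
        results.append(work(items[i]))
        # Write to a temp file, then rename, so an interruption mid-write
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1, "results": results}, f)
        os.replace(tmp_path, checkpoint_path)
    return results
```

The smaller each increment, the less work is repeated after an interruption, at the price of more frequent checkpoint writes.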

Page 53: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

#2 Invest time in Reserved Pricing analysis (EC2, RDS)

#3 Architect for Spot Instances (bidding strategies)

#4 Leverage Application Services (ELB, SNS, SQS, SWF, SES)

Optimizing for Cost…

Page 54: Optimizing for Costs in the Cloud

Optimize by converting ancillary instances into services

• Monitoring: CloudWatch
• Notifications: SNS
• Queuing: SQS
• SendMail: SES
• Load Balancing: ELB
• Workflow: SWF
• Search: CloudSearch

Page 55: Optimizing for Costs in the Cloud

Elastic Load Balancing

Elastic Load Balancing

Pros

Elastic and Fault-tolerant

Auto scaling

Monitoring included

Cons

For Internet-facing traffic only

Software LB on EC2

Pros

Application-tier load balancer

Cons

SPOF

Elasticity has to be implemented manually

Not as cost-effective

Page 56: Optimizing for Costs in the Cloud

[Diagram: two load-balancing options in front of web servers in an Availability Zone. DNS to an EC2 instance + software LB: $0.08 per hour (small instance). DNS to an Elastic Load Balancer: $0.025 per hour.]

Page 57: Optimizing for Costs in the Cloud

Application Services

SNS, SQS, SES, SWF

Pros

Pay as you go

Scalability

Availability

High performance

Software on EC2

Pros

Custom features

Cons

Requires an instance

SPOF

Limited to one AZ

DIY administration

Page 58: Optimizing for Costs in the Cloud

[Diagram: two queuing options between a producer and consumers. SQS queue: $0.01 per 10,000 Requests ($0.000001 per Request). EC2 instance + software queue: $0.08 per hour (small instance).]
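With the two prices on this slide, the request volume at which a self-managed queue instance starts to cost less than SQS can be estimated. The sketch considers only these two prices and a single instance; the constant and function names are illustrative.

```python
SQS_PRICE_PER_REQUEST = 0.01 / 10_000   # $0.01 per 10,000 requests
EC2_SMALL_PER_HOUR = 0.08               # one small instance running a queue

def sqs_hourly_cost(requests_per_hour):
    """SQS cost for one hour at the given request rate."""
    return requests_per_hour * SQS_PRICE_PER_REQUEST

def break_even_requests_per_hour():
    """Request rate at which SQS costs the same as the instance."""
    return EC2_SMALL_PER_HOUR / SQS_PRICE_PER_REQUEST

rate = break_even_requests_per_hour()
print(f"break-even at about {rate:,.0f} requests per hour")
print(f"10,000 requests/hour on SQS: ${sqs_hourly_cost(10_000):.2f} per hour")
```

Below roughly 80,000 requests per hour the managed queue is cheaper on price alone, before counting the single point of failure and administration effort listed for the EC2 option.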

Page 59: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

#2 Invest time in Reserved Pricing analysis (EC2, RDS)

#3 Architect for Spot Instances (bidding strategies)

#4 Leverage Application Services (ELB, SNS, SQS, SWF, SES)

#5 Implement Caching (ElastiCache, CloudFront)

Optimizing for Cost…

Page 60: Optimizing for Costs in the Cloud

Optimize for performance and cost by page caching and edge-caching static content


Page 61: Optimizing for Costs in the Cloud

When am I charged?

[Diagram: clients and Edge Locations in Paris, Singapore, and London, with Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) as origins]

Page 62: Optimizing for Costs in the Cloud

When content is popular…

[Diagram: clients and Edge Locations in Paris, Singapore, and London, with Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) as origins]

Page 63: Optimizing for Costs in the Cloud

Architectural Recommendations

Use Amazon S3 + CloudFront, as it will reduce cost as well as latency for static data
• Depends on cache-hit ratio

For video streaming, use CloudFront, as there is no need for a separate streaming server running Adobe FMS

Use managed caching service (Amazon ElastiCache)
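How strongly the outcome depends on the cache-hit ratio can be seen with a simple model. Everything numeric here is a placeholder: the per-request costs are hypothetical, not AWS prices, and the model assumes every request pays an edge cost while only cache misses also pay an origin cost.

```python
def origin_requests(total_requests, cache_hit_ratio):
    """Requests that miss the cache and still reach the origin."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0.0 and 1.0")
    return total_requests * (1 - cache_hit_ratio)

def blended_cost(total_requests, cache_hit_ratio,
                 edge_cost_per_request, origin_cost_per_request):
    """Edge cost on every request plus origin cost on misses only."""
    misses = origin_requests(total_requests, cache_hit_ratio)
    return (total_requests * edge_cost_per_request
            + misses * origin_cost_per_request)

# Hypothetical unit costs, chosen only to show the trend.
for ratio in (0.0, 0.5, 0.9):
    cost = blended_cost(1_000_000, ratio, 0.000001, 0.000004)
    print(f"hit ratio {ratio:.0%}: ${cost:.2f}")
```

With a low hit ratio the edge layer adds cost on top of the origin; the savings appear only once most requests are served from the cache.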

Page 64: Optimizing for Costs in the Cloud

#1 Use only what you need (use Auto Scaling Service, modify–db)

#2 Invest time in Reserved Pricing analysis (EC2, RDS)

#3 Architect for Spot Instances (bidding strategies)

#4 Leverage Application Services (ELB, SNS, SQS, SWF, SES)

#5 Implement Caching (ElastiCache, CloudFront)

Number of ways to further save with AWS…

Page 65: Optimizing for Costs in the Cloud


Twitter: @jinman

Thank you!

Page 66: Optimizing for Costs in the Cloud

http://aws.amazon.com