Grzegorz Kochan | [email protected]

How to handle cloud failure


DESCRIPTION

"How to handle cloud failure" presentation slides. TechCamp #1 - Kraków 02.12.2011.


Page 1: How to handle cloud failure

How to handle cloud failure

Page 2: How to handle cloud failure

How to deal with a failure in the clouds

Page 3: How to handle cloud failure

AdTaily on Amazon AWS

1.5 billion widget pageviews monthly

100,000 ad clicks daily

35 thousand registered publishers

15 thousand advertisers

over 1500 requests per second

over 150 Mbit of data per second

Page 4: How to handle cloud failure

Startup in the cloud: why?

Scalability

Pricing

Availability

Simplicity

API

Page 5: How to handle cloud failure

But things can go wrong

“Amazon EC2 goes down, taking with it Reddit, Foursquare and Quora” - April 2011

“Down Goes The Internet… Again. Amazon EC2 Outage Takes Down Foursquare, Instagram, Quora, Reddit, Etc” - August 2011

TechCrunch

Page 6: How to handle cloud failure

Design for failure

Page 7: How to handle cloud failure

Amazon AWS geographical map

EU - Ireland

Availability Zones: eu-west-1a, eu-west-1b, eu-west-1c

Asia Pacific - Singapore

Availability Zones: ap-southeast-1a, ap-southeast-1b

US - N. California

Availability Zones: us-west-1a, us-west-1b, us-west-1c

US - Oregon

Availability Zones: us-west-2a, us-west-2b

Asia Pacific - Tokyo

Availability Zones: ap-northeast-1a, ap-northeast-1b

US - Virginia

Availability Zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d

Page 8: How to handle cloud failure

Replicate, duplicate and balance

Availability Zone us-east-1a

Single server setup

Multi server setup
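The move from a single server to a balanced multi server setup can be sketched as below. This is a minimal illustration, not the setup used in the talk: the `RoundRobinBalancer` class and the server names `web-1` to `web-3` are hypothetical, and a real deployment would more likely rely on a managed load balancer with its own health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotates requests across replicas, skipping ones marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # Called when a health check fails for this replica.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Called when a known replica passes its health check again.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
balancer.mark_down("web-2")  # simulate one replica failing
picks = [balancer.next_server() for _ in range(4)]
print(picks)
```

With one replica down, traffic keeps flowing to the remaining two, which is the point of duplicating the server in the first place.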

Page 9: How to handle cloud failure

Distribute: multi Availability Zone architecture

Availability Zone us-east-1a

Availability Zone us-east-1b
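A rough sketch of what spreading a fleet over two Availability Zones buys you. The helper names `spread_across_zones` and `surviving_capacity` and the fleet size of five are assumptions made for illustration; only the zone names come from the slide.

```python
from collections import Counter


def spread_across_zones(instance_count, zones):
    """Assign instances to zones round-robin so counts differ by at most one."""
    if not zones:
        raise ValueError("at least one zone is required")
    return [zones[i % len(zones)] for i in range(instance_count)]


def surviving_capacity(placement, failed_zone):
    """Number of instances still running when one whole zone is lost."""
    return sum(1 for zone in placement if zone != failed_zone)


placement = spread_across_zones(5, ["us-east-1a", "us-east-1b"])
per_zone = Counter(placement)
print(dict(per_zone))
print(surviving_capacity(placement, "us-east-1a"))
```

If the whole fleet sat in us-east-1a, losing that zone would leave nothing running; with the spread above, the smaller half survives and can be scaled back up.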

Page 10: How to handle cloud failure

Distribute more: multi Region architecture

US East Region

US West Region

Page 11: How to handle cloud failure

Design for failure: application decoupling

Stateless services

Graceful degradation

Die fast and alone

Auto recover

Backup and scale independently
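"Graceful degradation" and "die fast and alone" can be illustrated with a small circuit breaker. This is a simplified sketch under stated assumptions: the `CircuitBreaker` class, the threshold of three failures and the recommendation service are all hypothetical, and a production breaker would also retry the dependency after a cool-down period.

```python
class CircuitBreaker:
    """Stops calling a failing dependency and serves a fallback instead."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def call(self, operation, fallback):
        if self.is_open:
            # Fail fast: do not wait on a dependency known to be broken.
            return fallback()
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Degrade gracefully: the caller still gets a usable answer.
            return fallback()
        self.failures = 0
        return result


calls = []  # records every real attempt against the dependency


def fetch_recommendations():
    calls.append("attempt")
    raise TimeoutError("recommendation service timed out")


def no_recommendations():
    return []


breaker = CircuitBreaker(failure_threshold=3)
results = [breaker.call(fetch_recommendations, no_recommendations) for _ in range(5)]
print(len(calls), results)
```

The page still renders without recommendations, and after the third failure the broken service is no longer contacted at all, so its outage does not drag the caller down with it.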

Page 12: How to handle cloud failure

Decoupling example

Ecommerce Application

- shopping cart
- process orders
- process payments
- generate invoices
- send emails

Product Catalog

Order Processor

Payment Processor

Invoice Generator

Email Sender

...

Messaging (SQS)

Monitoring & Alerting (CloudWatch & SNS)

AutoScaling

[Diagram: component boxes carry labels #2, #3, #4 and #5]
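The role of the messaging layer between components can be sketched with an in-process queue. Here `queue.Queue` from the Python standard library only stands in for SQS, and the functions `place_order`, `drain` and the two sender handlers are hypothetical names chosen for this example.

```python
import queue

orders = queue.Queue()  # in-process stand-in for an SQS queue


def place_order(order_id):
    """Order Processor: accept the order and hand follow-up work to the queue."""
    orders.put({"order_id": order_id, "task": "send_email"})


def drain(work_queue, handler):
    """Worker: process everything currently queued, requeue what failed."""
    handled, failed = [], []
    while True:
        try:
            message = work_queue.get_nowait()
        except queue.Empty:
            break
        try:
            handler(message)
            handled.append(message["order_id"])
        except Exception:
            failed.append(message)
    for message in failed:
        work_queue.put(message)  # keep it for a later retry
    return handled


def broken_sender(message):
    raise ConnectionError("mail server unreachable")


sent = []


def working_sender(message):
    sent.append(message["order_id"])


for order_id in (101, 102, 103):
    place_order(order_id)  # orders are accepted even while email is down

first_pass = drain(orders, broken_sender)
backlog = orders.qsize()
second_pass = drain(orders, working_sender)
print(first_pass, backlog, second_pass)
```

While the Email Sender is down, orders keep being accepted and the messages simply wait; once it recovers, the backlog is worked off without the Order Processor ever noticing.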

Page 13: How to handle cloud failure

Automate everything

Infrastructure - custom AMIs, Chef, Puppet

Monitoring - CloudWatch

Scaling and recovering - AutoScaling

Fail and recover constantly - Chaos Monkey by Netflix
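The "fail and recover constantly" idea can be tried out on a toy model. Nothing below calls AWS or the real Chaos Monkey: `AutoScalingGroup` and `chaos_monkey` are hypothetical stand-ins that only mimic the behaviour of keeping a desired number of instances alive while one is killed at random.

```python
import random


class AutoScalingGroup:
    """Toy model of a group that keeps a desired number of instances alive."""

    def __init__(self, desired):
        self.desired = desired
        self._next_id = 0
        self.instances = set()
        self.reconcile()

    def _launch(self):
        self._next_id += 1
        self.instances.add(f"i-{self._next_id:04d}")

    def terminate(self, instance_id):
        self.instances.discard(instance_id)

    def reconcile(self):
        """Launch replacements until capacity matches the desired count."""
        launched = 0
        while len(self.instances) < self.desired:
            self._launch()
            launched += 1
        return launched


def chaos_monkey(group, rng):
    """Terminate one randomly chosen instance and return its id."""
    victim = rng.choice(sorted(group.instances))
    group.terminate(victim)
    return victim


rng = random.Random(42)  # seeded so runs are repeatable
group = AutoScalingGroup(desired=3)
history = []
for _ in range(5):
    victim = chaos_monkey(group, rng)
    replaced = group.reconcile()
    history.append((victim, replaced, len(group.instances)))
print(history)
```

Killing instances on purpose only pays off if recovery is automated; the loop above shows capacity returning to the desired count after every induced failure.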

Page 14: How to handle cloud failure

Be rational: weigh the risks and costs
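One way to weigh risk against cost is to estimate how much downtime each extra replica removes. The sketch below assumes replica failures are independent and uses a made-up 99% availability per replica; both are assumptions for illustration, and correlated failures such as a whole zone or region going down break the independence assumption.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single, replicas):
    """Availability of N replicas when one healthy replica is enough.

    Assumes independent failures: 1 - (1 - a) ** n.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return 1.0 - (1.0 - single) ** replicas


def expected_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


for replicas in (1, 2, 3):
    availability = combined_availability(0.99, replicas)  # 0.99 is hypothetical
    downtime = expected_downtime_hours(availability)
    print(f"{replicas} replica(s): {availability:.6f} available, "
          f"about {downtime:.2f} h down per year")
```

Each added replica costs roughly the same but removes less downtime than the one before, which is why the right level of redundancy depends on what an hour of outage costs the business.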

Page 15: How to handle cloud failure

More info on Amazon AWS

http://aws.amazon.com/architecture