AWS Summit Berlin 2013 - Building web scale applications with AWS


Ryan Shuttleworth, Technical Evangelist

Building Web-Scale Applications with AWS

What are we going to cover?

What’s a web scale application?

Three principles to build upon

Layering the cake: Data, Application

Total Jobs Group – their story

What do web scale apps have in common?

[Figure: demand versus time, comparing predicted demand with actual demand; regions labelled "Waste" and "Customer dissatisfaction"]

Elastic capacity

No need to guess capacity requirements and over-provision

[Figure: demand versus time, with an "Elastic capacity" curve]

Built on a global footprint

9 Regions

25 Availability Zones

Continuous Expansion

Built across regional availability zones

Relational Database Service: Database-as-a-Service

No need to install or manage database instances

Scalable and fault tolerant configurations

DynamoDB: Provisioned throughput NoSQL database

Fast, predictable performance

Fully distributed, fault tolerant architecture

Use RDS for databases

Use DynamoDB for high performance key-value DB

Architected using services

Amazon SQS: Reliable, highly scalable queue service for storing messages as they travel between instances

[Diagram labels: Amazon SQS; Processing; task/processing trigger; Processing results]
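The queue pattern on this slide can be sketched without any AWS dependency. The example below is a minimal illustration only: Python's standard `queue.Queue` stands in for an SQS queue, and the names `task_queue`, `result_queue`, `producer`, `worker` and `run_pipeline` are hypothetical. A real deployment would call the SQS API instead.

```python
import queue
import threading

# Hypothetical stand-ins for two SQS queues (assumption: in-process queues
# are used here only so the sketch runs without AWS credentials).
task_queue = queue.Queue()
result_queue = queue.Queue()


def producer(items):
    """Enqueue processing tasks and return without waiting for the workers."""
    for item in items:
        task_queue.put(item)


def worker():
    """Pull tasks until a None sentinel arrives, then stop."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        # Placeholder "processing": upper-case the task payload.
        result_queue.put(task.upper())


def run_pipeline(items, worker_count=2):
    """Run producer and workers; return the processing results, sorted."""
    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    producer(items)
    for _ in workers:
        task_queue.put(None)  # one stop sentinel per worker
    for w in workers:
        w.join()
    results = []
    while not result_queue.empty():
        results.append(result_queue.get())
    return sorted(results)
```

The producer never calls a worker directly, so the number of workers can change without touching the producer. That is the property the queue is there to provide.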

Simple Workflow: Reliably coordinate processing steps across applications

Integrate AWS and non-AWS resources

Manage distributed state in complex systems

[Diagram: numbered workflow steps Task A, Task B (Auto-scaling) and Task C]

Push inter-process workflows into the cloud with SWF

Reliable message queuing without additional software


CloudSearch: Elastic search engine based upon Amazon A9 search engine

Fully managed service with sophisticated feature set

Scales automatically

[Diagram labels: Document Server; Search Server; Results]

Don’t install search software, use CloudSearch

Process large volumes of data cost effectively with EMR

Elastic MapReduce: Elastic Hadoop cluster

Integrates with S3 & DynamoDB

Leverage Hive & Pig analytics scripts

Integrates with instance types such as spot


Three principles to build upon…

1. Scale: Elasticity, State, Data

2. Security: Inherent, VPC, Groups

3. Failure: Expected, Automation, Testing

Layering the cake

Data

Web scale data

Object storage

You put it in S3, AWS stores it with 99.999999999% durability

Highly scalable web access to objects

Multiple redundant copies in a region
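To give the durability figure some scale, the arithmetic can be worked through directly. This is a rough sketch that assumes the 99.999999999% figure is an annual, per-object design target and that losses are independent. The function names are illustrative and not part of any AWS API.

```python
from fractions import Fraction

# Assumption: 99.999999999% durability means an annual per-object loss
# probability of 1 in 10^11. Fractions keep the arithmetic exact.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_annual_losses(object_count, loss_probability=ANNUAL_LOSS_PROBABILITY):
    """Expected number of objects lost per year for a given object count."""
    return object_count * loss_probability


def years_per_lost_object(object_count, loss_probability=ANNUAL_LOSS_PROBABILITY):
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count, loss_probability)
```

Under these assumptions, storing 10,000,000 objects gives an expected loss of one object every 10,000 years on average.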

What is S3?

Highly scalable data storage

Access via APIs

A web store, not a file system

Fast

Highly available & durable

Economical

Case Study

Web scale data

Relational data

Master/Slave Horizontal Scaling

Reasonably simple to implement

Leverage PIOPS for raw performance

Easy to change instance sizes

Has an upper limit
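A minimal sketch of master/slave routing follows, assuming reads can be identified by a leading `SELECT` and that replicas are interchangeable. The class name `ReplicatedDatabase` and the endpoint strings are hypothetical; the slides do not prescribe an implementation.

```python
import itertools


class ReplicatedDatabase:
    """Route writes to one master and spread reads across read replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        # Round-robin iterator over the replicas.
        self._next_replica = itertools.cycle(self.replicas)

    def route(self, sql):
        """Return the endpoint that should execute the statement."""
        # Simplifying assumption: any statement starting with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.replicas:
            return next(self._next_replica)
        # Every write lands on the single master, which is where
        # the upper limit of this approach comes from.
        return self.master
```

Adding replicas increases read capacity only; write capacity is still bounded by the size of the one master instance.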

Sharded Horizontal Scaling

More complex at the application layer

No practical limit on scalability

Operation complexity/sophistication

Shard by function or key space

RDBMS or NoSQL

[Diagram: hash ring with shards A, B, C, D]
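The hash ring in the diagram can be sketched as a consistent-hashing lookup. The slides only show a ring with shards A to D; the class name `HashRing`, the use of SHA-256 and the number of virtual points per shard are assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Map keys to shards by position on a ring of hashed virtual points."""

    def __init__(self, shards, points_per_shard=100):
        ring = []
        for shard in shards:
            # Several virtual points per shard even out the key distribution.
            for i in range(points_per_shard):
                ring.append((self._hash(f"{shard}#{i}"), shard))
        ring.sort()
        self._positions = [position for position, _ in ring]
        self._shards = [shard for _, shard in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def shard_for(self, key):
        """Return the shard owning the first ring point after the key's hash."""
        index = bisect.bisect_right(self._positions, self._hash(key))
        # Wrap around to the start of the ring when the key hashes past the end.
        return self._shards[index % len(self._shards)]
```

With this layout, adding a fifth shard moves only the keys that the new shard takes over; every other key stays on its existing shard.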

Web scale data

NoSQL

Horizontal Scaling - Fully Managed

DynamoDB: Provisioned throughput NoSQL database

Fast, predictable performance

Fully distributed, fault tolerant architecture

Considerations for non-uniform data

DynamoDB: Provisioned read/write performance per table

Predictable high performance scaled via console or API

Dial it up

Low provisioned throughput

[Illustrative diagram only: a Region containing a single table partition on SSD]

Increased provisioned throughput

[Illustrative diagram only: the table spread across ten SSD-backed partitions within the Region]

High provisioned throughput

[Illustrative diagram only: the table spread across forty partitions within the Region]
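The partition diagrams, together with the earlier note on non-uniform data, can be illustrated with a deliberately simplified model. It assumes provisioned throughput is divided evenly across partitions and that each key lives on exactly one partition. The slides do not state how throughput is actually allocated, so both the model and the function names are assumptions.

```python
def per_partition_throughput(provisioned_units, partition_count):
    """Throughput available to each partition under an even split (assumed)."""
    return provisioned_units / partition_count


def throttled_units(requests_per_key, key_to_partition,
                    provisioned_units, partition_count):
    """Total requested units that exceed the per-partition limit."""
    limit = per_partition_throughput(provisioned_units, partition_count)
    load = [0] * partition_count
    for key, rate in requests_per_key.items():
        load[key_to_partition[key]] += rate
    # Only the load above each partition's share counts as throttled.
    return sum(max(0, partition_load - limit) for partition_load in load)


# Ten keys, one per partition, with 1,000 units provisioned in total.
placement = {f"key{i}": i for i in range(10)}
uniform = {f"key{i}": 100 for i in range(10)}
skewed = {"key0": 550, **{f"key{i}": 50 for i in range(1, 10)}}
```

Both workloads request 1,000 units in total. In this model the uniform one fits exactly, while the skewed one exceeds the hot partition's share by 450 units even though the table as a whole is not over its provisioned throughput.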

Application

Loose coupling sets you free!

The looser they're coupled, the bigger they scale

Independent components

Design everything as a black box

Decouple interactions

Load-balance clusters

Amazon SQS: Reliable, highly scalable queue service for storing messages as they travel between instances

[Diagram: Tight Coupling: Controller A, Controller B, Controller C]

[Diagram: Loose Coupling: Controller A, Controller B, Controller C, each with a queue (Q)]

Auto Scaling: Automatic resizing of compute clusters based on demand

Trigger auto-scaling policy

Feature | Details
Control | Define minimum and maximum instance pool sizes and when scaling and cool down occur.
Integrated with Amazon CloudWatch | Use metrics gathered by CloudWatch to drive scaling.
Instance types | Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.
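The controls listed above (minimum and maximum pool size, a metric threshold, and a cool-down period) can be sketched as a small decision function. This is an illustrative model only, not the Auto Scaling API; the `ScalingPolicy` class, its field names and the one-instance step size are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_size: int
    max_size: int
    scale_out_above: float   # metric threshold, e.g. average CPU percent
    scale_in_below: float
    cooldown_seconds: float
    last_action_at: float = float("-inf")

    def desired_capacity(self, current, metric, now):
        """Return the new pool size for a metric reading taken at time `now`."""
        # During the cool-down period no further scaling action is taken.
        if now - self.last_action_at < self.cooldown_seconds:
            return current
        if metric > self.scale_out_above:
            target = current + 1
        elif metric < self.scale_in_below:
            target = current - 1
        else:
            return current
        # Clamp to the configured pool size limits.
        target = max(self.min_size, min(self.max_size, target))
        if target != current:
            self.last_action_at = now
        return target
```

In practice the metric would come from a monitoring service such as CloudWatch. Here it is just a number passed in by the caller.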

Where does state reside?

Browser cookies

Framework session handler

Session database

Memory session manager

State store should be:

Performant

Scalable

Reliable

Where should state reside?

[Diagram: the instances under "Trigger auto-scaling policy" are marked "Not here"; a separate "Session state service" is shown]

State must reside OUTSIDE the scope of the elements you wish to scale
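A minimal sketch of this rule: the web servers below hold no session data themselves and read and write it through a shared store, so any server can handle any request. `SessionStore` is a hypothetical in-memory stand-in for an external session state service, and `WebServer` is an illustrative name.

```python
class SessionStore:
    """Hypothetical shared session service (in-memory stand-in)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class WebServer:
    """Stateless request handler: all session state lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        session["cart"] = session.get("cart", []) + [item]
        self.store.put(session_id, session)
        return session["cart"]
```

Because neither server keeps the cart locally, an instance can be added or terminated by the scaling policy without losing a user's session.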

Where should state reside? Performant, scalable, reliable

Load Balancing

Elastic Load Balancing: Create highly scalable applications

Distribute load across EC2 instances in multiple availability zones

Feature | Details
Available | Load balance across instances in multiple Availability Zones
Health checks | Automatically checks health of instances and takes them in or out of service
Session stickiness | Route requests to the same instance
Secure sockets layer | Supports SSL offload from web and application servers with flexible cipher support
Monitoring | Publishes metrics to CloudWatch
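The health-check behaviour described for Elastic Load Balancing can be sketched as round-robin distribution that skips instances marked unhealthy. This is an illustrative model with hypothetical names (`LoadBalancer`, `run_health_checks`, `next_instance`), not the service's actual algorithm.

```python
class LoadBalancer:
    """Round-robin across instances, skipping any that fail health checks."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._cursor = 0

    def run_health_checks(self, is_healthy):
        """Take instances in or out of service using the supplied check."""
        self.healthy = {i for i in self.instances if is_healthy(i)}

    def next_instance(self):
        """Return the next in-service instance in round-robin order."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._cursor % len(self.instances)]
            self._cursor += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances in service")
```

An instance that recovers is picked up again on the next health-check run, without any change to the callers.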


Distribution

Route53: Global DNS service

[Diagram: a request reaches Route53; Region A is measured at 16ms and Region B at 92ms; the request receives the Region A DNS entry]
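The routing decision in the diagram (16ms to Region A, 92ms to Region B, so the Region A entry is returned) amounts to picking the lowest-latency region. The sketch below assumes latencies are already measured; the endpoint names and the `resolve` function are hypothetical.

```python
# Hypothetical DNS entries for each region (assumption, not from the slides).
REGION_ENDPOINTS = {
    "Region A": "region-a.example.com",
    "Region B": "region-b.example.com",
}


def resolve(latencies_ms, endpoints=REGION_ENDPOINTS):
    """Return the DNS entry of the region with the lowest measured latency."""
    # Only consider regions that have both an endpoint and a measurement.
    candidates = [region for region in endpoints if region in latencies_ms]
    if not candidates:
        raise ValueError("no region with a known latency")
    best = min(candidates, key=lambda region: latencies_ms[region])
    return endpoints[best]
```

With the figures from the slide, `resolve({"Region A": 16, "Region B": 92})` returns the Region A entry.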

CloudFront: World-wide content distribution network

[Diagram: edge locations in London, Paris and NY]

1. Single CNAME: www.mysite.com

2. Served from EC2: *.php

3. Served from S3: /images/*
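The path rules on this slide (`/images/*` from S3, `*.php` from EC2, all behind a single CNAME) can be sketched as ordered pattern matching. This is not CloudFront configuration syntax; the `BEHAVIORS` list, the first-match-wins order and the default origin are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, origin) rules; the first matching pattern wins (assumed).
BEHAVIORS = [
    ("/images/*", "S3"),
    ("*.php", "EC2"),
]

# Assumption: requests matching no rule fall through to the EC2 origin.
DEFAULT_ORIGIN = "EC2"


def origin_for(path):
    """Return the origin that should serve the given request path."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN
```

Clients only ever see `www.mysite.com`; the split between static and dynamic origins is decided per request path.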

[Figure: response time and server load compared for three cases: No CDN, CDN for Static Content, CDN for Static & Dynamic Content]

Management

10 instances: manageable

100 instances: at a push

1,000 instances: not a chance

Automation & management

Web scale enabler

[Diagram: OpsWorks, Elastic Beanstalk, CloudFormation and EC2 arranged along an axis from convenience to control]

Summary

Use these techniques (and many others) as appropriate

Awareness of the options is the first step to good design

Scaling is the ability to move the bottlenecks around to the least expensive part of the architecture

AWS makes this easier – so your application is not a victim of its own success


Thank you
