Journey Through the AWS Cloud; Application Services

Journey through the Cloud: Application Services

Ryan Shuttleworth – Technical Evangelist @ryanAWS

Journey through the cloud

Common use cases & stepping stones into the AWS cloud

Learning from customer journeys

Best practices to bootstrap your projects

Build upon services built for the cloud

Address common pain points in application architectures

Reduce operational management of software components

Focus on application function, not undifferentiated heavy lifting


Agenda

Why AWS for application services

Services overview

Dive into select services

Where to go next

Why AWS for application services?


Services that wrap software you’d commonly install and manage yourself


Databases

Middleware

Analytics

Frameworks


Reliability Scalability Availability



Do you focus on scaling frameworks rather than optimizing your code?

Does core component reliability affect your application track record?

Is middleware ‘glue’ difficult to make as available as you need it?


Operational Management



Do you spend more time managing application services than you do building and managing assets core to your business?

[Chart: share of effort spent on your business versus infrastructure]

On-Premise Infrastructure: 30% Your Business, 70% Managing All of the “Undifferentiated Heavy Lifting”

AWS Cloud-Based Infrastructure: 70% More Time to Focus on Your Business, 30% Configuring Your Cloud Assets

Infrastructure & Application Services

Building blocks for applications

Designed and built for the cloud

Available at the end of a web service call

[Diagram labels: Security, Scaling, Database, Networking, Monitoring, Messaging, Workflow, DNS, Load Balancing, Backup, CDN, Compute, Storage]

AWS is used in a variety of ways…

Storage & Archive

Saved months of development & architecture time and focused on application development instead by using CloudSearch

By managing time-consuming database administration tasks, RDS allows SEGA to focus on business critical applications

Relies on Simple Workflow to orchestrate complex, heterogeneous scientific workflows

Brought massive DynamoDB capacity online in a short period of time to handle demand peaks during the Super Bowl

Business & technical drivers

You might be able to:

Reduce costs: Running software on EC2 can be more expensive than consuming functionality as a service

Improve reliability: Services built with inherent multi-AZ functionality improve application reliability

Scale easily: Services scale as you grow and need them, without large investments in infrastructure

Re-focus energies: Spend less time doing undifferentiated heavy lifting and more time on your business

Decoupled applications on EC2 instances

[Diagram: a queue instance (Q Instance) in one Availability Zone of a Region]

Reliable message queue on instance

[Diagram: Q Instances spread across several Availability Zones]

Multiple instances for reliability

Technical implementation

Remove software running on EC2

Replace with a regional service

Amazon Simple Queue Service (SQS)

Simplified operations

Reduced costs

Queuing on AWS

Set up & manage instances

Install & configure queuing middleware

Set up persistent message store

Implement clustering across AZs for HA

Implement queue monitoring systems

Implement queuing in applications

Amazon SQS

Create a queue

HTTP PUT to place messages

HTTP GET (long polling) to pull messages
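The place/pull pattern above can be sketched locally. This is a minimal illustration that assumes Python's standard `queue.Queue` as a stand-in for the hosted queue; it does not call SQS, and the names `orders`, `producer` and `consumer` are hypothetical. The blocking `get` with a timeout plays the role of long polling.

```python
import queue
import threading

# Local stand-in for a hosted queue; a real application would call the SQS API.
orders = queue.Queue()

def producer(messages):
    """Place messages on the queue and return without waiting for a consumer."""
    for body in messages:
        orders.put(body)

def consumer(wait_seconds, results):
    """Pull messages, waiting up to wait_seconds for each one (long polling)."""
    while True:
        try:
            body = orders.get(timeout=wait_seconds)
        except queue.Empty:
            break  # nothing arrived within the polling window
        results.append(body.upper())  # placeholder for real processing

received = []
worker = threading.Thread(target=consumer, args=(0.5, received))
worker.start()
producer(["order-1", "order-2", "order-3"])
worker.join()
print(received)
```

The producer and consumer never call each other directly, which is the decoupling the queue provides.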


AWS application services

Relational Database Service

DynamoDB

SimpleDB

Databases Middleware Frameworks Analytics

Simple Queue Service

Simple Notification Service

Simple Workflow

Elastic MapReduce

Simple Email Service

CloudSearch ElastiCache

CloudWatch



Amazon Relational Database Service

Source: Forrester

[Chart labels: Backup, recovery; Load and unload; Security planning; License training; Script automation; Installation, upgrade, patching, migration; Performance and tuning]

Frequent server upgrades

Storage upgrades

Backup and recovery

Software upgrades

Patching

Hardware crash

Query construction

Query optimization

Configuration

Migration

Schema design

[Build slides: the list is then split into “Focus on these things” and “Instead of these”]

Near zero administration

Painless patching, automatic upgrades

CloudWatch monitoring, metric alarms

One click. High Availability.

One click. High availability with Multi-AZ

Automated deployment across multiple AZs

Synchronous replication from master to replica

Automatic fail-over; replica promoted to master

Test fail-over

Push-button scale, high performance

Scale storage from 5 GB to 1 TB

Scale instance from small to 4XL (better I/O)

Add Read Replicas with asynchronous replication

Add ElastiCache for performance


Database-as-a-Service

No need to install or manage database instances

Scalable and fault tolerant configurations

Feature: Details

Platform support: Create MySQL, SQL Server and Oracle RDBMS

Preconfigured: Get started instantly with sensible default settings

Automated patching: Keep your database platform up to date automatically

Backups: Automatic backups and point in time recovery and full DB backups

Backups: Volumes can be snapshotted for point in time restore

Failover: Automated failover to slave hosts in event of a failure

Replication: Easily create read-replicas of your data and seamlessly replicate data across availability zones

RDBMS on AWS


Install & configure database platform

Configure backups

Implement master-slave for HA

Implement read-replicas for performance

Manage maintenance updates

RDS

Create RDS instance

Select Multi-AZ

Choose backup period

Choose maintenance windows
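The four RDS steps above map onto a handful of settings. The sketch below assumes the boto3 SDK's `create_db_instance` parameter names; the slides only describe the console steps, and every value here (identifier, instance class, windows, credentials) is hypothetical. The request is only assembled and summarized, not sent.

```python
# Hypothetical values throughout. The keys are assumed to be the parameter
# names of create_db_instance in boto3's RDS client.
rds_request = {
    "DBInstanceIdentifier": "forum-db",                   # assumed name
    "Engine": "mysql",
    "DBInstanceClass": "db.m1.small",
    "AllocatedStorage": 5,                                # GB
    "MasterUsername": "admin",
    "MasterUserPassword": "change-me",
    "MultiAZ": True,                                      # Select Multi-AZ
    "BackupRetentionPeriod": 7,                           # days: choose backup period
    "PreferredBackupWindow": "02:00-03:00",               # UTC
    "PreferredMaintenanceWindow": "sun:04:00-sun:05:00",  # choose maintenance window
}

def summarize(request):
    """Return a one-line, human-readable summary of the request."""
    ha = "Multi-AZ" if request["MultiAZ"] else "Single-AZ"
    return (f"{request['Engine']} {request['DBInstanceClass']} ({ha}), "
            f"backups kept {request['BackupRetentionPeriod']} days, "
            f"maintenance {request['PreferredMaintenanceWindow']}")

print(summarize(rds_request))

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("rds").create_db_instance(**rds_request)
```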


Amazon Relational Database Service (Amazon RDS) databases store forum threads, site content, and project configuration data.

High availability Multi-AZ database deployment to handle live game metadata and user-generated content.

Enterprise-grade fault tolerance for protecting customer data.

By managing time-consuming database administration tasks, Amazon RDS allows SEGA to focus on business critical applications.

Amazon DynamoDB

DynamoDB & NoSQL Database services

[Chart: performance against scalability]

Requirement: predictable, consistent performance

Reality: performance degrades with scale

Hardware provisioning

Data sharding

Data caching

Cluster management

Fault management

DynamoDB

Provisioned read/write performance per table

Predictable high performance scaled via console or API

Dial it up

[Diagrams, illustrative only]

Low provisioned throughput: a single SSD-backed table partition in the Region, with replica partitions in other Availability Zones

Increased provisioned throughput: the table is spread across more SSD-backed partitions

High provisioned throughput: the table is spread across many more partitions

DynamoDB

Provisioned throughput NoSQL database

Fast, predictable performance

Fully distributed, fault tolerant architecture

Feature: Details

Provisioned throughput: Dial up or down provisioned read/write capacity

Predictable performance: Average single digit millisecond latencies from SSD backed infrastructure

Strong consistency: Be sure you are reading the most up to date values

Fault tolerant: Data replicated across availability zones

Monitoring: Integrated with CloudWatch

Secure: Integrates with AWS Identity and Access Management (IAM)

Elastic MapReduce: Integrates with Elastic MapReduce for complex analytics on large datasets

NoSQL on AWS

Perform capacity planning and determine nodes required

Define PIOPS EBS volumes & instance types


Install NoSQL database software

Configure replica sets

Implement auto-scaling

Define tables

Define key strategy

DynamoDB

Define tables

Define key strategy

Choose performance level


“AWS gave us the flexibility to bring a massive amount of capacity online in a short period of time and allowed us to do so in an operationally straightforward way. AWS is now Shazam’s cloud provider of choice,”

Jason Titus, CTO

DynamoDB: over 500,000 writes per second

Amazon EMR: more than 1 million writes per second

Amazon Simple Workflow

Simple Workflow

Flow framework for simplification of cross system task coordination

Long running transaction state and task distribution

[Diagram: Task A, Task B (auto-scaling) and Task C, coordinated in numbered steps]

A typical business workflow…

[Diagram: New Order, Update Inventory, “$ or €?” decision, Update $ account, Update € account, Ship order, Send Email]

Multiple steps

Multiple decision points

Heterogeneous systems

[Diagram: the same workflow with its steps running on System A to System E, and with State, Process logic and Middleware marked]

State managed in end systems

Process logic embedded in applications

Complex queuing, message ordering, de-duplication and dependencies

Scaling and failover of tasks troublesome

Implemented in Simple Workflow…

[Diagram: the same workflow with a Decider, Workers, and Workflow metadata & state]

Implement a ‘Decider’ with simple, linear decision logic

Processes/tasks in a workflow become ‘Workers’

All state & metadata is handled by highly available and durable AWS ‘Workflow’
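The decider/worker split for the order workflow can be sketched as below. This is a minimal local model under stated assumptions: it does not call Simple Workflow, the activity names are hypothetical, and the `completed` list stands in for the history that the workflow service would hold.

```python
ACTIVITIES = ["update_inventory", "update_usd_account",
              "update_eur_account", "ship_order", "send_email"]

def decide(order, completed):
    """Decider: simple, linear logic that picks the next activity (None when done)."""
    if "update_inventory" not in completed:
        return "update_inventory"
    # The "$ or €?" decision point.
    account = "update_usd_account" if order["currency"] == "USD" else "update_eur_account"
    if account not in completed:
        return account
    if "ship_order" not in completed:
        return "ship_order"
    if "send_email" not in completed:
        return "send_email"
    return None

def make_worker(name):
    """Worker: performs one activity; here it only reports what it did."""
    def worker(order):
        return f"{name} done for order {order['id']}"
    return worker

WORKERS = {name: make_worker(name) for name in ACTIVITIES}

def run_workflow(order):
    completed = []  # stands in for state held by the workflow service
    while True:
        task = decide(order, completed)
        if task is None:
            return completed
        WORKERS[task](order)
        completed.append(task)

print(run_workflow({"id": 42, "currency": "EUR"}))
```

The decider holds no state of its own: given the same history it always returns the same next step, so it can be restarted or scaled without losing track of an order.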

Amazon Simple Workflow Service (SWF)

[Diagram: the Decider calls Get Decision Task and Return Result against SWF; the Worker calls Get Activity Task and Return Result]

Workflows on AWS


Install & configure workflow middleware

Architect for high availability

Implement workflows in proprietary/complicated languages

Implement process audit

Simple Workflow

Implement ‘decider’ in language of choice

Implement task ‘workers’ to consume work

Amazon CloudSearch

select * from articles where content like '%cloud%'

1 million hits a day? 5 million items?

Finding things is trickier than it first appears

Anatomy of a search

Tokenization, Fielded, Facets, Results & Ranking
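To make the tokenization step concrete, here is a minimal sketch of a token index, as opposed to scanning every row with a `like` pattern. The sample articles and the function names are hypothetical, and this is a toy model rather than how CloudSearch is implemented.

```python
import re
from collections import defaultdict

# Hypothetical sample data.
articles = {
    1: "Journey through the cloud",
    2: "Cloud search for clouds",
    3: "Managing databases",
}

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Build the index once: token -> set of article ids containing it.
index = defaultdict(set)
for doc_id, content in articles.items():
    for token in tokenize(content):
        index[token].add(doc_id)

def search(term):
    """Look the term up in the index instead of scanning every article."""
    return sorted(index.get(term.lower(), set()))

print(search("cloud"))   # [1, 2]
```

A lookup touches only the entry for one token, whereas the `like '%cloud%'` query has to inspect the content of every row.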

Endpoints

search-mydomain.us-east-1.cloudsearch.amazonaws.com

doc-mydomain.us-east-1.cloudsearch.amazonaws.com

Facets

Query String

Results

Testing

Integration

Domain: http://search-toilets-in-oz-m6gko2xujhti47jv5w62eldyoy.us-east-1.cloudsearch.amazonaws.com/2011-02-01/search

Query String: ?q=True+Blue

Search Options: &return-fields=address1%2Caddressnote%2Cf__name%2Clastupdatedate%2Clatitude%2Clongitude%2Cnotes%2Cpostcode%2Ctoiletdetails_id%2Ctoileturl%2Ctext_relevance
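The request shown on this slide can be assembled with the standard library. A minimal sketch, assuming the endpoint, API version and field names exactly as they appear on the slide; the helper name `build_search_url` is hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode

# Values copied from the slide.
SEARCH_ENDPOINT = ("http://search-toilets-in-oz-m6gko2xujhti47jv5w62eldyoy"
                   ".us-east-1.cloudsearch.amazonaws.com")
API_VERSION = "2011-02-01"
RETURN_FIELDS = [
    "address1", "addressnote", "f__name", "lastupdatedate", "latitude",
    "longitude", "notes", "postcode", "toiletdetails_id", "toileturl",
    "text_relevance",
]

def build_search_url(query, fields=RETURN_FIELDS):
    """Return the search URL for a query string and a list of return fields."""
    params = {"q": query, "return-fields": ",".join(fields)}
    # urlencode turns spaces into '+' and commas into '%2C'.
    return f"{SEARCH_ENDPOINT}/{API_VERSION}/search?{urlencode(params)}"

url = build_search_url("True Blue")
print(url)
```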

{
  "rank":"-text_relevance",
  "match-expr":"(label 'True Blue')",
  "hits":{"found":1,
    "start":0,
    "hit":[{"id":"toiletdetails_csv_41",
      "data":{"address1":["51-59 Samwell Street"],
        "addressnote":["3 male and 3 female toilets"],
        "f__name":["Croydon True Blue Visitor"],
        "lastupdatedate":["2010-03-04Z"],
        "latitude":["-18.204249"],
        "longitude":["142.244534"],
        "notes":[],
        "postcode":["4871"],
        "text_relevance":["309"],
        "toiletdetails_id":["40"]
      }]
  },
  "info":{"rid":"ccd66a5219f938d295e4326391ce31e1359fb7bc306564",
    "time-ms":3,
    "cpu-time-ms":0}
}
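A response of this shape can be read with the standard `json` module. The sketch below uses a shortened copy of the response from the slide; the helper `first_value` is hypothetical and reflects the fact that each returned field arrives as a list, which may be empty.

```python
import json

# Shortened copy of the response shown on the slide.
response_text = '''
{"rank": "-text_relevance",
 "match-expr": "(label 'True Blue')",
 "hits": {"found": 1, "start": 0,
          "hit": [{"id": "toiletdetails_csv_41",
                   "data": {"address1": ["51-59 Samwell Street"],
                            "f__name": ["Croydon True Blue Visitor"],
                            "notes": [],
                            "postcode": ["4871"]}}]},
 "info": {"time-ms": 3, "cpu-time-ms": 0}}
'''

def first_value(data, field, default=None):
    """Each field is returned as a list; take its first entry if there is one."""
    values = data.get(field, [])
    return values[0] if values else default

response = json.loads(response_text)
found = response["hits"]["found"]
names = [first_value(hit["data"], "f__name") for hit in response["hits"]["hit"]]
print(found, names)
```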

Results Search

Integration

Scalable

Elastic Search

[Chart: requests over time, with copies of a small search instance (Partition 1) added and removed as load changes]

Time: 1800h, Copy 1 at <80% CPU

Time: 2000h, Copy 1 at >80% CPU, Copy 2 added

Time: 2200h, >80% CPU, Copy 3 added

Time: 0000h, <30% CPU, copies removed

Cost savings

[Chart: elastic capacity follows the request curve; the gap to traditional required capacity is marked as savings]

Cloud Search

[Diagram: search instances gain copies (Copy 1 to Copy n) as traffic grows (request volume & complexity) and gain partitions (Partition 1 to Partition n) as data grows (document quantity & size)]
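The scaling behaviour in these charts can be modelled with two thresholds. This is a rough sketch only: the 80% and 30% figures come from the slides, while the CPU readings, the function name and the one-copy-at-a-time rule are assumptions for illustration and not a description of the service's actual algorithm.

```python
SCALE_OUT_CPU = 80  # add a copy above this CPU percentage (from the slides)
SCALE_IN_CPU = 30   # remove a copy below this CPU percentage (from the slides)

def next_copy_count(copies, cpu_percent):
    """Return the number of copies to run given the current CPU load."""
    if cpu_percent > SCALE_OUT_CPU:
        return copies + 1
    if cpu_percent < SCALE_IN_CPU and copies > 1:
        return copies - 1  # never drop below a single copy
    return copies

# Assumed CPU readings at the times shown on the slides.
readings = [("1800h", 60), ("2000h", 85), ("2200h", 90), ("0000h", 20)]

copies = 1
history = []
for hour, cpu in readings:
    copies = next_copy_count(copies, cpu)
    history.append((hour, copies))

print(history)
```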

Search on AWS

Perform capacity planning on size of indexes

Set up & manage required number of instances

Install & configure search software such as Solr across cluster

Manage cluster partitioning and size over time

Implement monitoring

Cloud Search

Create search domain in console

Upload documents

Retrieve results

Amazon Elastic MapReduce

1 instance for 100 hours = 100 instances for 1 hour

Small instance = $8

Managed, elastic Hadoop cluster

Integrates with S3 & DynamoDB

Leverage Hive & Pig analytics scripts

Integrates with instance types such as spot

Elastic MapReduce

Feature: Details

Scalable: Use as many or as few compute instances running Hadoop as you want. Modify the number of instances while your job flow is running

Integrated with other services: Works seamlessly with S3 as origin and output. Integrates with DynamoDB

Comprehensive: Supports languages such as Hive and Pig for defining analytics, and allows complex definitions in Cascading, Java, Ruby, Perl, Python, PHP, R, or C++

Cost effective: Works with Spot instance types

Monitoring: Monitor job flows from within the management console

But what is it?

A framework

Splits data into pieces

Lets processing occur

Gathers the results

[Build slides: a very large dataset (e.g. TBs) containing lots of instances of ‘X’]

Split the log into many small pieces

Process in an EMR cluster

Aggregate the results from all the nodes

Aggregate view of ‘X’

Insight in a fraction of the time
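The split, process and aggregate steps can be shown in miniature. This sketch uses a thread pool on one machine in place of a Hadoop cluster; the log lines and function names are hypothetical, and the ‘X’ being counted is the request path.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log; in practice this would be terabytes of data in S3.
log = ["GET /home", "GET /cart", "GET /home",
       "POST /cart", "GET /home", "GET /search"]

def split(lines, size):
    """Split the log into many small pieces."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def map_chunk(chunk):
    """Process one piece independently: count requests per path."""
    return Counter(line.split()[1] for line in chunk)

def reduce_counts(partials):
    """Aggregate the partial results from all the workers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(map_chunk, split(log, 2)))

totals = reduce_counts(partials)
print(totals["/home"], totals["/cart"], totals["/search"])   # 3 2 1
```

Because each piece is processed independently, adding workers shortens the elapsed time without changing the result.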

1 instance for 100 hours = 100 instances for 1 hour

Small instance = $8

1 instance for 1,000 hours = 1,000 instances for 1 hour

Small instance = $80
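The arithmetic behind these figures: $8 for 100 instance-hours implies 8 cents per small-instance hour, and cost depends only on the product of instances and hours. A minimal sketch, with the rate derived from the slide rather than taken from a price list:

```python
RATE_CENTS_PER_HOUR = 8  # implied by the slide: $8 for 100 instance-hours

def job_cost_dollars(instances, hours):
    """Cost depends on total instance-hours, not on how they are arranged."""
    return instances * hours * RATE_CENTS_PER_HOUR / 100

print(job_cost_dollars(1, 100), job_cost_dollars(100, 1))      # 8.0 8.0
print(job_cost_dollars(1, 1000), job_cost_dollars(1000, 1))    # 80.0 80.0
```

The same spend buys the answer in 1 hour instead of 100 or 1,000.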

[Build slides: Elastic MapReduce architecture]

Input data in S3

Code

Name node

Elastic cluster

HDFS

Queries + BI via JDBC, Pig, Hive

Output to S3 + SimpleDB

Hadoop on AWS

Configure & manage EC2 instances

Set up Namenode

Set up Datanodes

Create HDFS file system

Set up extras (Hive, Pig etc.)

Prepare data

Execute job

Elastic MapReduce

Specify desired cluster size

Execute job

Features powered by Amazon Elastic MapReduce:

People Who Viewed this Also Viewed

Review highlights

Auto complete as you type on search

Search spelling suggestions

Top searches

Ads

200 Elastic MapReduce jobs per day

Processing 3TB of data

“With Amazon Elastic MapReduce, there was no upfront investment in hardware, no hardware procurement delay, and no need to hire additional operations staff. Because of the flexibility of the platform, our first new online advertising campaign experienced a 500% increase in return on ad spend from a similar campaign a year before.”

Mark Taylor, Program Director

Elastic MapReduce

Where to go next

http://aws.typepad.com

http://aws.amazon.com/whitepapers

$100 of credits

That’s enough to run an Amazon Elastic Compute Cloud (Amazon EC2) Standard Small Instance for approximately 600 hours

Or ~600 EC2 Standard Small Instances for 1 hour

Powerof60.com

Summary

AWS is more than compute & storage

Databases, middleware and complex frameworks as services

Remove the operational burden from running common software services

Application services give access to scalable, sophisticated software without the overheads

aws.amazon.com
