
AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)


Page 1: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Debanjan Saha, GM, Amazon Aurora

[email protected]

November 29, 2016

GPST402

Amazon Aurora Deep Dive

Page 2: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Outline

► What is Amazon Aurora?

Background and history

► Key Aurora value proposition

Performance, availability, ease of use, cost of ownership

► Call to action – need your help

Migration, ISV integration, cloud native stacks

Page 3: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

A bit of history …

Re-imagining relational databases for the cloud era

Page 4: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Relational databases were not designed for the cloud

Multiple layers of functionality – SQL, transactions, caching, and logging – all in a monolithic stack.

Page 5: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Not much has changed in the last 20 years. Even when you scale it out, you're still replicating the same stack.

[Diagram: multiple applications, each in front of paired database instances that replicate the full SQL / Transactions / Caching / Logging stack on top of shared storage.]

Page 6: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Re-imagining the relational database

1. Scale-out, distributed, multi-tenant design

2. Service-oriented architecture leveraging AWS services

3. Fully managed service – automate administrative tasks

Page 7: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Scale-out, distributed, multi-tenant architecture

[Diagram: a master and three replicas spread across Availability Zones 1–3, all attached to a shared storage volume backed by storage nodes with SSDs.]

► Purpose-built, log-structured distributed storage system designed for databases

► Storage volume is striped across hundreds of storage nodes distributed over three different Availability Zones

► Six copies of data, two copies in each Availability Zone, to protect against AZ+1 failures

► Plan to apply the same principles to other layers of the stack

Page 8: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Leveraging cloud ecosystem

► Lambda – invoke Lambda events from stored procedures/triggers.

► S3 – load data from S3; store snapshots and backups in S3.

► IAM – use IAM roles to manage database access control.

► CloudWatch – upload system metrics and audit logs to CloudWatch.

Page 9: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Automate administrative tasks

You handle: schema design, query construction, query optimization.

AWS handles: automatic failover, backup & recovery, isolation & security, industry compliance, push-button scaling, automated patching, advanced monitoring, routine maintenance.

Aurora takes care of your time-consuming database management tasks, freeing you to focus on your applications and business.

Page 10: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Meet Amazon Aurora … databases reimagined for the cloud

Speed and availability of high-end commercial databases

Simplicity and cost-effectiveness of open source databases

Drop-in compatibility with MySQL

Simple pay-as-you-go pricing

Delivered as a managed service

Page 11: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Aurora customer adoption

Fastest-growing service in AWS history. Aurora is used by two-thirds of the top 100 AWS customers and 8 of the top 10 gaming customers.

Page 12: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Who is moving to Aurora, and why?

Customers using MySQL engines:

► Higher performance – up to 5x

► Better availability and durability

► Reduced cost – up to 60%

► Easy migration; no application change

Customers using commercial engines:

► One-tenth of the cost; no licenses

► Integration with the cloud ecosystem

► Comparable performance and availability

► Migration tooling and services

Page 13: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Amazon Aurora is fast …

5x faster than MySQL

Page 14: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

MySQL SysBench results – R3.8XL: 32 cores / 244 GB RAM

Five times higher throughput than stock MySQL, based on industry-standard benchmarks: 5x faster than RDS MySQL 5.6 and 5.7.

[Bar charts: WRITE PERFORMANCE (axis to 150,000) and READ PERFORMANCE (axis to 700,000) for Aurora, MySQL 5.6, and MySQL 5.7.]

Page 15: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Aurora scaling

With user connections – UP TO 8x FASTER:

Connections | Amazon Aurora | RDS MySQL w/ 30K IOPS
50          | 40,000        | 10,000
500         | 71,000        | 21,000
5,000       | 110,000       | 13,000

With number of tables – UP TO 11x FASTER:

Tables | Amazon Aurora | MySQL I2.8XL (local SSD) | RDS MySQL w/ 30K IOPS (single AZ)
10     | 60,000        | 18,000                   | 25,000
100    | 66,000        | 19,000                   | 23,000
1,000  | 64,000        | 7,000                    | 8,000
10,000 | 54,000        | 4,000                    | 5,000

With database size (SysBench) – UP TO 21x FASTER:

DB size | Amazon Aurora | RDS MySQL w/ 30K IOPS
1 GB    | 107,000       | 8,400
10 GB   | 107,000       | 2,400
100 GB  | 101,000       | 1,500
1 TB    | 26,000        | 1,200

With database size (TPC-C) – UP TO 136x FASTER:

DB size | Amazon Aurora | RDS MySQL w/ 30K IOPS
80 GB   | 12,582        | 585
800 GB  | 9,406         | 69

Page 16: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Real-life data: gaming workload – Aurora vs. RDS MySQL (r3.4XL, Multi-AZ)

Aurora was 3x faster on r3.4xlarge.

Page 17: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

MySQL with a replica issues several types of write – binlog, data pages, double-write buffer, and FRM files – each going to EBS (with an EBS mirror) in AZ 1, shipped to a replica instance in AZ 2, and backed up asynchronously to Amazon S3.

Amazon Aurora's primary instance issues only redo log records, sent as 4/6 quorum distributed writes to storage across AZ 1–3, with replica instances in the other AZs and asynchronous backup to Amazon S3.

MySQL IO profile for a 30-minute SysBench run: 780K transactions; 7,388K I/Os per million transactions (excludes mirroring, standby); an average of 7.4 I/Os per transaction.

Aurora IO profile for the same 30-minute SysBench run: 27,378K transactions (35x MORE); 950K I/Os per million transactions, a 6x amplification (7.7x LESS).

How did we achieve this?

Page 18: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

New performance enhancements

Read performance
► Smart selector
► Logical read ahead
► Read views

Write performance
► NUMA-aware scheduler
► Latch-free lock manager
► Instant schema update

Meta-data access
► B-tree concurrency
► Catalog concurrency
► Faster index build

Page 19: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Use case: MySQL shard consolidation

[Diagram: many MySQL master/slave shard pairs consolidated into a single Aurora cluster with a master, a read replica, and a shared distributed storage volume.]

The customer, a global SaaS provider, was using hundreds of MySQL shards to avoid MySQL performance and connection-scalability bottlenecks. They consolidated 29 MySQL shards into a single r3.4xlarge Aurora cluster. Even after consolidation, cluster utilization is only 30%, with plenty of headroom to grow.

Page 20: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Use case: Massively concurrent event store – for messaging, gaming, IoT

The customer, a global mobile messaging platform, was using a NoSQL key-value database for user messages:

► ~22 million accesses per hour (70% read, 30% write); billing grows linearly with traffic.

► Scalability bottleneck where certain portions (partitions) of the data became "hot" and overloaded with requests.

The new Aurora-backed data store reduces operational costs by 40%:

► Storage cost will be less than 50% of the current cost with the NoSQL solution.

► The cost of reading data (70% of user traffic) is almost eliminated due to the memory-bound nature of the workload.

► Using a custom data API, the customer was able to port the application from a non-relational to a relational database with modest engineering effort.

Page 21: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

What about availability?

“Performance only matters if your database is up”

Page 22: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

6-way replicated storage survives catastrophic failures

► Six copies across three Availability Zones

► 4 out of 6 write quorum; 3 out of 6 read quorum

► Peer-to-peer replication for repairs

► Volume striped across hundreds of storage nodes

[Diagram: with one AZ lost, the volume retains both read and write availability; with an AZ lost plus one more copy, it still retains read availability.]
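The quorum arithmetic above can be sanity-checked in a few lines. A minimal sketch, where the parameters (6 copies, 4/6 write, 3/6 read) come from the slide and everything else is illustrative:

```python
from itertools import combinations

COPIES, WRITE_Q, READ_Q = 6, 4, 3

def quorums_overlap():
    """Every write quorum and read quorum share at least one copy,
    so a read always sees the latest acknowledged write."""
    nodes = set(range(COPIES))
    for w in combinations(nodes, WRITE_Q):
        for r in combinations(nodes, READ_Q):
            if not set(w) & set(r):
                return False
    return True

def survives_az_plus_one():
    """AZ+1 failure: lose both copies in one AZ plus one more node.
    6 - 3 = 3 survivors: enough to read (3/6) but not to write (4/6)."""
    survivors = COPIES - 2 - 1
    return survivors >= READ_Q, survivors >= WRITE_Q

print(quorums_overlap())       # True
print(survives_az_plus_one())  # (True, False)
```

The overlap holds simply because WRITE_Q + READ_Q > COPIES, which is exactly why losing write availability under AZ+1 failure still leaves the data readable.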

Page 23: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Up to 15 promotable read replicas

[Diagram: master and read replicas on a shared distributed storage volume, fronted by a reader end-point.]

► Up to 15 promotable read replicas across multiple Availability Zones

► Redo-log-based replication leads to low replica lag – typically < 10 ms

► Reader end-point with load balancing; customer-specifiable failover order
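The customer-specifiable failover order can be pictured as replicas carrying a promotion priority. A hypothetical sketch (the tier-then-size rule mirrors documented Aurora behavior, but field names here are illustrative, not the RDS API):

```python
# Pick the failover target: lowest promotion tier wins; within a tier,
# the larger instance is preferred. Purely illustrative data model.
def pick_failover_target(replicas):
    return min(replicas, key=lambda r: (r["tier"], -r["size_gib"]))

replicas = [
    {"id": "replica-a", "tier": 1, "size_gib": 61},
    {"id": "replica-b", "tier": 0, "size_gib": 30},
    {"id": "replica-c", "tier": 0, "size_gib": 122},
]
print(pick_failover_target(replicas)["id"])  # replica-c
```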

Page 24: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Use case: Near real-time analytics and reporting

[Diagram: master and read replicas on a shared distributed storage volume behind a reader end-point.]

► Up to 15 promotable read replicas

► Low replica lag – typically < 10 ms

► Reader end-point with load balancing

A customer in the travel industry migrated their core reporting application, accessed by ~1,000 internal users, to Aurora:

► Replicas can be created, deleted, and scaled within minutes based on load.

► Read-only queries are load balanced across the replica fleet through a DNS endpoint – no application configuration needed when replicas are added or removed.

► Low replication lag allows mining fresh data with no delay, immediately after the data is loaded.

► Significant performance gains for core analytics queries – some queries executing in 1/100th of the original time.

Page 25: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Cross-region read replicas – faster disaster recovery and enhanced data locality

► Promote a read replica to a master for faster recovery in the event of a disaster

► Bring data close to your customers' applications in different regions

► Promote to a master for easy migration

Page 26: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Instant crash recovery

Traditional databases:

► Have to replay logs since the last checkpoint – typically 5 minutes between checkpoints

► Replay is single-threaded in MySQL and requires a large number of disk accesses

► A crash at T0 requires re-application of the SQL in the redo log since the last checkpoint

Amazon Aurora:

► Underlying storage replays redo records on demand as part of a disk read

► Parallel, distributed, asynchronous – no replay needed at startup

► A crash at T0 results in redo logs being applied to each segment on demand, in parallel, asynchronously

Page 27: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Automated failover in 15 secs

[Timeline diagrams: with MySQL, the app is down through failure detection, DNS propagation, and two recovery phases. With Aurora and the MariaDB driver, the cycle completes in roughly 5–6 sec plus 5–10 sec before the app is running again.]

Page 28: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Continuous backup

[Diagram: per-segment snapshots plus streamed redo log records over time, up to a recovery point, for segments 1–3.]

• Take periodic snapshot of each segment in parallel; stream the redo logs to Amazon S3

• Backup happens continuously without performance or availability impact

• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes

• Apply log streams to segment snapshots in parallel and asynchronously
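The restore path above can be modeled in a few lines: each segment starts from its own snapshot, then applies only the redo records newer than the snapshot and at or before the recovery point. A toy sketch with an illustrative data model:

```python
# Restore one segment: start from its snapshot, apply qualifying redo records.
def restore_segment(snapshot, redo_log, recovery_point):
    state = dict(snapshot["pages"])
    for rec in redo_log:
        # Skip records already captured by the snapshot or past the recovery point.
        if snapshot["lsn"] < rec["lsn"] <= recovery_point:
            state[rec["page"]] = rec["value"]
    return state

snapshot = {"lsn": 100, "pages": {"p1": "a", "p2": "b"}}
redo_log = [
    {"lsn": 90,  "page": "p1", "value": "old"},   # already in snapshot
    {"lsn": 120, "page": "p2", "value": "c"},     # applied
    {"lsn": 150, "page": "p1", "value": "late"},  # past the recovery point
]
print(restore_segment(snapshot, redo_log, recovery_point=130))
# {'p1': 'a', 'p2': 'c'}
```

Because each segment carries its own snapshot and log stream, the real system can run this per-segment work in parallel and asynchronously, which is what makes restore fast.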

Page 29: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

New availability features

Read replicas
► Read replica end-point
► Specifiable fail-over order
► Faster fail-overs – < 15 secs

X-region DR
► Cross-region replication
► Cross-region snapshot copy *Coming soon*
► Cross-account snapshot sharing *Coming soon*

Page 30: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Amazon Aurora is easy to use

Automated storage management, security and compliance,

advanced monitoring, database migration.

Page 31: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Simplify storage management

► Instantly create user snapshots – no performance impact

► Automatic storage scaling up to 64 TB, auto-incremented in 10 GB units – no performance impact

► Automatic restriping, mirror repair, hot-spot management, encryption
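The "auto-incremented in 10 GB units, up to 64 TB" behavior amounts to rounding usage up to the next unit under a cap. A minimal sketch, assuming binary units (GiB/TiB) for the limits, which the slide does not specify:

```python
import math

GIB_PER_UNIT = 10
MAX_GIB = 64 * 1024  # 64 TB ceiling (binary-unit assumption)

def allocated_gib(used_gib):
    """Volume size grows in 10 GB units with usage, capped at 64 TB."""
    units = math.ceil(used_gib / GIB_PER_UNIT)
    return min(units * GIB_PER_UNIT, MAX_GIB)

print(allocated_gib(7))      # 10
print(allocated_gib(41))     # 50
print(allocated_gib(70000))  # 65536 (capped)
```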

Page 32: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Security and compliance

► Encryption to secure data at rest using customer-managed keys
  • AES-256, hardware accelerated
  • All blocks on disk and in Amazon S3 are encrypted
  • Key management via AWS KMS

► Encrypted cross-region replication and snapshot copy; SSL to secure data in transit

► Advanced auditing and logging without any performance impact *NEW*

► Industry-standard security and data-protection certifications – SOC, ISO, PCI/DSS, HIPAA/BAA *NEW*

[Diagram: the database engine and each storage node use per-node data keys wrapped by the customer master key(s).]

Page 33: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Advanced monitoring

50+ system/OS metrics | sorted process-list view | 1–60 sec granularity | alarms on specific metrics | egress to CloudWatch Logs | integration with third-party tools

Page 34: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Amazon Aurora saves you money

1/10th of the cost of commercial databases

Cheaper than even MySQL

Page 35: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Cost of ownership: Aurora vs. MySQL – MySQL configuration hourly cost

Instances: primary (r3.8XL), standby (r3.8XL), and two replicas (r3.8XL) at $1.33/hr each

Storage: four volumes of 6 TB each – 10K PIOPS on primary and standby, 5K PIOPS on the replicas

Instance cost: $5.32 / hr
Storage cost: $8.30 / hr
Total cost: $13.62 / hr

Page 36: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Cost of ownership: Aurora vs. MySQL – Aurora configuration hourly cost

Instances: primary (r3.8XL) and two replicas (r3.8XL) at $1.62/hr each; single shared 6 TB storage volume at $4.43/hr

Instance cost: $4.86 / hr
Storage cost: $4.43 / hr
Total cost: $9.29 / hr

31.8% savings:

► No idle standby instance
► Single shared storage volume
► No PIOPs – pay for the I/O you use
► Reduction in overall IOPs

*At a macro level, Aurora saves over 50% in storage cost compared to RDS MySQL.

Page 37: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Cost of ownership: Aurora vs. MySQL – further opportunity for savings

Use smaller instances and pay-as-you-go storage: primary and two replicas on r3.4XL at $0.81/hr each, plus the same shared 6 TB volume at $4.43/hr.

Instance cost: $2.43 / hr
Storage cost: $4.43 / hr
Total cost: $6.86 / hr

49.6% savings

Storage IOPs assumptions:
1. Average IOPs is 50% of max IOPs
2. 50% savings from shipping logs vs. full pages
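The three hourly-cost slides above reduce to simple arithmetic, which reproduces the quoted totals and savings percentages:

```python
# Hourly cost = sum of per-instance prices plus storage, per the slides.
def total(instances_per_hr, storage_per_hr):
    return round(sum(instances_per_hr) + storage_per_hr, 2)

mysql_cfg    = total([1.33] * 4, 8.30)  # 4 x r3.8XL + provisioned-IOPS volumes
aurora_cfg   = total([1.62] * 3, 4.43)  # 3 x r3.8XL + single shared volume
aurora_small = total([0.81] * 3, 4.43)  # same topology on r3.4XL

def savings(new):
    return round(100 * (mysql_cfg - new) / mysql_cfg, 1)

print(mysql_cfg, aurora_cfg, aurora_small)         # 13.62 9.29 6.86
print(savings(aurora_cfg), savings(aurora_small))  # 31.8 49.6
```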

Page 38: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Higher performance, lower cost

► Fewer instances needed
► Smaller instances can be used
► No need to pre-provision storage
► No additional storage for read replicas

Safe.com lowered their bill by 40% by switching from sharded MySQL to a single Aurora instance. Double Down Interactive (gaming) lowered their bill by 67% while also achieving better latencies (most queries ran faster) and lower CPU utilization.

Page 39: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Is r3.large too expensive for dev/test? We just introduced t2.medium *NEW*

Instance      | vCPU | Mem (GiB) | Hourly price*
db.t2.medium  | 2    | 4         | $0.082
db.r3.large   | 2    | 15.25     | $0.29
db.r3.xlarge  | 4    | 30.5      | $0.58
db.r3.2xlarge | 8    | 61        | $1.16
db.r3.4xlarge | 16   | 122       | $2.32
db.r3.8xlarge | 32   | 244       | $4.64

t2 RI discounts: up to 34% with a 1-year RI; up to 57% with a 3-year RI

*Prices are for Virginia

Page 40: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Call to action – need your help!

Page 41: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

MySQL to Aurora migration

RDS MySQL to Aurora: console-based automated snapshot ingestion into Aurora, then catch-up via binlog replication.

EC2/on-premises MySQL to Aurora: take a snapshot and load it to Amazon S3; Aurora ingests the binary snapshot through S3, then catches up via binlog replication.

Many-to-one migration: consolidate multiple MySQL shards into a single Aurora instance using AWS Database Migration Service (DMS).

Page 42: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Oracle, SQL Server to Aurora migration

AWS Schema Conversion Tool:

► Assessment report: SCT analyzes the source database and provides a report with a recommended target engine and information on automatic and manual conversions.

► Code browser and recommendation engine: highlights places that require manual edits and provides architectural and design guidelines.

AWS Database Migration Service:

► Data copy: existing data is copied from source tables to tables on the target.

► Change data capture and apply: changes to the source data are captured while the tables are loaded. Once the load is complete, buffered changes are applied to the target. Additional changes captured on the source are applied to the target until the task is stopped or terminated.

Page 43: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Code browsing and assessment using SCT

Page 44: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Data migration using DMS

A replication instance moves change logs and data pages from Oracle / SQL Server to Aurora.

► Choose between migrating a data snapshot only, live replication (catch-up), or catch-up only based on an alternative bootstrap/import mechanism

► Choose between overwriting existing tables or appending to them

► Choose the degree of table-import parallelism; defaults to eight (8)

► Create table mappings between source and target, e.g. lowercase table names

► Once satisfied with the selections, trigger the migration and watch it progress

DMS creates tables at the target database and populates them with data from the source. It can use multiple processes, each loading one entire table. A migration can be paused and restarted: when restarted it continues from where it stopped and reloads any tables that were in progress.
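The "lowercase table names" mapping mentioned above is expressed as a table-mapping document. A sketch shaped like the JSON DMS accepts (a selection rule plus a transformation rule); schema and table names are examples, so verify against the DMS documentation before use:

```python
import json

# Select every table in one schema, then lowercase table names on the target.
mapping = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-hr",
            "object-locator": {"schema-name": "HR", "table-name": "%"},
            "rule-action": "include",
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "lowercase-tables",
            "rule-target": "table",
            "object-locator": {"schema-name": "HR", "table-name": "%"},
            "rule-action": "convert-lowercase",
        },
    ]
}
print(json.dumps(mapping, indent=2))
```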

Page 45: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Integration with ISV ecosystem

Business intelligence | data integration | query and monitoring | SI and consulting

"We ran our compatibility test suites against Amazon Aurora and everything just worked." – Dan Jewett, Vice President of Product Management at Tableau (source: Amazon)

Page 46: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

AWS ecosystem

► Lambda – generate Lambda events from Aurora stored procedures. *NEW*

► S3 – load data from S3; store snapshots and backups in S3. *NEW*

► IAM – use IAM roles to manage database access control. *NEW*

► CloudWatch – upload system metrics and audit logs to CloudWatch. *1-2 months*

Page 47: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Use case: Event-driven data pipeline

► Enable database developers to create rich software features accessible from the SQL layer.

► Run code in response to ad-hoc requests, triggers, or scheduled database events in any language supported by AWS Lambda (Java, Node.js, Python).

► Simplify custom database logic by moving it from functions, stored procedures, etc. to a cloud-based code repository.

► Accelerate the migration from any programmable database.

[Diagram: AWS services generate data into an S3 data lake; an S3 event triggers a Lambda call that loads the data from S3 into Amazon Aurora; a second Lambda call delivers an "S3 load completed" notification.]
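The notification leg of this pipeline could be a small Lambda handler. A hypothetical sketch: the event shape, field names, and messages below are made up for illustration (the database side would trigger the invocation via Aurora's Lambda integration):

```python
# Hypothetical Lambda handler: turn "load completed" records into notifications.
def handler(event, _context=None):
    notifications = []
    for record in event.get("records", []):
        if record.get("status") == "LOAD_COMPLETED":
            notifications.append(
                f"table {record['table']} loaded {record['rows']} rows"
            )
    return {"delivered": len(notifications), "messages": notifications}

event = {"records": [
    {"status": "LOAD_COMPLETED", "table": "events", "rows": 1200},
    {"status": "LOAD_FAILED", "table": "users", "rows": 0},
]}
print(handler(event))
# {'delivered': 1, 'messages': ['table events loaded 1200 rows']}
```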

Page 48: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

We are here to help

Your super friendly Aurora team:

Debanjan Saha, GM – [email protected]
Rory Richardson, BD – [email protected]
David Lang, PM – [email protected]
Linda Wu, PM – [email protected]
Chuck Edwards, PDM – [email protected]
Brayley-Berger, PDM – [email protected]

Page 49: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Other Aurora sessions at re:Invent

DAT322 – Workshop: Stretching Scalability: Doing More with Amazon Aurora
Option 1: Wed 2:00–4:30, Mirage, Trinidad B
Option 2: Thu 2:30–5:00, Mirage, Antigua B

DAT301 – Amazon Aurora Best Practices: Getting the Best Out of Your Databases
Wed 5:30–6:30, Venetian, Level 4, Lando 4205

DAT303 – Deep Dive on Amazon Aurora
Thu 11:30–12:30, Venetian, Level 4, Delfino 4004

Page 50: AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)

Thank you!