33
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Aurora Relational database reimagined for the cloud Debanjan Saha GM, Amazon Aurora Amazon Web Services April, 2016 Now available in Seoul region!

Amazon Aurora (Debanjan Saha) - AWS DB Day

Embed Size (px)

Citation preview

Page 1: Amazon Aurora (Debanjan Saha) - AWS DB Day

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Amazon AuroraRelational database reimagined for the cloud

Debanjan Saha – GM, Amazon Aurora

Amazon Web Services

April, 2016

Now available in Seoul region!

Page 2: Amazon Aurora (Debanjan Saha) - AWS DB Day

MySQL-compatible relational database

Performance and availability of

commercial databases

Simplicity and cost-effectiveness of

open source databases

Delivered as a managed service

What is Amazon Aurora?

Page 3: Amazon Aurora (Debanjan Saha) - AWS DB Day

Customers have been frustrated by the proprietary nature, high cost, and

licensing terms of traditional, commercial-grade database providers. And while

many companies have started moving toward more open engines like MySQL

and Postgres, they often struggle to get the performance they need. Customers

asked us if we could eliminate that inconvenient trade-off, and that’s why we

built Aurora.

Jeff Bezos, Founder and CEO, Amazon.com

Annual letter to the share holders, 2016

Page 4: Amazon Aurora (Debanjan Saha) - AWS DB Day

Re-imagined for the cloud

Architected for the cloud – e.g. moved the

logging and storage layer into a

multitenant, scale-out database-optimized

storage service

Leverages existing AWS services: Amazon

EC2, Amazon VPC, Amazon DynamoDB,

Amazon SWF, and Amazon S3

Maintain compatibility with MySQL –

customers can migrate their MySQL

applications as-is, use all MySQL tools.

Control PlaneData Plane

Amazon

DynamoDB

Amazon SWF

Amazon Route 53

Logging + Storage

SQL

Transactions

Caching

Amazon S3

1

2

3

Page 5: Amazon Aurora (Debanjan Saha) - AWS DB Day

Fastest growing service in AWS history

Aurora customer adoption

Page 6: Amazon Aurora (Debanjan Saha) - AWS DB Day

Expedia: Online travel marketplaceMigration from SQL Server

Real-time business intelligence and analytics on a

growing corpus of online travel market place data.

Current SQL server based architecture is too

expensive. Performance degrades as data

volume grows.

Cassandra with Solr index requires large memory

footprint and hundreds of nodes, adding cost.

Aurora benefits:

Aurora meets scale and performance

requirements with much lower cost.

25,000 inserts/sec with peak up to 70,000. 30 ms

average response time for write and 17 ms for

read.

World’s leading online travel

company, with a portfolio that

includes more than 150 travel sites

in 70 countries.

Page 7: Amazon Aurora (Debanjan Saha) - AWS DB Day

Pearson Education: Publishing and testingMigration from MySQL

Pearson’s applications enable student

registration, instruction, and testing

Database reliability is critical when dealing with

millions of students’ data. Data cannot be lost.

Aurora benefits:

Allowed Pearson to stop self-managing their

database while still achieving their performance

and availability goals.

“No more babysitting MySQL traditional async

replication”

A leading firm in educational publishing,

testing, and certification, as well as in-class

and online learning tools. Pearson operates

in over 70 countries and serves millions of

students.

Page 8: Amazon Aurora (Debanjan Saha) - AWS DB Day

Higher Performance, Lower Cost

• Fewer instances needed

• Smaller instances can be used

• No need to pre-provision storage

• No additional storage for read replicas

Safe.com lowered their bill by 40% by switching from sharded

MySQL to a single Aurora instance.

Double Down Interactive (gaming) lowered their bill by 67%

while also achieving better latencies (most queries ran faster)

and lower CPU utilization.

Page 9: Amazon Aurora (Debanjan Saha) - AWS DB Day

Amazon Aurora is fast …

5x faster than MySQL on SYSBENCH

Page 10: Amazon Aurora (Debanjan Saha) - AWS DB Day

WRITE PERFORMANCE READ PERFORMANCE

MySQL SysBench results

R3.8XL: 32 cores / 244 GB RAM

5X faster than RDS MySQL 5.6 & 5.7

Five times higher throughput than stock MySQL

based on industry standard benchmarks.

0

25,000

50,000

75,000

100,000

125,000

150,000

0

100,000

200,000

300,000

400,000

500,000

600,000

700,000

Aurora MySQL 5.6 MySQL 5.7

Page 11: Amazon Aurora (Debanjan Saha) - AWS DB Day

WRITE PERFORMANCE READ PERFORMANCE

Scaling with instance sizes

Aurora scales with instance size for both read and write.

Aurora MySQL 5.6 MySQL 5.7

Page 12: Amazon Aurora (Debanjan Saha) - AWS DB Day

Reproducing benchmark results

h t t p s : / / d 0 . a w s s t a t i c . c o m / p r o d u c t - m a rk e t i n g / Au r o r a / R D S_ Au r o r a _ Pe r f o r m a n c e _ As s e s s m e n t _ Be n c hm a r k i n g _ v 1 - 2 . p d f

AMAZON

AURORA

R3.8XLARGE

R3.8XLARGE

R3.8XLARGE

R3.8XLARGE

R3.8XLARGE

• Create an Amazon VPC (or use an existing one).

• Create four EC2 R3.8XL client instances to run the

SysBench client. All four should be in the same AZ.

• Enable enhanced networking on your clients

• Tune your Linux settings (see whitepaper)

• Install Sysbench version 0.5

• Launch a r3.8xlarge Amazon Aurora DB Instance in

the same VPC and AZ as your clients

• Start your benchmark!

1

2

3

4

5

6

7

Page 13: Amazon Aurora (Debanjan Saha) - AWS DB Day

Beyond benchmarks

If only real world applications saw benchmark performance

POSSIBLE DISTORTIONS

Real world requests contend with each other

Real world metadata rarely fits in data dictionary cache

Real world data rarely fits in buffer cache

Real world production databases need to run with HA enabled

Page 14: Amazon Aurora (Debanjan Saha) - AWS DB Day

Do fewer IOs

Minimize network packets

Cache prior results

Offload the database engine

DO LESS WORK

Process asynchronously

Reduce latency path

Use lock-free data structures

Batch operations together

BE MORE EFFICIENT

How did we achieve this?

DATABASES ARE ALL ABOUT I/O

NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND

HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES

Page 15: Amazon Aurora (Debanjan Saha) - AWS DB Day

What about availability

“Performance only matters if your database is up”

Page 16: Amazon Aurora (Debanjan Saha) - AWS DB Day

Six copies across three availability zones

4 out 6 write quorum; 3 out of 6 read quorum

Peer-to-peer replication for repairs

Volume striped across hundreds of storage nodes

SQL

Transaction

AZ 1 AZ 2 AZ 3

Caching

SQL

Transaction

AZ 1 AZ 2 AZ 3

Caching

Read and write availability Read availability

Fault-tolerant storage

Page 17: Amazon Aurora (Debanjan Saha) - AWS DB Day

Read replicas are failover targets

Aurora cluster contains primary node

and up to fifteen secondary nodes

Failing database nodes are

automatically detected and replaced

Failing database processes are

automatically detected and recycled

Secondary nodes automatically

promoted on persistent outage, no

single point of failure

Customer application may scale-out

read traffic across secondary nodes

AZ 1 AZ 3AZ 2

Primary

NodePrimary

NodePrimary

Node

Primary

NodePrimary

NodeSecondary

Node

Primary

NodePrimary

NodeSecondary

Node

Customer specifiable fail-over order

Read balancing across read replicas

Page 18: Amazon Aurora (Debanjan Saha) - AWS DB Day

Faster failover

AppRunningFailure Detection DNS Propagation

Recovery Recovery

DBFailure

MYSQL

App

Running

Failure Detection DNS Propagation

Recovery

DB

Failure

AURORA WITH MARIADB DRIVER

1 5 - 2 0 s e c

3 - 2 0 s e c

Page 19: Amazon Aurora (Debanjan Saha) - AWS DB Day

ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]

ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN

[DISK index | NODE index] FOR INTERVAL interval

ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type

[TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval

Simulate failures using SQL

To cause the failure of a component at the database node:

To simulate the failure of disks:

To simulate the failure of networking:

Page 20: Amazon Aurora (Debanjan Saha) - AWS DB Day

Compatible with the MySQL ecosystem

Use all your existing tools and applications

Page 21: Amazon Aurora (Debanjan Saha) - AWS DB Day

Well established MySQL ecosystem

Business Intelligence Data Integration Query and Monitoring SI and Consulting

Source: Amazon

“We ran our compatibility test suites against Amazon Aurora and everything

just worked." - Dan Jewett, Vice President of Product Management at Tableau

Page 22: Amazon Aurora (Debanjan Saha) - AWS DB Day

Advanced monitoring

50+ system/OS metrics | sorted process list view | 1–60 sec granularity

alarms on specific metrics | egress to Amazon CloudWatch Logs | integration with third-party tools

ALARM

Page 23: Amazon Aurora (Debanjan Saha) - AWS DB Day

Many 3rd party monitoring tools

Page 24: Amazon Aurora (Debanjan Saha) - AWS DB Day

Correlate Aurora metrics with metrics and events from the rest of your infrastructure

Monitoring the whole stack

Page 25: Amazon Aurora (Debanjan Saha) - AWS DB Day

How much does it cost?

Less expensive than MySQL

1/10 the cost of commercial databases

Page 26: Amazon Aurora (Debanjan Saha) - AWS DB Day

Cost of ownership: Aurora vs. MySQLMySQL configuration hourly cost

Primary

r3.8XL

Standby

r3.8XL

Replica

r3.8XLReplica

R3.8XL

Storage6 TB / 10 K PIOP

Storage6 TB / 10 K PIOP

Storage6 TB / 5 K PIOP

Storage6 TB / 5 K PIOP

$1.33/hr

$1.33/hr

$1.33/hr $1.33/hr

$2,42/hr

$2,42/hr $2,42/hr

Instance cost: $5.32 / hr

Storage cost: $8.30 / hr

Total cost: $13.62 / hr

$2,42/hr

Page 27: Amazon Aurora (Debanjan Saha) - AWS DB Day

Cost of ownership: Aurora vs. MySQLAurora configuration hourly cost

Instance cost: $4.86 / hr

Storage cost: $4.43 / hr

Total cost: $9.29 / hr

Primary

r3.8XL

Replica

r3.8XL

Replica

R3.8XL

Storage / 6 TB

$1.62 / hr $1.62 / hr $1.62 / hr

$4.43 / hr

*At a macro level Aurora saves over 50% in

storage cost compared to RDS MySQL.

31.8%

Savings

No idle standby instance

Single shared storage volume

No POIPs – pay for use I/O

Reduction in overall IOP

Page 28: Amazon Aurora (Debanjan Saha) - AWS DB Day

Cost of ownership: Aurora vs. MySQLFurther opportunity for saving

Instance cost: $2.43 / hr

Storage cost: $4.43 / hr

Total cost: $6.86 / hrStorage IOPs assumptions:

1. Average IOPs is 50% of Max IOPs

2. 50% savings from shipping logs vs. full pages

49.6%

Savings

Primary

r3.8XL

Replica

r3.8XLReplica

r3.8XL

Storage / 6TB

$0.81 / hr $0.81 / hr $0.81 / hr

$4.43 / hr

r3.4XL r3.4XL r3.4XL

Use smaller instance size

Pay-as-you-go storage

Page 29: Amazon Aurora (Debanjan Saha) - AWS DB Day

Ready to move?

We made it easy to migrate ..

Page 30: Amazon Aurora (Debanjan Saha) - AWS DB Day

1. Establish baseline

a. RDS MySQL to Aurora DB

snapshot migration

b. MySQL dump/import

2. Catch-up changes

a. Binlog replication

b. Tungsten replicator

Simplify migration from RDS MySQL

Application Users

MySQL Aurora

Network

Page 31: Amazon Aurora (Debanjan Saha) - AWS DB Day

Migration from EC2 & on-premise MySQL

Data migration service• Logical data replication from on-premise or EC2

• Code & schema conversion across engines

S3 integration• Load partial datasets directly from / to S3

• Ingest large database snapshots (>2TB)

• Snowball integration• Ingest huge database snapshots (>10TB)

• Send us your data in a suitcase!

Page 32: Amazon Aurora (Debanjan Saha) - AWS DB Day

Move data to the same or different database engine

Keep your apps running during the migration

Start your first migration in 10 minutes or less

Replicate within, to, or from Amazon EC2 or RDS

AWS Database

Migration Service

Migration non-MySQL databases

Page 33: Amazon Aurora (Debanjan Saha) - AWS DB Day

Q&A