Getting Started with Amazon DynamoDB

Page 1: Getting Started with Amazon DynamoDB

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Dean Bryen - Solutions Architect - AWS - @deanbryen

Mashooq Badar - Co-Founder - Codurance - @codurance

July 7, 2016

Getting Started with Amazon DynamoDB

Page 2: Getting Started with Amazon DynamoDB

Agenda

Brief history of data processing

SQL vs. NoSQL

DynamoDB tables, API, data types, indexes

Scaling

Streams and Triggers

Customer Case Study - Codurance

Page 3: Getting Started with Amazon DynamoDB

History of Data Processing

Page 4: Getting Started with Amazon DynamoDB

Timeline of database technology

[Figure: data pressure over time, with successive technologies: Ledgers, Unit Records, Data Drums, File Systems, RDBMS, NoSQL]

Page 5: Getting Started with Amazon DynamoDB

Data volume since 2010

[Figure: data volume, historical vs. current]

90% of stored data generated in last 2 years

1 terabyte of data in 2010 equals 6.5 petabytes today

Linear correlation between data pressure and technical innovation

No reason these trends will not continue over time

Page 6: Getting Started with Amazon DynamoDB

SQL vs. NoSQL

Page 7: Getting Started with Amazon DynamoDB

Amazon’s path to DynamoDB

[Figure: RDBMS and DynamoDB]

Page 8: Getting Started with Amazon DynamoDB

Relational vs. non-relational databases

Traditional SQL: Primary and Secondary DB, scale up

NoSQL: many DB nodes, scale out

Page 9: Getting Started with Amazon DynamoDB

Why NoSQL?

SQL                      NoSQL
Optimized for storage    Optimized for compute
Normalized/relational    Denormalized/hierarchical
Ad hoc queries           Instantiated views
Scale vertically         Scale horizontally
Good for OLAP            Built for OLTP at scale

Page 10: Getting Started with Amazon DynamoDB

SQL vs. NoSQL schema design

NoSQL design optimises for compute instead of storage

Page 11: Getting Started with Amazon DynamoDB

Intro to DynamoDB

Page 12: Getting Started with Amazon DynamoDB

Amazon DynamoDB

Fully managed

Low cost

Predictable performance

Massively scalable

Highly available

Page 13: Getting Started with Amazon DynamoDB

Over 200 million users

Over 4 billion items stored

Millions of ads per month

Cross-device ad solutions

130+ million new users in 1 year

150+ million messages per month

Process requests in milliseconds High-performance ads

Statcast uses burst scalability for many games on a single day

Flexibility for fast growth

Web clickstream insights

Specialty online and retail stores

Over 5 billion items processed daily

About 200 million messages processed daily

Cognitive training

Job-matching platform

5+ million registered users

Mobile game analytics

10M global users

Home security

Wearable and IoT solutions

170,000 concurrent players

Page 14: Getting Started with Amazon DynamoDB

Consistently low latency at scale

PREDICTABLE PERFORMANCE!

Page 15: Getting Started with Amazon DynamoDB

High availability and durability

WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)

READS
• Strongly or eventually consistent
• No latency trade-off

Designed to support 99.99% availability

Built for high durability

Page 16: Getting Started with Amazon DynamoDB

How DynamoDB scales

[Figure: a table split into partitions 1 .. N]

DynamoDB automatically partitions data
• Partition key spreads data (and workload) across partitions
• Automatically partitions as data grows and throughput needs increase

Large number of unique hash keys + uniform distribution of workload across hash keys

High-scale apps

Page 17: Getting Started with Amazon DynamoDB

Flexibility and low cost

[Figure: a table with provisioned reads per second and writes per second]

Customers can configure a table for just a few RPS or for hundreds of thousands of RPS

Customers only pay for how much they provision

Provides maximum flexibility to adjust expenditure based on the workload
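As a rough sketch of what "provisioning" looks like in practice, the snippet below builds the arguments for a table-creation request with separate read and write capacity. The table names, key names, and capacity numbers are hypothetical; the dictionary follows the shape of the DynamoDB `CreateTable` request as exposed by boto3's `create_table`, but nothing is sent to AWS here. Note that capacity units depend on item size, so they only equal requests per second for small items.

```python
def table_request(name, partition_key, read_units, write_units):
    """Build keyword arguments in the shape of a DynamoDB CreateTable request.

    All names passed in are illustrative; no network call is made.
    """
    return {
        "TableName": name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
        ],
        # Reads and writes are provisioned (and billed) independently.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# A table sized for a handful of requests per second...
small = table_request("Registrations", "Email", 5, 5)
# ...and one sized for hundreds of thousands.
large = table_request("Clickstream", "SessionId", 200_000, 100_000)
```

With credentials configured, a request like this could be submitted with `boto3.client("dynamodb").create_table(**small)`.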

Page 18: Getting Started with Amazon DynamoDB

Fully managed service = automated operations

[Figure: three columns (DB hosted on-premises, DB hosted on Amazon EC2, Amazon DynamoDB), each listing the same operational tasks]

• App Optimisation
• Scaling
• High Availability
• Database Backups
• DB s/w patches
• DB s/w installs
• OS patches
• OS installation
• Server Maintenance
• Rack & Stack
• Power, HVAC, net

Page 19: Getting Started with Amazon DynamoDB

DynamoDB Tables and Indexes

Page 20: Getting Started with Amazon DynamoDB

DynamoDB table structure

[Figure: a table contains items; each item contains attributes]

Partition key
• Mandatory
• Key-value access pattern
• Determines data distribution

Sort key
• Optional
• Model 1:N relationships
• Enables rich query capabilities

All items for key
• ==, <, >, >=, <=
• “begins with”
• “between”
• “contains”
• “in”
• sorted results
• counts
• top/bottom N values
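To make the sort-key capabilities more concrete, here is a minimal sketch of a Query request that asks for the most recent orders of one customer within a range. The table and attribute names (`Orders`, `CustomerId`, `OrderId`) are assumptions for illustration; the dictionary follows the shape of the low-level DynamoDB `Query` request, and nothing is sent to AWS.

```python
def latest_orders_query(customer_id, first_order, last_order, n):
    """Build a Query request: one partition key, a sort-key range, top N."""
    return {
        "TableName": "Orders",  # hypothetical table name
        # Equality on the partition key plus a "between" condition on the sort key.
        "KeyConditionExpression": "CustomerId = :c AND OrderId BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":c": {"N": str(customer_id)},
            ":lo": {"N": str(first_order)},
            ":hi": {"N": str(last_order)},
        },
        # Results come back sorted by sort key; False means descending order.
        "ScanIndexForward": False,
        # Together with the descending order, this yields the top N values.
        "Limit": n,
    }


query = latest_orders_query(customer_id=2, first_order=10, last_order=99, n=3)
```

Adding `"Select": "COUNT"` to the same request would return a count of matching items instead of the items themselves.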

Page 21: Getting Started with Amazon DynamoDB

Partition keys
• Partition key uniquely identifies an item
• Partition key is used for building an unordered hash index
• Allows table to be partitioned for scale

Key space: 00 to FF, with boundaries at 54/55 and A9/AA

Id = 1, Name = Jim: Hash (1) = 7B
Id = 2, Name = Andy, Dept = Eng: Hash (2) = 48
Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD
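The mapping from a hash value to a slice of the key space can be sketched as follows. This is an illustration only: the boundaries (54, A9, FF) are taken from the slide, while the use of MD5 and of a single byte as the hash is an assumption, since the slides do not say which hash function DynamoDB uses internally.

```python
import hashlib
from bisect import bisect_left

# Inclusive upper bound of each partition's slice of the 00..FF key space,
# matching the boundaries drawn on the slide.
PARTITION_UPPER_BOUNDS = [0x54, 0xA9, 0xFF]


def key_hash(partition_key):
    """Map a partition key to one byte (0x00..0xFF). MD5 is a stand-in hash."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    return digest[0]


def partition_for_hash(hash_value):
    """Return the 1-based partition whose key-space range contains hash_value."""
    return bisect_left(PARTITION_UPPER_BOUNDS, hash_value) + 1


# The three hash values shown on the slide land in three different partitions.
slide_hashes = {"Id = 2": 0x48, "Id = 1": 0x7B, "Id = 3": 0xCD}
placement = {item: partition_for_hash(h) for item, h in slide_hashes.items()}
```

Because placement depends only on the hash of the partition key, many distinct keys with evenly spread traffic are what keep the partitions evenly loaded.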

Page 22: Getting Started with Amazon DynamoDB

Partition:Sort key
• Partition:Sort key uses two attributes together to uniquely identify an item
• Within unordered hash index, data is arranged by the sort key
• No limit on the number of items (∞) per partition key
  • Except if you have local secondary indexes

Key space: 00:0 to FF:∞

Partition 1 (00:0 to 54:∞), Hash (2) = 48
  Customer# = 2, Order# = 10, Item = Pen
  Customer# = 2, Order# = 11, Item = Shoes

Partition 2 (55 to A9:∞), Hash (1) = 7B
  Customer# = 1, Order# = 10, Item = Toy
  Customer# = 1, Order# = 11, Item = Boots

Partition 3 (AA to FF:∞), Hash (3) = CD
  Customer# = 3, Order# = 10, Item = Book
  Customer# = 3, Order# = 11, Item = Paper
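The layout above can be imitated with a small in-memory model: items are grouped by partition key, and within each group they are kept ordered by sort key, which is what makes range queries cheap. The class and method names are hypothetical, and this is a toy model of the idea rather than how DynamoDB stores data.

```python
from bisect import insort
from collections import defaultdict


class OrdersTable:
    """Toy table keyed by (customer, order); items stay sorted per customer."""

    def __init__(self):
        # partition key -> list of (sort key, item), kept in sort-key order
        self._collections = defaultdict(list)

    def put(self, customer, order, item):
        rows = self._collections[customer]
        # The (partition key, sort key) pair is unique, so overwrite on repeat.
        rows[:] = [row for row in rows if row[0] != order]
        insort(rows, (order, item))

    def query(self, customer, low=None, high=None):
        """Return one customer's items with low <= order <= high, sorted."""
        rows = self._collections.get(customer, [])
        return [
            (order, item)
            for order, item in rows
            if (low is None or order >= low) and (high is None or order <= high)
        ]


table = OrdersTable()
table.put(2, 11, "Shoes")
table.put(2, 10, "Pen")
table.put(1, 10, "Toy")
table.put(1, 11, "Boots")

customer_2_orders = table.query(2)
customer_1_recent = table.query(1, low=11)
```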

Page 23: Getting Started with Amazon DynamoDB

Partitions are three-way replicated

[Figure: the items (Id = 2, Name = Andy, Dept = Engg; Id = 3, Name = Kim, Dept = Ops; Id = 1, Name = Jim) are spread across Partition 1, Partition 2, ... Partition N, and each partition is stored in Replica 1, Replica 2, and Replica 3]

Page 24: Getting Started with Amazon DynamoDB

Local secondary index (LSI)
• Alternate sort key attribute
• Index is local to a partition key

Table: A1 (partition), A2 (sort), A3, A4, A5

LSIs:
KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)

10 GB maximum per partition key; LSIs limit the number of range (sort) keys!
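The three projection types can be illustrated with a short function that computes which attributes an index entry would carry for a given item. The function name and signature are hypothetical; only the projection names (`KEYS_ONLY`, `INCLUDE`, `ALL`) and the A1..A5 attribute layout come from the slide.

```python
def project(item, table_keys, index_keys, projection_type, include=()):
    """Return the attributes an index entry would carry for one item."""
    # Index keys come first, followed by any table keys not already listed.
    key_attrs = list(dict.fromkeys(list(index_keys) + list(table_keys)))
    if projection_type == "KEYS_ONLY":
        wanted = key_attrs
    elif projection_type == "INCLUDE":
        wanted = key_attrs + [name for name in include if name not in key_attrs]
    elif projection_type == "ALL":
        wanted = list(item)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: item[name] for name in wanted if name in item}


item = {"A1": "p", "A2": "s", "A3": 3, "A4": 4, "A5": 5}
table_keys = ("A1", "A2")

keys_only = project(item, table_keys, ("A1", "A3"), "KEYS_ONLY")
include_a3 = project(item, table_keys, ("A1", "A4"), "INCLUDE", include=("A3",))
everything = project(item, table_keys, ("A1", "A5"), "ALL")
```

A narrower projection keeps the index smaller, at the cost of going back to the table for attributes that were not projected.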

Page 25: Getting Started with Amazon DynamoDB

Global secondary index (GSI)
• Alternate partition and/or sort key
• Index is across all partition keys

Table: A1 (partition), A2, A3, A4, A5

GSIs:
KEYS_ONLY: A2 (partition), A1 (item key)
INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)

Online indexing

Read capacity units (RCUs) and write capacity units (WCUs) are provisioned separately for GSIs
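Because a GSI carries its own capacity, its definition includes a throughput block separate from the table's. The sketch below builds one entry for the `GlobalSecondaryIndexes` list of a `CreateTable` request; the index name and capacity figures are hypothetical, and the attribute names reuse A3..A5 from the slide.

```python
def gsi_definition(name, partition_key, sort_key=None, include=(),
                   read_units=1, write_units=1):
    """Build one GlobalSecondaryIndexes entry for a CreateTable request."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    if include:
        projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(include)}
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}

    return {
        "IndexName": name,
        "KeySchema": key_schema,
        "Projection": projection,
        # Provisioned independently of the base table's throughput.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# The "INCLUDE A3" index from the slide: A5 as partition key, A4 as sort key.
by_a5 = gsi_definition("ByA5", "A5", sort_key="A4", include=("A3",),
                       read_units=100, write_units=50)
```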

Page 26: Getting Started with Amazon DynamoDB

How do GSI updates work?

[Figure: Client, Table (primary table partitions), Global secondary index]

1. Update request (client to table)
2. Update response (table to client)
2. Asynchronous update (in progress) (table to global secondary index)

If GSIs don’t have enough write capacity, table writes will be throttled!

Page 27: Getting Started with Amazon DynamoDB

LSI or GSI?

• LSI can be modelled as a GSI
• If data size in an item collection > 10 GB, use GSI
• If eventual consistency is okay for your scenario, use GSI!

Page 28: Getting Started with Amazon DynamoDB

Scaling

Page 29: Getting Started with Amazon DynamoDB

Scaling

Throughput
• Provision any amount of throughput to a table

Size
• Add any number of items to a table
  • Maximum item size is 400 KB
  • LSIs limit the number of range keys due to 10 GB limit

Scaling is achieved through partitioning

Page 30: Getting Started with Amazon DynamoDB

Throughput

Provisioned at the table level
• Write capacity units (WCUs) are measured in 1 KB per second
• Read capacity units (RCUs) are measured in 4 KB per second
  • RCUs measure strongly consistent reads
  • Eventually consistent reads cost 1/2 of consistent reads

Read and write throughput limits are independent

[Figure: RCU and WCU]
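The capacity-unit definitions above translate into simple arithmetic. The sketch below assumes item sizes are rounded up to the next 1 KB (writes) or 4 KB (reads) boundary, which is how DynamoDB meters requests; the function names are hypothetical.

```python
import math


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: one unit per 1 KB (rounded up) per write per second."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed: one unit per 4 KB (rounded up) per strongly consistent read."""
    units = math.ceil(item_size_bytes / 4096) * reads_per_second
    # Eventually consistent reads cost half as much.
    return units if strongly_consistent else math.ceil(units / 2)


# 2.5 KB items written 10 times per second: each write rounds up to 3 KB.
wcu = write_capacity_units(2560, 10)
# 3 KB items read 100 times per second fit in one 4 KB unit each.
rcu_strong = read_capacity_units(3072, 100)
rcu_eventual = read_capacity_units(3072, 100, strongly_consistent=False)
```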

Page 31: Getting Started with Amazon DynamoDB

Partitioning math

In the future, these details might change…

Number of partitions

By capacity: (Total RCU / 3000) + (Total WCU / 1000)
By size: Total Size / 10 GB
Total partitions: CEILING(MAX (Capacity, Size))

Page 32: Getting Started with Amazon DynamoDB

Partitioning example

Table size = 8 GB, RCUs = 5000, WCUs = 500

Number of partitions

By capacity: (5000 / 3000) + (500 / 1000) = 2.17
By size: 8 / 10 = 0.8
Total partitions: CEILING(MAX (2.17, 0.8)) = 3

RCUs and WCUs are uniformly spread across partitions

RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB
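The partitioning arithmetic can be written out as a short function. The constants (3000 RCUs, 1000 WCUs, 10 GB per partition) are the ones quoted on the slides, which also warn that these details may change; the function name is hypothetical.

```python
import math


def partition_count(total_rcu, total_wcu, size_gb):
    """Estimate the number of partitions using the formula from the slides."""
    by_capacity = total_rcu / 3000 + total_wcu / 1000
    by_size = size_gb / 10
    return math.ceil(max(by_capacity, by_size))


# The example table: 8 GB, 5000 RCUs, 500 WCUs.
partitions = partition_count(total_rcu=5000, total_wcu=500, size_gb=8)

# Throughput and data are spread evenly, so each partition gets an equal share.
rcu_per_partition = 5000 / partitions
wcu_per_partition = 500 / partitions
data_per_partition_gb = 8 / partitions
```

The per-partition share is the practical takeaway: a single hot partition key can use at most the throughput of its partition, not that of the whole table.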

Page 33: Getting Started with Amazon DynamoDB

To learn more, please attend: Deep Dive on DynamoDB, Floor 0, Room 3, 14:00–14:45, Andreas Chatzakis, Solutions Architect

Page 34: Getting Started with Amazon DynamoDB

DynamoDB Streams and Triggers

Page 35: Getting Started with Amazon DynamoDB

Integration capabilities

DynamoDB Triggers
❑ Implemented as AWS Lambda functions
❑ Your code scales automatically
❑ Java, Node.js, and Python

DynamoDB Streams
❑ Stream of table updates
❑ Asynchronous
❑ Exactly once
❑ Strictly ordered
❑ 24-hr lifetime per item
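A trigger is a Lambda function that receives batches of stream records. The handler below is a minimal sketch that tallies records by event type and collects the keys that changed. The record fields used (`Records`, `eventName`, `dynamodb.Keys`) follow the DynamoDB Streams event format delivered to Lambda; the sample event, the `Email` key, and what the handler does with the records are assumptions for illustration.

```python
def handler(event, context):
    """Count stream records by event type and collect the keys that changed."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    changed_keys = []
    for record in event.get("Records", []):
        event_name = record["eventName"]
        counts[event_name] = counts.get(event_name, 0) + 1
        # Keys are in DynamoDB's typed format, e.g. {"Email": {"S": "..."}}.
        changed_keys.append(record["dynamodb"]["Keys"])
    return {"counts": counts, "changed_keys": changed_keys}


# Hand-written sample event for local testing; the values are illustrative.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Email": {"S": "a@example.com"}},
                "NewImage": {"Email": {"S": "a@example.com"}, "Name": {"S": "A"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Email": {"S": "b@example.com"}}},
        },
    ]
}

result = handler(sample_event, None)
```

Calling the handler directly with a hand-built event, as here, is a convenient way to exercise trigger logic without deploying it.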

Page 36: Getting Started with Amazon DynamoDB

Customer Case Study - Codurance

Page 37: Getting Started with Amazon DynamoDB

Building AWS Loft Registration Site

@mashooq

Page 43: Getting Started with Amazon DynamoDB

Serverless Architecture

Ireland (eu-west-1)

[Architecture diagram; labels: Secure, users, admin, QR Reader, awsloft.london, AWS WAF, Amazon CloudFront, AWS Lambda, Amazon DynamoDB, Amazon SES, AWS KMS, AWS IAM, Amazon CloudWatch, AWS CloudTrail]

S3: Static HTML/CSS and JavaScript content for the site.

API for all dynamic content (proxy to AWS Lambda)

DynamoDB is a piece of the puzzle.

Page 44: Getting Started with Amazon DynamoDB

Fast feedback …

• Single Server Local Environment
  • Local DynamoDB
  • Simulated API Gateway
  • Abstracted Lambda API
  • Mocked Encryption
  • Mocked External Services
    • SES, KMS etc.

• Microservices based Cloud Environment
  • Continuously deploy to QA
  • One click deployment to production
    • Hot deployment

Page 45: Getting Started with Amazon DynamoDB

Persistence Options

• RDS (Postgres)
  • Out of box backup and recovery
  • Out of box encryption
  • Mature development tooling and libraries
  • Possible downtime during scaling
  • Relatively complicated migrations
  • More complicated to model hierarchical structure

• DynamoDB
  • Elastic Scaling
  • Evolutionary Schema design
  • Easy to get started
  • Custom encryption
  • Complicated joins
  • Backup and recovery using pipelines

Page 46: Getting Started with Amazon DynamoDB

Lessons

• DynamoDB is easy to get started
• Runs locally for Dev environments
• Tooling and libraries are surprisingly mature
• API (at least in Clojure/Java) is simple
• Custom encryption is inconvenient but easy to overcome
• Backups using pipelines are straightforward
• Schema migrations are rare
• Possibly more cost efficient if you plan well or use autoscaling
• Easy to monitor

• … it’s painless

Page 47: Getting Started with Amazon DynamoDB

Thank You!