AWS July Webinar Series - Getting Started with Amazon DynamoDB

Introduction to DynamoDB

Nate Slater, AWS Solutions Architect

July 30, 2015

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.



Agenda

• What is DynamoDB?
• DynamoDB Fundamentals
• Typical Workloads and Use Cases
• Demo


What is DynamoDB?

DynamoDB is a fully managed, NoSQL document and key-value data store.


What is NoSQL?

NoSQL is a term to describe data stores that trade full ACID compliance for high availability and scale.

• Atomicity: single row/single item only
• Consistency: eventual consistency
• Isolation: dirty reads
• Durability: data replication on commodity storage


Why NoSQL?

• Dirty Reads?
• Eventual Consistency?
• Single row transactions only?
• Why would anybody trade ACID compliance for this?


NoSQL – Availability and Scale

[Diagram: Traditional SQL scales up a single DB (Primary, with a Secondary); NoSQL scales out across many DB nodes]


Scale Up vs Scale Out

[Chart: Scale-Up compared with Scale-Out; labels: Cost, Complexity]


The CAP Theorem

Network partitions will happen in distributed systems:

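[Diagram: a cluster of DB nodes split by a network partition]

[Diagram: Consistency (C), Availability (A), and Partition Tolerance (P), with the pairwise combinations CA, AP, and CP]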


Why NoSQL?

• Horizontal scaling allows for infinite scalability
• Cheaper to scale out than to scale up
• Full consistency or availability that can survive a network partition
• Full ACID compliance is often not needed


What is DynamoDB?

DynamoDB is a fully managed, NoSQL document and key-value data store.


What is a Managed Service?

• A managed service is a web service in which consumers of the service never need to interact directly with the underlying compute, storage, and network resources.


Why use a Managed Service?


DynamoDB is a Managed Service

• AWS runs all the database infrastructure for you!
• All the benefits and none of the operational overhead of running a distributed system:
  • Infinitely scalable read and write I/O
  • High availability within a region
  • Data durably stored in 3 availability zones
  • Cross-region replication
  • Easily export data to S3
  • Triggers using Lambda functions
  • Tight integration with Kinesis, Lambda, EMR, and Redshift
  • Pay only for what you use, when you need it


DynamoDB Fundamentals


DynamoDB Table

[Diagram: a Table made up of Items, each with Attributes]

Hash Key (mandatory)
• Key-value access pattern
• Determines data distribution

Range Key (optional)
• Model 1:N relationships
• Enables rich query capabilities:
  • All items for a hash key
  • ==, <, >, >=, <=
  • "begins with"
  • "between"
  • sorted results
  • counts
  • top/bottom N values
  • paged responses
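The range-key query capabilities listed above can be pictured with a small in-memory sketch. This is only an illustration of why sorted range keys make these queries cheap; it is not the DynamoDB API. The function names and the date-style range keys are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Range keys for ONE hash key, kept in sorted order (hypothetical order dates).
RANGE_KEYS = ["2015-05-30", "2015-06-02", "2015-06-17", "2015-07-01", "2015-07-30"]

def between(keys, low, high):
    """Range keys k with low <= k <= high, located by binary search."""
    return keys[bisect_left(keys, low):bisect_right(keys, high)]

def begins_with(keys, prefix):
    """Range keys that start with prefix; they are contiguous in sorted order."""
    start = bisect_left(keys, prefix)
    end = start
    while end < len(keys) and keys[end].startswith(prefix):
        end += 1
    return keys[start:end]

def top_n(keys, n, descending=True):
    """Top (descending) or bottom (ascending) n range keys."""
    return keys[::-1][:n] if descending else keys[:n]

def pages(keys, page_size):
    """Yield results one page at a time, in the spirit of a paged response."""
    for i in range(0, len(keys), page_size):
        yield keys[i:i + page_size]

print(between(RANGE_KEYS, "2015-06-01", "2015-06-30"))
print(begins_with(RANGE_KEYS, "2015-07"))
print(top_n(RANGE_KEYS, 2))
print(list(pages(RANGE_KEYS, 2)))
```

Because the keys are already sorted, "between" and "begins with" only need to find the start of a contiguous slice, and sorted results or top/bottom N come for free.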


Data types

String (S)

Number (N)

Binary (B)

String Set (SS)

Number Set (NS)

Binary Set (BS)

Boolean (BOOL)

Null (NULL)

List (L)

Map (M)

List and Map are used for storing nested JSON documents
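As a rough illustration of how these type descriptors wrap values, here is a minimal sketch that encodes plain Python values into the descriptor form (S, N, B, SS, NS, BS, BOOL, NULL, L, M). The helper names are hypothetical, and sorting set members is only an assumption made here to keep the output deterministic.

```python
import base64
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)

def _encode_set(values):
    """Encode a non-empty, homogeneous set as SS, NS, or BS."""
    if not values:
        raise ValueError("sets must not be empty")
    if all(isinstance(v, str) for v in values):
        return {"SS": sorted(values)}
    if all(isinstance(v, NUMBER_TYPES) and not isinstance(v, bool) for v in values):
        return {"NS": [str(v) for v in sorted(values)]}
    if all(isinstance(v, bytes) for v in values):
        return {"BS": [base64.b64encode(v).decode("ascii") for v in sorted(values)]}
    raise TypeError("set members must all be strings, numbers, or bytes")

def to_attribute_value(value):
    """Wrap a Python value in the matching type descriptor."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):            # test bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, NUMBER_TYPES):
        return {"N": str(value)}           # numbers are carried as strings
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, (set, frozenset)):
        return _encode_set(value)
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")

# A nested document: Map containing a String, a String Set, and a List.
print(to_attribute_value({"Name": "Andy", "Depts": {"Engg", "Ops"}, "Scores": [1, 2.5]}))
```

The List (L) and Map (M) branches recurse, which is what lets a nested JSON document be stored as a single attribute.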


Hash table

• Hash key uniquely identifies an item
• Hash key is used for building an unordered hash index
• Table can be partitioned for scale

[Diagram: key space from 00 to FF, divided into partitions at 54/55 and A9/AA]

Id = 1, Name = Jim: Hash (1) = 7B
Id = 2, Name = Andy, Dept = Engg: Hash (2) = 48
Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD
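The mapping from a hash value to a partition can be sketched as a lookup in the divided key space. The boundaries below are the ones shown on the slide (00 to 54, 55 to A9, AA to FF); `toy_hash` is a hypothetical stand-in and is not DynamoDB's actual hash function.

```python
from bisect import bisect_left
import hashlib

# Inclusive upper bound of each partition's slice of the one-byte key space.
PARTITION_UPPER_BOUNDS = [0x54, 0xA9, 0xFF]

def partition_for(hash_value, upper_bounds=PARTITION_UPPER_BOUNDS):
    """Return the 1-based partition whose key-space slice contains hash_value."""
    if not 0 <= hash_value <= upper_bounds[-1]:
        raise ValueError("hash value outside the key space")
    return bisect_left(upper_bounds, hash_value) + 1

def toy_hash(key):
    """Hypothetical hash: first byte of MD5 of the key (illustration only)."""
    return hashlib.md5(str(key).encode("utf-8")).digest()[0]

# Hash values taken from the slide's example items.
for item_id, hash_value in [(1, 0x7B), (2, 0x48), (3, 0xCD)]:
    print(f"Id = {item_id}: hash {hash_value:02X} -> partition {partition_for(hash_value)}")

# Any other key lands somewhere in the same key space.
print(partition_for(toy_hash("some-other-key")))
```

With the slide's values, the three items land in three different partitions, which is the point of hashing the key: items spread across the key space regardless of their Id order.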


Partitions are three-way replicated

[Diagram: Partition 1, Partition 2, ... Partition N, each stored on Replica 1, Replica 2, and Replica 3. Each example item (Id = 1, Name = Jim; Id = 2, Name = Andy, Dept = Engg; Id = 3, Name = Kim, Dept = Ops) appears once in every replica.]


Hash-range table

• Hash key and range key together uniquely identify an item.
• Within the unordered hash index, data is sorted by the range key.
• No limit on the number of items (∞) per hash key.
  • Unless you have local secondary indexes

[Diagram: key space from 00:0 to FF:∞, divided into three partitions (00:0 to 54:∞, 55 to A9:∞, AA to FF:∞)]

Partition 1, Hash (2) = 48
  Customer# = 2, Order# = 10, Item = Pen
  Customer# = 2, Order# = 11, Item = Shoes

Partition 2, Hash (1) = 7B
  Customer# = 1, Order# = 10, Item = Toy
  Customer# = 1, Order# = 11, Item = Boots

Partition 3, Hash (3) = CD
  Customer# = 3, Order# = 10, Item = Book
  Customer# = 3, Order# = 11, Item = Paper
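A toy model of a hash-range table may help make the two rules concrete: the hash key plus range key identify exactly one item, and items under one hash key stay sorted by range key. The class and method names are hypothetical, and this in-memory sketch ignores partitions and replication entirely.

```python
from bisect import bisect_left
from collections import defaultdict

class HashRangeTable:
    """Toy hash-range table: hash key -> items kept sorted by range key."""

    def __init__(self):
        self._collections = defaultdict(list)   # hash key -> [(range key, item), ...]

    def put_item(self, hash_key, range_key, item):
        rows = self._collections[hash_key]
        i = bisect_left([rk for rk, _ in rows], range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)          # same composite key: replace the item
        else:
            rows.insert(i, (range_key, item))    # new key: insert in sorted position

    def query(self, hash_key):
        """All items for a hash key, sorted by range key."""
        return [item for _, item in self._collections.get(hash_key, [])]

table = HashRangeTable()
table.put_item(2, 11, "Shoes")     # Customer# = 2, Order# = 11
table.put_item(2, 10, "Pen")       # written out of order, stored in order
table.put_item(1, 10, "Toy")
print(table.query(2))
```

Writing the same Customer#/Order# pair again replaces the existing item rather than adding a second one.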


Local Secondary Index (LSI)

alternate range key + same hash key

index and table data is co-located (same partition)

10 GB max per hash key, i.e. LSIs limit the # of range keys!


Global Secondary Index

any attribute indexed as new hash and/or range key

RCUs/WCUs provisioned separately for GSIs

Online indexing
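One way to picture the difference between the two index types is to regroup the same items under different keys. The sketch below reuses the order data from the hash-range table example; `build_index` is a hypothetical helper, and real indexes are maintained by the service rather than rebuilt like this.

```python
from collections import defaultdict

# Table: hash key Customer#, range key Order#.
ORDERS = [
    {"Customer#": 1, "Order#": 10, "Item": "Toy"},
    {"Customer#": 1, "Order#": 11, "Item": "Boots"},
    {"Customer#": 2, "Order#": 10, "Item": "Pen"},
    {"Customer#": 2, "Order#": 11, "Item": "Shoes"},
    {"Customer#": 3, "Order#": 10, "Item": "Book"},
    {"Customer#": 3, "Order#": 11, "Item": "Paper"},
]

def build_index(items, hash_attr, range_attr):
    """Group items by hash_attr and sort each group by range_attr."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[range_attr])
    return dict(index)

# LSI-style view: same hash key as the table, alternate range key (Item).
lsi = build_index(ORDERS, "Customer#", "Item")
# GSI-style view: a different attribute (Order#) becomes the hash key.
gsi = build_index(ORDERS, "Order#", "Customer#")

print([o["Item"] for o in lsi[1]])   # customer 1's orders, sorted by Item
print([o["Item"] for o in gsi[10]])  # every order numbered 10, across customers
```

The LSI-style view never crosses a hash key, which is why it can live in the same partition as the table data; the GSI-style view regroups items across hash keys.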


LSI or GSI?

LSI can be modeled as a GSI

If data size in an item collection > 10 GB, use GSI

If eventual consistency is okay for your scenario, use GSI!


DynamoDB API

Table API
• CreateTable
• UpdateTable
• DeleteTable
• DescribeTable
• ListTables

Item API
• PutItem
• UpdateItem
• DeleteItem
• BatchWriteItem
• GetItem
• Query
• Scan
• BatchGetItem

Stream API (New)
• ListStreams
• DescribeStream
• GetShardIterator
• GetRecords


DynamoDB Streams and AWS Lambda
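As a minimal sketch of what a Lambda function attached to a stream might look like, the handler below counts the change types in one batch of records. The slide does not show any code, so the event shape used here (a "Records" list whose entries carry "eventName" and a "dynamodb" section) and the sample event are assumptions for illustration.

```python
def handler(event, context):
    """Count the INSERT, MODIFY, and REMOVE records in one batch."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record.get("eventName")
        if name in counts:
            counts[name] += 1
    return counts

# Hypothetical sample batch: one item written, one item deleted.
SAMPLE_EVENT = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Id": {"N": "1"}},
                "NewImage": {"Id": {"N": "1"}, "Name": {"S": "Jim"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Id": {"N": "2"}}},
        },
    ]
}

print(handler(SAMPLE_EVENT, None))
```

A real trigger would act on each record (for example, updating an aggregate or forwarding the change) instead of only counting.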


Emerging Architecture Pattern


Throughput

Provisioned at the table level
• Write capacity units (WCUs) are measured in 1 KB per second
• Read capacity units (RCUs) are measured in 4 KB per second
  • RCUs measure strongly consistent reads
  • Eventually consistent reads cost 1/2 of consistent reads

Read and write throughput limits are independent

[Diagram labels: WCU, RCU]
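The capacity unit definitions above translate into simple arithmetic. The sketch below estimates the units a workload needs; the function names are hypothetical, and rounding each item up to the next 1 KB (writes) or 4 KB (reads) is an assumption added here that the slide does not spell out.

```python
import math

WRITE_UNIT_BYTES = 1024       # 1 KB per write capacity unit
READ_UNIT_BYTES = 4 * 1024    # 4 KB per read capacity unit

def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed to write items of this size at this rate."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES) * writes_per_second

def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed; eventually consistent reads cost half as much."""
    units = math.ceil(item_size_bytes / READ_UNIT_BYTES) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)

# 1,500-byte items written 10 times per second: 2 units each.
print(write_capacity_units(1500, 10))
# 3,000-byte items read 100 times per second, strongly vs eventually consistent.
print(read_capacity_units(3000, 100))
print(read_capacity_units(3000, 100, strongly_consistent=False))
```

Because reads and writes are provisioned independently, the two numbers are sized separately for each table.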


Partitioning example

Table size = 8 GB, RCUs = 5000, WCUs = 500

# of partitions (IO capacity) = 5000/3000 RCU + 500/1000 WCU = 2.17

# of partitions (storage) = 8/10 GB = 0.8

# of partitions = ceiling(max(2.17, 0.8)) = 3

RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB

RCUs and WCUs are uniformly spread across partitions
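The partition arithmetic in this example can be reproduced with a few lines of code. The constants (3000 RCUs, 1000 WCUs, and 10 GB per partition) are the ones used on the slide; the function names are hypothetical.

```python
import math

RCUS_PER_PARTITION = 3000
WCUS_PER_PARTITION = 1000
GB_PER_PARTITION = 10

def partition_count(table_size_gb, rcus, wcus):
    """Partitions needed: the larger of the IO-based and storage-based estimates."""
    by_io = rcus / RCUS_PER_PARTITION + wcus / WCUS_PER_PARTITION
    by_storage = table_size_gb / GB_PER_PARTITION
    return math.ceil(max(by_io, by_storage))

def per_partition(total, partitions):
    """Share of a table-level quantity that each partition receives."""
    return round(total / partitions, 2)

n = partition_count(table_size_gb=8, rcus=5000, wcus=500)
print(n)                          # number of partitions
print(per_partition(5000, n))     # RCUs per partition
print(per_partition(500, n))      # WCUs per partition
```

Since provisioned throughput is divided evenly, a single heavily used hash key can only draw on its own partition's share, not on the table total.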


Typical Workloads and Use-Cases


DynamoDB table examples

case class CameraRecord(
  cameraId: Int, // hash key
  ownerId: Int,
  subscribers: Set[Int],
  hoursOfRecording: Int,
  ...
)

case class Cuepoint(
  cameraId: Int, // hash key
  timestamp: Long, // range key
  `type`: String, // backticks needed: type is a reserved word in Scala
  ...
)

HashKey   RangeKey   Value
Key       Segment    1234554343254
Key       Segment1   1231231433235


Typical Workloads
• Ad-tech
• IoT
• Gaming
• Web Analytics
• Mobile Applications
• Large Scale Websites

…And much more!


Demo
