Chris Munns Solutions Architect Amazon Web Services Build High-Scale Applications with Amazon DynamoDB

AWS Webcast - Build high-scale applications with Amazon DynamoDB

DESCRIPTION

Review this webinar to learn about Amazon DynamoDB. DynamoDB is a highly scalable, fully managed NoSQL database service. Built for consistent single-digit millisecond latency and high availability, DynamoDB is a great fit for gaming, ad-tech, mobile, and many other applications.

Reasons to review:
• Learn the fundamentals of DynamoDB
• Understand how to design for common access patterns
• Discover best practices
• Hear how others use DynamoDB to build their business

Who should review:
• Software Developers
• Database Administrators
• Solution Architects
• Technical Decision Makers

Page 1: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Chris Munns Solutions Architect

Amazon Web Services

Build High-Scale Applications with

Amazon DynamoDB

Page 2: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Traditional Database Architecture

App/Web Tier

Client Tier

Database Tier

Page 3: AWS Webcast - Build high-scale applications with Amazon DynamoDB

• key-value access
• complex queries
• transactions
• analytics

One Database for All Workloads

App/Web Tier

Client Tier

RDBMS

Page 4: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Cloud Data Tier Architecture

App/Web Tier

Client Tier

Data Tier

Search Cache Blob Store

RDBMS NoSQL Data Warehouse

Page 5: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Workload Driven Data Store Selection

Data Tier

Search Cache Blob Store

RDBMS NoSQL Data Warehouse

logging analytics

key/value simple query

rich search hot reads complex queries and transactions

Page 6: AWS Webcast - Build high-scale applications with Amazon DynamoDB

AWS Services for the Data Tier

Data Tier

Amazon DynamoDB

Amazon RDS

Amazon ElastiCache

Amazon S3

Amazon Redshift

Amazon CloudSearch

logging analytics

key/value simple query

rich search hot reads complex queries and transactions

Page 7: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Relational Era @ Amazon.com

RDBMS = Default Choice
• An Amazon.com page is composed of responses from thousands of independent services
• Query patterns differ from service to service
  - The catalog service is usually heavy on key-value access
  - The ordering service is very write intensive (key-value)
  - Catalog search has a different query pattern

RDBMS: Poor Availability, Limited Scalability, High Cost

Page 8: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Distributed Era @ Amazon.com

Dynamo = NoSQL Technology
• Replicated DHT with consistency management
• Consistent hashing
• Optimistic replication
• “Sloppy quorum”
• Anti-entropy mechanisms
• Object versioning

• lack of strong consistency
• every engineer needs to learn distributed systems
• operational complexity

Page 9: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB = NoSQL Cloud Service

Cloud Era @ Amazon.com

Non-Relational

Fast & Predictable Performance

Seamless Scalability

Easy Administration

Page 10: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Fundamentals

Page 11: AWS Webcast - Build high-scale applications with Amazon DynamoDB

database service =
• automated operations
• predictable performance
• fast development
• always durable
• low latency
• cost effective

Page 12: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Massive and Seamless Scale

table: partitions 1 .. N

• DynamoDB automatically partitions data by the hash key
  - Hash key spreads data (& workload) across partitions
• Auto-partitioning occurs with:
  - Data set size growth
  - Provisioned capacity increases

large number of unique hash keys + uniform distribution of workload across hash keys = apps ready to scale

Page 13: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Automated Operations

Making life easier for developers…

• Developers are freed from:
  - Performance tuning (latency)
  - Automatic 3-way multi-AZ replication
  - Scalability (and scaling operations)
  - Security inspections, patches, upgrades
  - Software upgrades, patches
  - Automatic hardware failover
  - Improving the underlying hardware
  - …and lots of other stuff

Page 14: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Predictable Performance

Provisioned Throughput
• Request-based capacity provisioning model
• Throughput is declared and updated via the API or the console
  - CreateTable (foo, reads/sec = 100, writes/sec = 150)
  - UpdateTable (foo, reads/sec = 10000, writes/sec = 4500)
• DynamoDB handles the rest
  - Capacity is reserved and available when needed
  - Scaling up triggers repartitioning and reallocation
  - No impact to performance or availability

Page 15: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Durable At Scale

WRITES
• Continuously replicated to 3 AZs
• Quorum acknowledgment
• Persisted to disk (custom SSD)

READS
• Strongly or eventually consistent

No trade-off in latency

Page 16: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Low Latency At Scale

WRITES
• Continuously replicated to 3 AZs
• Quorum acknowledgment
• Persisted to disk (custom SSD)

READS
• Strongly or eventually consistent

No trade-off in latency

Page 17: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Customers

Page 18: AWS Webcast - Build high-scale applications with Amazon DynamoDB

“DynamoDB has scaled effortlessly to match our company's explosive growth, doesn't burden our operations staff, and integrates beautifully with our other AWS assets”.

“I love how DynamoDB enables us to provision our desired throughput, and achieve low latency and seamless scale, even with our constantly growing workloads.”

Page 19: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Fast Development

Weatherbug mobile app
• Lightning detection & alerting for 40M users/month
• Developed and tested in weeks, at “1/20th of the cost of the traditional DB approach”

Super Bowl promotion
• Millions of interactions over a relatively short period of time
• Built the app in 3 days, from design to production-ready

Page 20: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Cost Effective

“Our previous NoSQL database required almost a full time administrator to run. Now AWS takes care of it.”

“Being optimized at AdRoll means we spend more every month on snacks than we do on DynamoDB – and almost nothing on an ops team”

Save Money Reduce Effort

Page 21: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Primitives

Page 22: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Concepts

table

Page 23: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Concepts

table

items

Page 24: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Concepts

attributes

items

table

schema-less: schema is defined per attribute

Page 25: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Concepts

attributes

items

table

scalar data types
• number, string, and binary

multi-valued types
• string set, number set, and binary set

Page 26: AWS Webcast - Build high-scale applications with Amazon DynamoDB

DynamoDB Concepts

hash

hash keys
• mandatory for all items in a table
• key-value access pattern

PutItem, UpdateItem, DeleteItem, BatchWriteItem

GetItem, BatchGetItem

Page 27: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Hash = Distribution Key

partition 1 .. N

hash keys
• mandatory for all items in a table
• key-value access pattern
• determines data distribution
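To make the link between hash keys and data distribution concrete, here is a minimal sketch that maps keys to a fixed number of partitions and counts where requests land. The MD5-modulo scheme, the partition count of 8, and the function names are assumptions chosen for illustration; the slides do not describe DynamoDB's internal hashing.

```python
import hashlib
from collections import Counter


def partition_for(hash_key: str, num_partitions: int) -> int:
    """Map a hash key to a partition index (illustrative scheme, not DynamoDB's)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def requests_per_partition(keys, num_partitions):
    """Count how many requests land on each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Many distinct user IDs: requests spread over every partition.
user_traffic = [f"user-{i}" for i in range(10_000)]
# Only three distinct status codes: at most three partitions ever see traffic.
status_traffic = ["OK", "ERROR", "PENDING"] * 3_334

print(requests_per_partition(user_traffic, 8))
print(requests_per_partition(status_traffic, 8))
```

With many unique keys every partition receives a similar share of the requests; with three keys, most partitions stay idle no matter how much capacity is provisioned.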

Page 28: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Hash = Distribution Key

large number of unique hash keys + uniform distribution of workload across hash keys = optimal schema design

Page 29: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Range = Query

range

hash

range keys
• model 1:N relationships
• enable rich query capabilities
• composite primary key

• all items for a hash key
• ==, <, >, >=, <=
• “begins with”
• “between”
• sorted results
• counts
• top / bottom N values
• paged responses
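The range-key query operations listed above can be illustrated with a small in-memory model: items under one hash key are kept sorted by range key, so "between" and "begins with" become slices of a sorted list. The `RangeTable` class and its method names are hypothetical and are not part of any AWS SDK; this is a sketch of the access pattern only.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict


class RangeTable:
    """Toy hash + range table; models query semantics, not the service."""

    def __init__(self):
        self._range_keys = defaultdict(list)  # hash key -> sorted range keys
        self._items = {}                      # (hash key, range key) -> item

    def put(self, hash_key, range_key, item):
        if (hash_key, range_key) not in self._items:
            insort(self._range_keys[hash_key], range_key)
        self._items[(hash_key, range_key)] = item

    def query_between(self, hash_key, low, high, limit=None):
        """Items whose range key lies in [low, high], in sorted order."""
        keys = self._range_keys.get(hash_key, [])
        selected = keys[bisect_left(keys, low):bisect_right(keys, high)]
        if limit is not None:
            selected = selected[:limit]  # "top N" by range key
        return [self._items[(hash_key, k)] for k in selected]

    def query_begins_with(self, hash_key, prefix):
        """Items whose (string) range key starts with the given prefix."""
        keys = self._range_keys.get(hash_key, [])
        result = []
        for k in keys[bisect_left(keys, prefix):]:
            if not k.startswith(prefix):
                break
            result.append(self._items[(hash_key, k)])
        return result


events = RangeTable()
events.put("device-1", "2014-01-01", {"reading": 10})
events.put("device-1", "2014-02-01", {"reading": 30})
events.put("device-1", "2014-01-05", {"reading": 20})
print(events.query_between("device-1", "2014-01-01", "2014-01-31"))
print(events.query_begins_with("device-1", "2014-02"))
```

Because each lookup touches only the sorted collection under a single hash key, the cost of a query depends on the size of the result rather than the size of the table.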

Page 30: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Index Options

local secondary indexes (LSI)
• alternate range key + same hash key
• index and table data is co-located (same partition)

Page 31: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Projected Attributes

KEYS_ONLY INCLUDE ALL

Page 32: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Projected Attributes

KEYS_ONLY INCLUDE ALL

Page 33: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Projected Attributes

KEYS_ONLY INCLUDE ALL

Page 34: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Index Options

global secondary indexes (GSI)

any attribute indexed as new hash or range key

Same projected attribute options

Page 35: AWS Webcast - Build high-scale applications with Amazon DynamoDB

• Currently 13 operations in total

Simple API

Manage Tables

• CreateTable

• UpdateTable

• DeleteTable

• DescribeTable

• ListTables

Read and Write Items

• PutItem

• GetItem

• UpdateItem

• DeleteItem

Read and Write Multiple Items

• BatchGetItem

• BatchWriteItem

• Query

• Scan

Page 36: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Data types

• Scalar data types
  - String (S): Unicode with UTF-8 binary encoding
  - Number (N): up to 38 digits of precision, between 10^-128 and 10^+126
    - Variable-width encoding that can occupy up to 21 bytes
• Multi-valued types
  - String Set (SS)
  - Number Set (NS)
  - Not ordered

Page 37: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Indexing & Partitioning

• Data is indexed by the primary key
  - Single Hash Key
    - Targeted towards object persistence
  - Hash-Range composite Key
    - Sorted collection within hash bucket
    - Can store series of events for a given entity
• Automatic partitioning
  - Leading hash key spreads data & workload across partitions
  - Traffic is scaled out and parallelized

Page 38: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Other Features

• Consistent Reads
  - Inventory, shopping cart applications
• Atomic Counters
  - Increment and return new value in same operation
• Conditional Writes
  - Expected value before write; fails on mismatch
  - “state machine” use cases
• Sparse Indexes
  - Ideal for sorted lists; fast access to a subset of items
  - Popular: identify recently updated items; top lists; leaderboards
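The semantics of conditional writes and atomic counters can be sketched with a small in-memory stand-in. `InMemoryTable` and `ConditionalCheckFailed` are hypothetical names for this illustration; in the real service the check and the write happen atomically on the server, which this single-threaded model does not attempt to reproduce.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored item does not match the expected values."""


class InMemoryTable:
    """Toy table that models conditional-write and counter semantics."""

    def __init__(self):
        self._items = {}  # primary key -> dict of attributes

    def get(self, key):
        return dict(self._items.get(key, {}))

    def put(self, key, item, expected=None):
        """Write the item only if every expected attribute value matches."""
        current = self._items.get(key, {})
        for attr, value in (expected or {}).items():
            if current.get(attr) != value:
                raise ConditionalCheckFailed(
                    f"{attr!r} is {current.get(attr)!r}, expected {value!r}"
                )
        self._items[key] = dict(item)

    def increment(self, key, attr, delta=1):
        """Add delta to a numeric attribute and return the new value."""
        item = self._items.setdefault(key, {})
        item[attr] = item.get(attr, 0) + delta
        return item[attr]


# "State machine" use case: an order may move to SHIPPED only from PENDING.
orders = InMemoryTable()
orders.put("order-1", {"status": "PENDING"})
orders.put("order-1", {"status": "SHIPPED"}, expected={"status": "PENDING"})
try:
    orders.put("order-1", {"status": "SHIPPED"}, expected={"status": "PENDING"})
except ConditionalCheckFailed as err:
    print("write rejected:", err)

views = InMemoryTable()
print(views.increment("page-1", "views"))
```

The second transition is rejected because the stored status no longer matches the expected value, which is what keeps two competing writers from both "winning" the same state change.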

Page 39: AWS Webcast - Build high-scale applications with Amazon DynamoDB

How to use DynamoDB?

• Use the API, SDK, CLI, or Management Console to create tables
• Use the AWS SDK to interact with DynamoDB
  - PutItem, UpdateItem, DeleteItem, Query, Scan, etc.

    $client = $aws->get("dynamodb");
    $tableName = "ProductCatalog";

    // Write one item and ask for the consumed capacity in the response.
    $response = $client->putItem(array(
        "TableName" => $tableName,
        "Item" => $client->formatAttributes(array(
            "Id" => 120,
            "Title" => "Book 120 Title",
            "ISBN" => "120-1111111111",
            "Authors" => array("Author12", "Author22"),
            "Price" => 20,
            "Category" => "Book",
            "Dimensions" => "8.5x11.0x.75",
            "InPublication" => 0,
        )),
        "ReturnConsumedCapacity" => 'TOTAL'
    ));

Interaction: Libraries and SDKs, Web Console, Command Line

Figure: Writing an item to a table via the PHP SDK

Page 40: AWS Webcast - Build high-scale applications with Amazon DynamoDB

How to use DynamoDB?

• Higher-Level Programming Interfaces
  - Object Persistence Model for .NET & Java
  - Helper Classes for .NET
  - Transaction Library for Java
• Local DynamoDB available for development and testing
• Dynamic DynamoDB for auto-scaling
• Many community-contributed tools/frameworks

    [DynamoDBTable("ProductCatalog")]
    public class Book
    {
        [DynamoDBHashKey]
        public int Id { get; set; }

        public string Title { get; set; }

        public int ISBN { get; set; }

        [DynamoDBProperty("Authors")]
        public List<string> BookAuthors { get; set; }

        [DynamoDBIgnore]
        public string CoverPage { get; set; }
    }

Figure: .NET class using object persistence model

Page 41: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Use Libraries and Tools

Transactions
• Atomic transactions across multiple items & tables
• Tracks status of ongoing transactions via two tables
  1. Transactions
  2. Pre-transaction snapshots of modified items

Geolocation
• Add location awareness to mobile applications
• Find Yourself – sample app

https://github.com/awslabs

Page 42: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Autoscaling with Dynamic DynamoDB

• Third-party library for automating scaling decisions
• Scale up for service levels, scale down for cost
• CloudFormation template for fast deployment

Page 43: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Develop and Test Locally – DynamoDB Local

• Disconnected development with full API support
  - No network
  - No usage costs

Note! DynamoDB Local does not have a durability or availability SLA

m2.4xlarge

DynamoDB Local

do this instead!

Page 44: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Develop and Test Locally – DynamoDB Local

Some minor differences from Amazon DynamoDB
• DynamoDB Local ignores your provisioned throughput settings
  - The values that you specify when you call CreateTable and UpdateTable have no effect
• DynamoDB Local does not throttle read or write activity
• The values that you supply for the AWS access key and the Region are only used to name the database file
• Your AWS secret key is ignored but must be specified
  - Using a dummy string of characters is recommended

Page 45: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Monitoring

• Reports CloudWatch metrics
  - Latency
  - Consumed throughput
  - Errors
  - Throttling
• Alarms can be used to dynamically size throughput
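As a rough sketch of how an alarm-driven resize decision might be expressed, the function below takes the provisioned and consumed throughput plus a throttle count and returns a new provisioned value. The thresholds (80% and 30%), the 1.5x step, and the function name are all assumptions for illustration; they are not taken from CloudWatch, Dynamic DynamoDB, or the slides.

```python
import math


def next_capacity(provisioned, consumed, throttled,
                  upper=0.8, lower=0.3, step=1.5, minimum=1):
    """Return the capacity to provision next, given recent metrics.

    provisioned: currently provisioned units per second
    consumed:    average consumed units per second over the alarm period
    throttled:   number of throttled requests over the alarm period
    All thresholds are hypothetical tuning knobs.
    """
    utilization = consumed / provisioned
    if throttled > 0 or utilization > upper:
        # Scale up for service levels.
        return math.ceil(provisioned * step)
    if utilization < lower:
        # Scale down for cost, but never below the floor.
        return max(minimum, math.ceil(provisioned / step))
    return provisioned


print(next_capacity(provisioned=100, consumed=90, throttled=0))
print(next_capacity(provisioned=100, consumed=10, throttled=0))
```

The returned value would then be passed to an UpdateTable call; a production policy would also need to rate-limit changes so the table is not resized on every metric sample.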

Page 46: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Analytics

• DynamoDB can be used for large data ingest
• Redshift can directly load data from DynamoDB (COPY)
• EMR can directly read from DynamoDB by using Hive

    CREATE EXTERNAL TABLE pc_dynamodb (
        [attributes]
    )
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    TBLPROPERTIES ([properties]);

Diagram labels: Amazon S3, Redshift, EMR (Hive), DynamoDB, and one external Hive table each for DynamoDB and S3

    CREATE EXTERNAL TABLE pc_s3 (
        [attributes]
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3://myawsbucket1/catalog/';

Page 47: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Pricing

• Provisioned Throughput
  - $0.0065 per hour for every 10 units of Write Capacity (1 unit = 1 write per second for 1 KB items)
  - $0.0065 per hour for every 50 units of Read Capacity (1 unit = 1 consistent read per second for 4 KB items)
• Storage
  - $0.25 per GB-month of storage
• Free tier!
  - 100 MB storage + 50 writes/sec + 10 reads/sec each month
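The pricing figures above can be turned into a quick capacity estimate. This sketch uses only the prices and unit sizes quoted on the slide; rounding item sizes up to the next whole unit and billing in whole blocks of 10 or 50 units are assumptions made for the arithmetic, and the function names are hypothetical.

```python
import math

WRITE_UNIT_KB = 1          # one write unit covers a write of up to 1 KB
READ_UNIT_KB = 4           # one read unit covers a consistent read of up to 4 KB
PRICE_PER_BLOCK_HOUR = 0.0065
WRITE_UNITS_PER_BLOCK = 10
READ_UNITS_PER_BLOCK = 50


def write_units(item_kb, writes_per_sec):
    """Write capacity units needed, rounding item size up (assumption)."""
    return math.ceil(item_kb / WRITE_UNIT_KB) * writes_per_sec


def read_units(item_kb, reads_per_sec):
    """Read capacity units needed for consistent reads, rounding size up."""
    return math.ceil(item_kb / READ_UNIT_KB) * reads_per_sec


def hourly_cost(w_units, r_units):
    """Hourly throughput cost, assuming billing in whole blocks."""
    write_blocks = math.ceil(w_units / WRITE_UNITS_PER_BLOCK)
    read_blocks = math.ceil(r_units / READ_UNITS_PER_BLOCK)
    return (write_blocks + read_blocks) * PRICE_PER_BLOCK_HOUR


# Example workload: 100 writes/sec of 2.5 KB items, 500 reads/sec of 6 KB items.
w = write_units(2.5, 100)
r = read_units(6, 500)
print(w, r, round(hourly_cost(w, r), 4))
```

The example shows why item size matters: a 2.5 KB item consumes three write units per write, so trimming it below 2 KB would cut the write capacity needed by a third.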

Page 48: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Best Practices

Page 49: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Access Pattern Modeling

• Method
  1. Describe the overall use case (maintain context)
  2. Identify the individual access patterns of the use case
  3. Model each access pattern to its own discrete data set
  4. Consolidate data sets into tables and indexes
• Benefits
  - Single table fetch for each query
  - Payloads are minimal for each access

Page 50: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Table Best Practices

• Design for uniform data access across items
  - Partition distribution is based on the hash key
  - Hash key should be well distributed
  - Access frequency should be distributed across different hash keys
• Time Series Pattern
  - Logging
  - Focus only on recent data

Hash key value and efficiency:
• User ID, where the application has many users: Good
• Status code, where there are only a few possible status codes: Bad
• Device ID, where even if there are a lot of devices being tracked, one is by far more popular than all the others: Bad

Page 51: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Item Best Practices

• Use One-to-Many Tables instead of large set attributes
  - Break items up into multiple tables
• Use Multiple Tables to support Varied Access Patterns
  - If you frequently access large items but do not use all attributes, store the smaller, frequently accessed attributes in separate tables
• Compress large attributes
  - Reduces cost of storage and throughput
• Store large attributes in S3
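To show why compressing large attributes reduces throughput cost, the sketch below compresses a long text attribute with zlib and compares the number of 1 KB write units before and after. The helper names and the sample review text are made up for illustration; the compressed bytes would be stored as a binary attribute.

```python
import math
import zlib


def capacity_units(size_bytes, unit_bytes=1024):
    """Number of 1 KB units an attribute of this size would consume."""
    return max(1, math.ceil(size_bytes / unit_bytes))


def compress_attribute(text: str) -> bytes:
    """Compress a text attribute so it can be stored as a binary value."""
    return zlib.compress(text.encode("utf-8"))


def decompress_attribute(blob: bytes) -> str:
    """Restore the original text after reading the binary value back."""
    return zlib.decompress(blob).decode("utf-8")


# Hypothetical large attribute: a long, repetitive product review.
review = "This product works well and arrived on time. " * 200
raw_size = len(review.encode("utf-8"))
blob = compress_attribute(review)

print(raw_size, capacity_units(raw_size))
print(len(blob), capacity_units(len(blob)))
```

The trade-off is that compressed attributes cannot be filtered or indexed by their contents, so this suits payload-style attributes that are only ever read whole.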

Page 52: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Query and Scan Best Practices

• Avoid sudden bursts of read activity
  - Reduce page size of Scans
  - Isolate scan operations; create separate tables and write to both:
    - Mission-Critical Table
    - Shadow Table
• Take advantage of parallel scans
  - Sequential scans take longer
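The idea behind parallel scans can be sketched without the service: split the key space into segments and let one worker scan each segment. The CRC32-based segment assignment, the in-memory dictionary standing in for a table, and the function names are assumptions for illustration; the real Scan operation exposes the same idea through its `Segment` and `TotalSegments` parameters.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def segment_of(key: str, total_segments: int) -> int:
    """Assign a key to a segment (illustrative scheme only)."""
    return zlib.crc32(key.encode("utf-8")) % total_segments


def scan_segment(table, segment, total_segments, predicate):
    """Scan one segment of the table and return the matching items."""
    return [
        item
        for key, item in table.items()
        if segment_of(key, total_segments) == segment and predicate(item)
    ]


def parallel_scan(table, total_segments, predicate):
    """Scan all segments concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, table, s, total_segments, predicate)
            for s in range(total_segments)
        ]
        results = []
        for future in futures:
            results.extend(future.result())
    return results


catalog = {f"item-{i}": {"id": i, "price": i} for i in range(1000)}
expensive = parallel_scan(catalog, 4, lambda item: item["price"] >= 900)
print(len(expensive))
```

Every item belongs to exactly one segment, so the merged result matches what a single sequential scan would return, though not necessarily in the same order.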

Page 53: AWS Webcast - Build high-scale applications with Amazon DynamoDB

Quick Poll + Questions?

Thanks for joining!