Agenda
• Why migrate to NoSQL (not only “green field”)
• What is a Table
• What is a Schema
• What about Stored Procedures
• Transactions?
• Top DynamoDB Mistakes or Optimization Opportunities
Learn and Be Curious
Leaders are never done learning and always seek to improve themselves. They are curious about new possibilities and act to explore them.
Supporting Amazon.com's journey to migrate from RDBMS to NoSQL
What is your first (DB) language?
• RDBMS (ACID, SQL and Stored Procedures)
• MongoDB (Document Store)
• HBase (Column Families)
• Redis (Advanced Data Types such as Sorted Sets)
Small Partition Units
Hash Key Range Table Partitions
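Hash key range: 0000 to FFFF, with partition boundaries at 5333 and A666
Per partition: 10GB, 1000 1KB Writes / Second, 3000 4KB Reads / Second

Items (hash key, range key):
5555 A, 5555 B, 5555 C, 5555 D, 5555 E
6666 A, 6666 B
7777
8484
8888 A, 8888 B, 8888 C, 8888 D
9999 A, 9999 B

Operations:
Put 8888 E
Update 8888 B
Get 8888 E
Get Range 8888 A, 8888 B, 8888 C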
Distributed Hashtable
• Hash/Partition Key for O(1) lookup
• (Optional) Range/Sort Key for O(log n) lookup
• Automatic repartitioning on Size and Read/Write Capacity
10GB
Write: 1K * 1KB IOPS
Read: 3K * 4KB IOPS
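The lookup model above can be sketched in a few lines of Python. This is a toy in-memory model, not DynamoDB's implementation: the `ToyTable` class, the SHA-256 based 16-bit hash and the way items are stored are all assumptions made for illustration. Only the three range boundaries come from the slide.

```python
import bisect
import hashlib

# Upper bounds of three hash ranges over 0000..FFFF, as drawn on the slide.
BOUNDARIES = [0x5333, 0xA666, 0xFFFF]


def partition_for(hash_key: str) -> int:
    """Pick a partition from a 16-bit hash of the key (hash choice is an assumption)."""
    digest = hashlib.sha256(hash_key.encode("utf-8")).digest()
    h = int.from_bytes(digest[:2], "big")  # value in 0x0000..0xFFFF
    return bisect.bisect_left(BOUNDARIES, h)


class ToyTable:
    """Hypothetical in-memory table: hash key picks the partition, range key is kept sorted."""

    def __init__(self):
        # One dict per partition: hash key -> (sorted range keys, values by range key)
        self.partitions = [{} for _ in BOUNDARIES]

    def _collection(self, hash_key):
        partition = self.partitions[partition_for(hash_key)]  # O(1) hash lookup
        return partition.setdefault(hash_key, ([], {}))

    def put(self, hash_key, range_key, value):
        keys, values = self._collection(hash_key)
        if range_key not in values:
            bisect.insort(keys, range_key)  # keep range keys in sorted order
        values[range_key] = value

    def get(self, hash_key, range_key):
        _, values = self._collection(hash_key)
        return values.get(range_key)

    def get_range(self, hash_key, low, high):
        keys, values = self._collection(hash_key)
        start = bisect.bisect_left(keys, low)   # binary search, O(log n)
        end = bisect.bisect_right(keys, high)
        return [(k, values[k]) for k in keys[start:end]]


table = ToyTable()
for range_key in "ABCD":
    table.put("8888", range_key, f"item {range_key}")
table.put("8888", "E", "item E")                  # Put 8888 E
print(table.get("8888", "E"))                     # Get 8888 E
print(table.get_range("8888", "A", "C"))          # Get Range 8888 A..C
```

All items with the same hash key land in the same partition, which is why a range query only ever touches one partition.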
Schema?
• Schema on Write - RDBMS
• Schema for Read - DynamoDB
• Schema on Read - Hadoop
Define an attribute in the schema ONLY if you need to LOOKUP with this attribute (not scan)
Hash Key (+ Range Key)
Images Table
User   Image   Date        Link
Bob    aed4c   2013-10-01  s3://…
Bob    cf2e2   2013-09-05  s3://…
Bob    f93bae  2013-10-08  s3://…
Alice  ca61a   2013-09-12  s3://…
Table
Lookup Key
Range Key for Uniqueness
The main key is used to LOOKUP an item
Flexible Attributes
Images Table
User   Image   Date        Link    Size KB
Bob    aed4c   2013-10-01  s3://…  124
Bob    cf2e2   2013-09-05  s3://…  251
Bob    f93bae  2013-10-08  s3://…  98
Alice  ca61a   2013-09-12  s3://…  155
Table
New Attribute
Most attributes are not needed for LOOKUP
Local Secondary Index
Images Table
Table
User   Image   Date        Link
Bob    aed4c   2013-10-01  s3://…
Bob    cf2e2   2013-09-05  s3://…
Bob    f93bae  2013-10-08  s3://…
Alice  ca61a   2013-09-12  s3://…

ByDate Local Secondary Index
User   Date        Image
Bob    2013-09-05  cf2e2
Bob    2013-10-01  aed4c
Bob    2013-10-08  f93bae
Alice  2013-09-12  ca61a
Local Secondary Index on Date
An alternative sort for a hash key
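A rough way to picture "an alternative sort for a hash key" is to group the same items by the same hash attribute twice, each time sorted on a different attribute. The sketch below uses the Images rows from the slide (Link omitted). The `build_index` helper is hypothetical and only mimics the ordering, not how DynamoDB stores or maintains an index.

```python
from collections import defaultdict

images = [
    {"User": "Bob", "Image": "aed4c", "Date": "2013-10-01"},
    {"User": "Bob", "Image": "cf2e2", "Date": "2013-09-05"},
    {"User": "Bob", "Image": "f93bae", "Date": "2013-10-08"},
    {"User": "Alice", "Image": "ca61a", "Date": "2013-09-12"},
]


def build_index(items, hash_attr, sort_attr):
    """Group items by hash_attr and sort each group by sort_attr (illustrative helper)."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_attr]].append(item)
    for rows in index.values():
        rows.sort(key=lambda row: row[sort_attr])
    return index


table = build_index(images, "User", "Image")    # table order: User, then Image
by_date = build_index(images, "User", "Date")   # LSI order: same User, sorted by Date

print([row["Image"] for row in table["Bob"]])
print([row["Image"] for row in by_date["Bob"]])
```

The hash key stays the same in both views, which is what makes the index "local": it only re-sorts items that already share a hash key.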
To project or not to project?
Images Table
Table
User   Image   Date        Link
Bob    aed4c   2013-10-01  s3://…
Bob    cf2e2   2013-09-05  s3://…
Bob    f93bae  2013-10-08  s3://…
Alice  ca61a   2013-09-12  s3://…

ByDate Local Secondary Index
User   Date        Image
Bob    2013-09-05  cf2e2
Bob    2013-10-01  aed4c
Bob    2013-10-08  f93bae
Alice  2013-09-12  ca61a

Additional attributes can be “fetched” from the table (“pay” on read)
Or projected into the index (“pay” on write):
User   Date        Image   Link
Bob    2013-09-05  cf2e2   s3://…
Bob    2013-10-01  aed4c   s3://…
Bob    2013-10-08  f93bae  s3://…
Alice  2013-09-12  ca61a   s3://…
Item Collection Size < 10GB
Images Table
Table
User   Image   Date        Link
Bob    aed4c   2013-10-01  s3://…
Bob    cf2e2   2013-09-05  s3://…
Bob    f93bae  2013-10-08  s3://…
Alice  ca61a   2013-09-12  s3://…

ByDate Local Secondary Index
User   Date        Image
Bob    2013-09-05  cf2e2
Bob    2013-10-01  aed4c
Bob    2013-10-08  f93bae
Alice  2013-09-12  ca61a
Item Collection for Hash Key and all its LSI
Monitor for large Item Collection using ReturnItemCollectionMetrics
Up to 5 LSI
Sparse Secondary Index
Images Table
Table
User   Image   Date        Link
Bob    aed4c   2013-10-01  s3://…
Bob    cf2e2   2013-09-05  s3://…
Bob    f93bae
Alice  ca61a   2013-09-12  s3://…

ByDate Local Secondary Index
User   Date        Image
Bob    2013-09-05  cf2e2
Bob    2013-10-01  aed4c
Alice  2013-09-12  ca61a
Remove the Date attribute from an item (leave it unset) to remove the item from the index
If any attribute of the index key is missing, the item won’t be in the index
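The sparse behaviour can be mimicked with a filter: an item enters the index only when every index key attribute is present. The sketch below is an in-memory illustration under that assumption. `build_sparse_index` is a hypothetical helper, and the data is the slide's table with Bob's f93bae row lacking a Date.

```python
from collections import defaultdict

images = [
    {"User": "Bob", "Image": "aed4c", "Date": "2013-10-01"},
    {"User": "Bob", "Image": "cf2e2", "Date": "2013-09-05"},
    {"User": "Bob", "Image": "f93bae"},  # no Date attribute: stays out of the index
    {"User": "Alice", "Image": "ca61a", "Date": "2013-09-12"},
]


def build_sparse_index(items, hash_attr, sort_attr):
    """Index only the items that carry both index key attributes (illustrative helper)."""
    index = defaultdict(list)
    for item in items:
        if hash_attr in item and sort_attr in item:
            index[item[hash_attr]].append(item)
    for rows in index.values():
        rows.sort(key=lambda row: row[sort_attr])
    return index


by_date = build_sparse_index(images, "User", "Date")
indexed_count = sum(len(rows) for rows in by_date.values())
print(indexed_count, "of", len(images), "items are in the index")
```

A sparse index is therefore a cheap way to keep a small, queryable subset of a large table.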
Global Secondary Index
ImageTags Table

Table
Image   User
aed4c   Alice
aed4c   Bob
f93bae  Alice
f93bae  Bob

ByUser Global Secondary Index
User   Image
Bob    aed4c
Bob    f93bae
Alice  aed4c
Alice  f93bae

Query for images tagged Alice
A completely new LOOKUP key
Global Secondary Index R/W Capacity
ImageTags Table

Table
Image   User
aed4c   Alice
aed4c   Bob
f93bae  Alice
f93bae  Bob

ByUser Global Secondary Index (Async)
User   Image
Bob    aed4c
Bob    f93bae
Alice  aed4c
Alice  f93bae

ByPlace Global Secondary Index (Async)
Place   Image
London  aed4c
London  f93bae
Rome    ba763
Rome    63f11

Credit Bucket
If any of the indexes has no write capacity, the write is throttled
Up to 5 GSI
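The throttling rule on this slide can be pictured as one credit bucket per table and per index, where a write goes through only if every bucket still has credit. The model below is deliberately simplified and is an assumption, not DynamoDB's actual accounting: the `CreditBucket` class and `write` function are hypothetical, there is no refill over time, and every write costs one unit everywhere.

```python
class CreditBucket:
    """Hypothetical bucket holding the remaining write credits of a table or an index."""

    def __init__(self, credits):
        self.credits = credits


def write(table_bucket, index_buckets, units=1):
    """Return True if the write is accepted, False if it is throttled."""
    buckets = [table_bucket, *index_buckets.values()]
    # The write is throttled if ANY bucket (table or index) is out of credit.
    if any(bucket.credits < units for bucket in buckets):
        return False
    for bucket in buckets:
        bucket.credits -= units
    return True


table_bucket = CreditBucket(10)
index_buckets = {"ByUser": CreditBucket(2), "ByPlace": CreditBucket(10)}

results = [write(table_bucket, index_buckets) for _ in range(3)]
print(results)  # the third write fails because ByUser has run out of credit
```

Under this model the least provisioned index sets the ceiling for writes to the whole table.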
Negative Example
Images Table
Table
User   Image   Date        Country
Bob    aed4c   2013-10-01  USA
Bob    cf2e2   2013-09-05  USA
Bob    f93bae  2013-10-08  DE
Alice  ca61a   2013-09-12  BR

ByCountry Global Secondary Index
Country  Date        Image
USA      2013-09-05  cf2e2
USA      2013-10-01  aed4c
USA      2013-10-08  2ee4c
USA      2013-09-12  a5541
Bad distribution key – cardinality and skew
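One way to spot a bad distribution key before creating the index is to count how many distinct values a candidate key has and how much of the data falls under the most common one. The sketch below does that on the four rows of the slide's table. `key_distribution` is a hypothetical helper, and what counts as "too skewed" is left to the reader.

```python
from collections import Counter

images = [
    {"User": "Bob", "Image": "aed4c", "Date": "2013-10-01", "Country": "USA"},
    {"User": "Bob", "Image": "cf2e2", "Date": "2013-09-05", "Country": "USA"},
    {"User": "Bob", "Image": "f93bae", "Date": "2013-10-08", "Country": "DE"},
    {"User": "Alice", "Image": "ca61a", "Date": "2013-09-12", "Country": "BR"},
]


def key_distribution(items, key_attr):
    """Summarise cardinality and skew of a candidate key attribute (illustrative helper)."""
    counts = Counter(item[key_attr] for item in items)
    top_key, top_count = counts.most_common(1)[0]
    return {
        "distinct_keys": len(counts),
        "top_key": top_key,
        "top_share": top_count / len(items),
    }


by_country = key_distribution(images, "Country")
by_image = key_distribution(images, "Image")
print(by_country)
print(by_image)
```

A key with few distinct values and a large top share, like Country here, concentrates both data and traffic on a few partitions.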
Updates Stream
24 Hours
Put 8888 E…
Update 8888 B… (old value …)
Put 8484…
Put 5555 …
Delete 8888 E …
Partial (Partition) Order
Analytics
Archiver
Updates Stream
Images Table
User   Image   Date        Size KB
Bob    aed4c   2013-10-01  124
Bob    cf2e2   2013-09-05  251
Bob    f93bae  2013-10-08  98
Alice  ca61a   2013-09-12  155
Table
Aggregation with Stream
User   Size KB
Bob    473
Alice  155
Post-processing of item updates
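The aggregation on this slide can be sketched as a consumer that reads update records in order and keeps a running total per user. The record shape used here (`old` and `new` item images) is a simplifying assumption and not the actual stream record format, and `apply_record` is a hypothetical helper.

```python
from collections import defaultdict


def apply_record(totals, record):
    """Adjust per-user totals from one update record (assumed shape: old/new images)."""
    old, new = record.get("old"), record.get("new")
    if old is not None:
        totals[old["User"]] -= old["SizeKB"]   # undo the previous value
    if new is not None:
        totals[new["User"]] += new["SizeKB"]   # apply the new value


stream = [
    {"new": {"User": "Bob", "Image": "aed4c", "SizeKB": 124}},    # Put
    {"new": {"User": "Bob", "Image": "cf2e2", "SizeKB": 251}},    # Put
    {"new": {"User": "Bob", "Image": "f93bae", "SizeKB": 98}},    # Put
    {"new": {"User": "Alice", "Image": "ca61a", "SizeKB": 155}},  # Put
]

totals = defaultdict(int)
for record in stream:
    apply_record(totals, record)

print(dict(totals))  # {'Bob': 473, 'Alice': 155}
```

Because an update record carries the old value as well, updates and deletes can be folded into the same totals without rescanning the table.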
Transactions?
• Document based
• Conditional Update – Optimistic Locks
• Atomic Counters
• Transaction library – Not recommended
• DynamoDB Updates Stream
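Conditional updates and atomic counters can be illustrated with a small in-memory store. This is only a sketch of the optimistic locking pattern: `ToyStore`, `ConditionFailed`, the `version` attribute and `update_with_retry` are hypothetical names, and the store is single-threaded, so it shows the condition check rather than real concurrency.

```python
class ConditionFailed(Exception):
    """Raised when the stored version differs from the expected one."""


class ToyStore:
    def __init__(self):
        self.items = {}

    def get(self, key):
        item = self.items.get(key)
        return dict(item) if item else None

    def put_if_version(self, key, attributes, expected_version):
        """Write only if the item is still at expected_version (None means 'must not exist')."""
        current = self.items.get(key)
        current_version = current.get("version") if current else None
        if current_version != expected_version:
            raise ConditionFailed(f"expected {expected_version}, found {current_version}")
        self.items[key] = dict(attributes, version=(expected_version or 0) + 1)

    def add(self, key, attribute, delta):
        """Atomic counter: increment in place without reading first."""
        item = self.items.setdefault(key, {})
        item[attribute] = item.get(attribute, 0) + delta
        return item[attribute]


def update_with_retry(store, key, mutate, attempts=3):
    """Read, modify, conditionally write; re-read and retry if someone else wrote first."""
    for _ in range(attempts):
        current = store.get(key) or {}
        version = current.pop("version", None)
        try:
            store.put_if_version(key, mutate(current), version)
            return store.get(key)
        except ConditionFailed:
            continue
    raise ConditionFailed("gave up after retries")


store = ToyStore()
store.put_if_version("aed4c", {"SizeKB": 124}, None)
print(update_with_retry(store, "aed4c", lambda item: dict(item, SizeKB=item["SizeKB"] + 1)))
print(store.add("views", "count", 1))
```

The optimistic lock never blocks a writer. A stale writer simply fails its condition and has to re-read before trying again.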
Top DynamoDB Mistakes
• Too much "old" data
• "Wrong" lookup keys (Market=NA, Status=Complete)
• Scaling up and down too much
• Writing "long" items
• Using DynamoDB for Queues
• Introducing Artificial GUID
• Creating Storms
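To see why writing "long" items is costly, it helps to work out how many capacity units an item consumes, using the unit sizes quoted earlier in the deck (1KB per write, 4KB per read) and the 1000 writes / second partition figure. The helpers below are a back-of-the-envelope sketch: the function names are hypothetical and 1KB is assumed to be 1024 bytes.

```python
import math

WRITE_UNIT_BYTES = 1024        # one write unit covers up to 1KB (assumed 1024 bytes)
READ_UNIT_BYTES = 4 * 1024     # one read unit covers up to 4KB
PARTITION_WRITES_PER_SECOND = 1000  # per-partition figure from the slides


def write_units(item_size_bytes):
    """Write units consumed by a single write of an item of this size."""
    return max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))


def read_units(item_size_bytes):
    """Read units consumed by a single read of an item of this size."""
    return max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))


def max_writes_per_partition(item_size_bytes):
    """How many writes per second of such items one partition could absorb."""
    return PARTITION_WRITES_PER_SECOND // write_units(item_size_bytes)


for size_kb in (1, 4, 64):
    size = size_kb * 1024
    print(size_kb, "KB:", write_units(size), "write units,",
          read_units(size), "read units,",
          max_writes_per_partition(size), "writes/second per partition")
```

Since every update rewrites the whole item, a small change to a 64KB item costs the same 64 write units as writing it from scratch.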
Guy Ernest
Amazon.com Solutions [email protected]