RocksDB and MongoRocks
Islam AbdelRahman, Software Engineer
MongoDB using RocksDB storage engine
What is MongoRocks
• Embedded persistent key-value store
• Optimized for server workloads
• Open source
• Used by Facebook, LinkedIn, Yahoo, Microsoft, Netflix, Airbnb, Pinterest …
What is RocksDB
RocksDB Architecture
Log Structured Merge Trees
Memtable (64 MB)
Level 0 (256 MB)
Level 1 (512 MB)
Level 2 (5 GB)
Level 3 (50 GB)
Level 4 (500 GB)
(Newer data at the top, older data at the bottom.)
Writes
(Figure: a (Key, Value) pair is written to the Memtable (64 MB) and to the WAL, above Level 0 (256 MB) …)
Flush
(Figure: the Memtable (64 MB) is flushed to a new file in Level 0 (256 MB) …)
Compaction
(Figure: Memtable (64 MB), Level 0 (256 MB), Level 1 (512 MB), Level 2 (5 GB), Level 3 (50 GB), Level 4 (500 GB).)
Compaction
(Figure: Memtable and Levels 0 to 4 as above, with three files marked "new".)
• Foreground
  • Write to memtable + Write Ahead Log
• Background
  • Flush
  • Compaction
Writes
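The foreground and background steps above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not RocksDB's implementation: the class name `ToyLSM`, the entry-count limit standing in for the 64 MB memtable, the in-memory list standing in for the WAL file, and the single merged Level 1 run are all hypothetical, and flush and compaction run synchronously here rather than in background threads.

```python
class ToyLSM:
    """Toy model of an LSM write path (hypothetical, for illustration only)."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # entries; stand-in for 64 MB
        self.wal = []       # stand-in for an append-only Write Ahead Log file
        self.memtable = {}  # newest data, mutable
        self.level0 = []    # immutable sorted runs, oldest first
        self.level1 = []    # one merged sorted run of (key, value) pairs

    def put(self, key, value):
        # Foreground work: log the write first, then apply it to the memtable.
        self.wal.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as a new sorted, immutable Level 0 run.
        self.level0.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.wal = []  # flushed entries no longer need the log for recovery

    def compact(self):
        # Merge all Level 0 runs into Level 1; newer runs overwrite older ones.
        merged = dict(self.level1)
        for run in self.level0:  # oldest to newest
            merged.update(run)
        self.level1 = sorted(merged.items())
        self.level0 = []
```

A caller only ever pays for the `put` step; in this sketch the cost of `flush` and `compact` is deferred, which is the point of the foreground/background split.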
File format
Data Block, Data Block, Data Block, Data Block, Data Block, Index Block, Bloom Filter Block, Statistics Block
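File format (Data Block)
Keys and values:
AAAAAAA : VAL
AAAAAAB : VAL
AAAAAAC : VAL
AABAAAA : VAL
AABAAAX : VAL
Prefix-encoded ([n] = number of leading characters shared with the previous key):
AAAAAAA : VAL
[6]B : VAL
[6]C : VAL
[2]BAAAA : VAL
[6]X : VAL
Compressed Block (Snappy / Zlib / etc.)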
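The prefix encoding on this slide can be reproduced with a short sketch. The function names `prefix_encode` and `prefix_decode` and the text serialization fed to zlib are assumptions made for illustration; the on-disk block layout RocksDB actually uses is more involved than this.

```python
import os
import zlib


def prefix_encode(sorted_keys):
    """Encode each key as (shared_prefix_length, suffix) against the previous key."""
    encoded, prev = [], ""
    for key in sorted_keys:
        shared = len(os.path.commonprefix([prev, key]))
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the full keys from (shared_prefix_length, suffix) pairs."""
    keys, prev = [], ""
    for shared, suffix in encoded:
        prev = prev[:shared] + suffix
        keys.append(prev)
    return keys


keys = ["AAAAAAA", "AAAAAAB", "AAAAAAC", "AABAAAA", "AABAAAX"]
encoded = prefix_encode(keys)
# encoded == [(0, "AAAAAAA"), (6, "B"), (6, "C"), (2, "BAAAA"), (6, "X")]

# Hypothetical serialization, then a general-purpose compressor on the whole block.
raw_block = "\n".join(f"{n}:{suffix}" for n, suffix in encoded).encode()
compressed_block = zlib.compress(raw_block)
```

Prefix encoding only pays off because keys inside a block are sorted, so neighbouring keys tend to share long prefixes.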
Other files
• Manifest: LSM state
• WAL: recovery
• LOG: debugging
Level 1+ files
[1 -> 10] [11 -> 50] [60 -> 70] [75 -> 80] [90 -> 100]
Non-overlapping key ranges
Level 0 files
[20 -> 80] [1 -> 100] [11 -> 99] [30 -> 40]
Overlapping key ranges
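The difference between the two layouts matters when locating a key. A minimal sketch, assuming each file is represented only by its (smallest key, largest key) pair and using hypothetical helper names: in a sorted, non-overlapping level a binary search finds at most one candidate file, while in Level 0 every file whose range covers the key has to be checked.

```python
from bisect import bisect_left


def level0_candidates(files, key):
    """Level 0 ranges may overlap, so every covering file is a candidate."""
    return [f for f in files if f[0] <= key <= f[1]]


def sorted_level_candidate(files, key):
    """Level 1+ ranges are sorted and disjoint: binary search on the largest keys."""
    largest_keys = [largest for _smallest, largest in files]
    i = bisect_left(largest_keys, key)
    if i < len(files) and files[i][0] <= key:
        return files[i]
    return None  # the key falls in a gap between files or past the last file


# Key ranges taken from the two slides above.
level1_files = [(1, 10), (11, 50), (60, 70), (75, 80), (90, 100)]
level0_files = [(20, 80), (1, 100), (11, 99), (30, 40)]
```

With these ranges, key 35 has four candidate files in Level 0 but only one, (11, 50), in Level 1.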
Reads (point lookup)
(Figure: the levels a point lookup may search: Memtable (64 MB), Level 0 (256 MB), Level 1 (512 MB), Level 2 (5 GB), Level 3 (50 GB), Level 4 (500 GB).)
Reads (Iterators)
Memtable (64 MB): 1 iterator
Level 0 (256 MB): 4 iterators
Level 1 (512 MB): 1 iterator
Level 2 (5 GB): 1 iterator
Level 3 (50 GB): 1 iterator
Level 4 (500 GB): 1 iterator
RocksDB Iterator
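One way to picture how several per-level iterators combine into a single iterator is a k-way merge in which the newest source wins on duplicate keys. The sketch below is an assumption about the general technique, not RocksDB's code: `merging_iterator` is a hypothetical name, each source is a plain sorted list of (key, value) pairs, and deletions are not modelled.

```python
import heapq


def merging_iterator(sources):
    """Merge sorted (key, value) sources; sources must be ordered newest first."""

    def tagged(age, source):
        # Tag each entry with the source's age so ties on key sort newest first.
        for key, value in source:
            yield key, age, value

    streams = [tagged(age, source) for age, source in enumerate(sources)]
    last_key = object()  # sentinel that compares unequal to any real key
    for key, _age, value in heapq.merge(*streams):
        if key != last_key:  # first hit for a key comes from the newest source
            yield key, value
            last_key = key


memtable = [("b", "b-new")]
level0_file = [("a", "a1"), ("b", "b-old")]
level1 = [("c", "c1"), ("d", "d1")]
merged = list(merging_iterator([memtable, level0_file, level1]))
```

Here the stale `("b", "b-old")` entry from the older file is skipped because the memtable already supplied a newer value for `b`.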
• MongoDB 3.0 introduced a pluggable storage engine API
• MongoDB using the RocksDB storage engine
• Running in production since March 2015
MongoRocks
• Mobile backend as a service
• One of the biggest MongoDB deployments
• Millions of collections, millions of indexes
Parse
• Huge storage savings (compressed 5 TB to 285 GB)
• Document-level locking
• Better backups
MongoRocks
• RocksDB files are immutable
• Backups are fast
• Incremental backup using rocks-strata
• Queryable backups
MongoRocks Backups
MongoRocks Backup
(Figure: the database holds the Memtable and files 1 to 6 across Levels 0 to 2, with file 6 in Level 2; the Backup Directory holds copies of files 1 to 6.)
MongoRocks Backup
(Figure: the database now also holds new files 7, 8 and 9; these are added to the Backup Directory, which already holds files 1 to 6.)
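Because files never change once written, an incremental backup can in principle be reduced to copying the files the backup directory does not have yet. The sketch below illustrates only that idea and is not how rocks-strata works: `incremental_backup` is a hypothetical helper, it identifies files by name alone, and it ignores the manifest and the removal of files made obsolete by compaction.

```python
import shutil
from pathlib import Path


def incremental_backup(db_dir, backup_dir):
    """Copy files missing from backup_dir; return the names that were copied."""
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    already_backed_up = {p.name for p in backup_dir.iterdir()}
    copied = []
    for src in sorted(db_dir.iterdir()):
        # Immutable files with a known name never need to be copied again.
        if src.is_file() and src.name not in already_backed_up:
            shutil.copy2(src, backup_dir / src.name)
            copied.append(src.name)
    return copied
```

Run against the two figures above, a first call would copy files 1 to 6 and a second call, after files 7 to 9 appear, would copy only those three.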
• RocksDB: https://github.com/facebook/rocksdb/
• MongoRocks: https://github.com/mongodb-partners/mongo-rocks
• Rocks-Strata: https://github.com/facebookgo/rocks-strata
Thanks!