Sharding with MongoDB
Tyler Brock, tyler@10gen.com, @TylerBrock
Philosophy
Concepts
Architecture
Mechanics
Philosophy
MongoDB is a database for developers.
Build
Scale
How to Draw an Owl
> db.runCommand({enablesharding: "<dbname>" })
> db.runCommand({ shardcollection: "<namespace>", key: <shardkeypatternobject> })
Draw Two Circles
Concepts
Simple Web Application
[Diagram: an app reads from and writes to a single datastore]
What happens when your working set exceeds memory?
What happens if your write load is enormous?
Vertical Scaling
[Diagram: the app's single datastore is moved to bigger hardware, first 68 GB RAM with RAID 10 EBS, then 128 GB RAM with RAID 10 SSD]
Horizontal Scaling
[Diagram: the 60gb datastore is divided across three datastores of 20gb each. Routing logic, backed by metadata, sits between the app and the datastores. A balancer redistributes data as it grows, ending with 30gb on each datastore.]
Architecture
Shard
Really is just a mongod (or replica set)
Where your data lives
Config Server
A mongod started with the --configsvr option
Must have 3 (or 1 in development)
Data is committed using two-phase commit
mongos
Acts as a shard router / proxy
Run one or as many as you want
Lightweight: can run on app servers
Caches metadata from config servers
[Diagram: the app talks to mongos, which takes over the routing logic and balancing; three config servers hold the metadata; each datastore becomes a shard, either a single mongod or a replica set (RS) of three mongods]
Mechanics
How does MongoDB balance my data?
Keys
test.users
{ name: "Joe", email: "Joe@fake.com" },
{ name: "Bob", email: "bob@fake.com" },
{ name: "Tyler", email: "tyler@fake.com" }
> db.runCommand({ shardcollection: "test.users", key: { email: 1 } })
Chunks
[Diagram: the key range from -∞ to +∞ holds joe@fake.com, moe@fake.com, and tyler@fake.com. The range is split ("Split!") and each resulting piece is a chunk; a chunk that keeps growing is split again.]
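The chunk pictures above can be sketched in plain JavaScript. This is only an illustration, assuming a chunk is a half-open range [min, max) of shard key values that splits at its median key once it holds more than a chosen number of documents. The names `makeChunk`, `insertKey`, and `MAX_DOCS` are hypothetical, and a real server decides to split on chunk size in bytes rather than on a document count.

```javascript
// Sentinels standing in for the -∞ and +∞ ends of the key space.
const MIN_KEY = Symbol("MinKey");
const MAX_KEY = Symbol("MaxKey");
const MAX_DOCS = 2; // hypothetical split threshold (document count, not bytes)

// A chunk owns the half-open key range [min, max).
function makeChunk(min, max, keys = []) {
  return { min, max, keys };
}

function contains(chunk, key) {
  const aboveMin = chunk.min === MIN_KEY || key >= chunk.min;
  const belowMax = chunk.max === MAX_KEY || key < chunk.max;
  return aboveMin && belowMax;
}

// Insert a key, then split the owning chunk at its median if it grew too big.
function insertKey(chunks, key) {
  const i = chunks.findIndex((c) => contains(c, key));
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort();
  if (chunk.keys.length > MAX_DOCS) {
    const mid = chunk.keys[Math.floor(chunk.keys.length / 2)];
    const left = makeChunk(chunk.min, mid, chunk.keys.filter((k) => k < mid));
    const right = makeChunk(mid, chunk.max, chunk.keys.filter((k) => k >= mid));
    chunks.splice(i, 1, left, right);
  }
  return chunks;
}

const chunks = [makeChunk(MIN_KEY, MAX_KEY)];
for (const email of ["joe@fake.com", "moe@fake.com", "tyler@fake.com"]) {
  insertKey(chunks, email);
}
// chunks now holds two ranges: [-∞, "moe@fake.com") and ["moe@fake.com", +∞)
```

The third insert pushes the single chunk past the threshold, so it splits at "moe@fake.com", matching the "Split!" step in the slides.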
Splitting
[Diagram: mongos, three config servers, and Shards 1 to 4, with these messages in order]
Split this big chunk into 2 chunks
These chunks have split
Balancing
[Diagram: mongos, three config servers, and Shards 1 to 4, with these messages in order]
Shard1, move a chunk to Shard2
Shard1, move another chunk to Shard3
Shard1, move another chunk to Shard4
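The balancing rounds above can be approximated with a small simulation. It is a sketch only, assuming the balancer moves one chunk at a time from the shard holding the most chunks to the shard holding the fewest, and stops once the gap drops below a threshold. The function `balance` and its `threshold` parameter are hypothetical; the real balancer's thresholds and migration protocol are more involved.

```javascript
// chunkCounts: shard name -> number of chunks it holds.
// Returns the list of migrations performed, one chunk per move.
function balance(chunkCounts, threshold = 2) {
  // A gap of 1 cannot be improved by moving a chunk, so never go below 2.
  const limit = Math.max(2, threshold);
  const moves = [];
  for (;;) {
    const shards = Object.keys(chunkCounts);
    let most = shards[0];
    let least = shards[0];
    for (const s of shards) {
      if (chunkCounts[s] > chunkCounts[most]) most = s;
      if (chunkCounts[s] < chunkCounts[least]) least = s;
    }
    if (chunkCounts[most] - chunkCounts[least] < limit) break;
    chunkCounts[most] -= 1;
    chunkCounts[least] += 1;
    moves.push({ from: most, to: least });
  }
  return moves;
}

// All eight chunks start on shard1, as in the slides.
const counts = { shard1: 8, shard2: 0, shard3: 0, shard4: 0 };
const moves = balance(counts);
// The first three moves go from shard1 to shard2, shard3, and shard4 in turn.
```

With these starting counts the loop performs six moves and leaves two chunks on every shard.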
How does MongoDB route my queries?
Routed Request
1. Query arrives at mongos
2. mongos routes query to a single shard
3. Shard returns results of query
4. Results returned to client
Scatter Gather Request
1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard returns results for query
4. Results combined and returned to client
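The difference between a routed request and a scatter gather request comes down to whether the router can narrow the query to one chunk. The sketch below assumes the router caches a chunk map keyed on `email` and handles only equality matches on the shard key. The names `chunkMap` and `targetShards`, and the chunk boundaries, are hypothetical.

```javascript
// Hypothetical cached metadata: chunk ranges [min, max) on the shard key "email".
// null stands for the open -∞ / +∞ ends.
const chunkMap = [
  { min: null, max: "joe@fake.com", shard: "shard1" },
  { min: "joe@fake.com", max: "moe@fake.com", shard: "shard2" },
  { min: "moe@fake.com", max: null, shard: "shard3" },
];

// Return the shards a query must visit.
function targetShards(query, shardKey = "email") {
  const value = query[shardKey];
  if (typeof value !== "string") {
    // No equality match on the shard key: scatter to every shard.
    return [...new Set(chunkMap.map((c) => c.shard))];
  }
  // Equality on the shard key: exactly one chunk range contains the value.
  const chunk = chunkMap.find(
    (c) => (c.min === null || value >= c.min) && (c.max === null || value < c.max)
  );
  return [chunk.shard];
}

const routed = targetShards({ email: "bob@fake.com" }); // one shard
const scattered = targetShards({ state: "NY" });        // every shard
```

A query on `email` touches a single shard, while a query on `state` has to be broadcast because the chunk map says nothing about where a given state lives.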
Distributed Merge Sort Request
1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. mongos merges sorted results
6. Combined results returned to client
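Step 5 above is a k-way merge of lists that are already sorted. A minimal sketch, assuming each shard hands back its results as a sorted array; `mergeSorted`, `byState`, and the sample documents are hypothetical and not part of any MongoDB API.

```javascript
// Merge already-sorted result lists from each shard into one sorted list.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0); // next unread index per shard
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // shard exhausted
      if (
        best === -1 ||
        compare(shardResults[i][positions[i]], shardResults[best][positions[best]]) < 0
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard exhausted
    merged.push(shardResults[best][positions[best]]);
    positions[best] += 1;
  }
}

const byState = (a, b) => (a.state < b.state ? -1 : a.state > b.state ? 1 : 0);

// Each inner array is one shard's locally sorted result (step 3).
const perShard = [
  [{ state: "CA" }, { state: "NY" }],
  [{ state: "AZ" }, { state: "TX" }],
  [{ state: "FL" }],
];

const sortedStates = mergeSorted(perShard, byState).map((d) => d.state);
// sortedStates is ["AZ", "CA", "FL", "NY", "TX"]
```

Because every shard has already sorted its own results, the merging side only ever compares the head of each list and never re-sorts the full result set.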
Queries
By shard key: routed
db.users.find({ email: "bob@10gen.com" })
Sorted by shard key: routed in order
db.users.find().sort({ email: -1 })
Find by non shard key: scatter gather
db.users.find({ state: "NY" })
Sorted by non shard key: distributed merge sort
db.users.find().sort({ state: 1 })
Writes
Inserts: requires shard key
db.users.insert({ name: "Bob", email: "Bob@fake.com" })
Removes: routed
db.users.remove({ email: "bob@fake.com" })
Removes: scattered
db.users.remove({ name: "Bob" })
Updates: routed
db.users.update({ email: "bob@10gen.com" }, { $set: { state: "NY" } })
Updates: scattered
db.users.update({ state: "CA" }, { $set: { state: "NY" } }, { multi: true })
How do I choose my shard key?
Rule of Thumb
Choose a field that is common to your queries.
Write Scaling
Writes should be distributed.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { time: 1 }
Better: { node: 1, application: 1, time: 1 }
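To see why a time-only key concentrates writes, the sketch below places a few new log entries into chunks under two candidate keys. It is an illustration under stated assumptions: the chunk boundaries are made up, `chunkFor` is a hypothetical helper, and the compound key is reduced to its leading `node` field for brevity.

```javascript
// Hypothetical chunk lower bounds for two candidate shard keys.
// Each chunk covers [its bound, next bound); the first chunk also takes anything lower.
const timeBounds = ["2011-01-01", "2011-01-02", "2011-01-03"];
const nodeBounds = ["ny000", "ny100", "ny200"];

// Index of the chunk whose range contains the key.
function chunkFor(bounds, key) {
  let index = 0;
  for (let i = 0; i < bounds.length; i++) {
    if (key >= bounds[i]) index = i;
  }
  return index;
}

// New log entries: timestamps only move forward, nodes vary.
const entries = [
  { node: "ny153.example.com", time: "2011-01-03T21:21:56Z" },
  { node: "ny007.example.com", time: "2011-01-03T21:21:57Z" },
  { node: "ny251.example.com", time: "2011-01-03T21:21:58Z" },
];

const byTime = entries.map((e) => chunkFor(timeBounds, e.time)); // [2, 2, 2]
const byNode = entries.map((e) => chunkFor(nodeBounds, e.node)); // [1, 0, 2]
```

Under `{ time: 1 }` every new entry lands in the last chunk, so one shard absorbs all writes; a key that leads with `node` spreads the same entries over several chunks.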
Query Isolation & Data Locality
Queries should be routed to one shard.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { msg: 1, node: 1 }
Better: { node: 1, time: 1 }
Cardinality
Chunks should be able to split.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { node: 1 }
Better: { node: 1, time: 1 }
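The cardinality rule follows from how a split point is chosen: a chunk can only be divided between two distinct shard key values. A small sketch, assuming a split picks the median of the distinct keys in the chunk; `splitPoint` is a hypothetical helper, and the compound key is flattened into a string purely for illustration.

```javascript
// Return a key at which the chunk can be divided, or null if every
// document shares one shard key value and no split point exists.
function splitPoint(keys) {
  const distinct = [...new Set(keys)].sort();
  if (distinct.length < 2) return null;
  return distinct[Math.floor(distinct.length / 2)];
}

// Shard key { node: 1 }: every entry from one node has the same key.
const nodeOnly = ["ny153", "ny153", "ny153", "ny153"];

// Shard key { node: 1, time: 1 }: the time component makes keys distinct.
const nodeAndTime = [
  "ny153|2011-01-02T21:21:56Z",
  "ny153|2011-01-02T21:21:57Z",
  "ny153|2011-01-02T21:21:58Z",
  "ny153|2011-01-02T21:21:59Z",
];

const stuck = splitPoint(nodeOnly);         // null: the chunk cannot split
const splittable = splitPoint(nodeAndTime); // "ny153|2011-01-02T21:21:58Z"
```

With `{ node: 1 }` alone, a busy node produces a chunk that grows without ever being splittable; adding `time` gives the chunk key values to split between.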
Configuration
Bring up mongods or Replica Sets
[Diagram: three standalone mongods, or three replica sets (RS) of three mongods each]
mongod --shardsvr
mongod --replSet <setname> --shardsvr
Bring up Config Servers
[Diagram: three config servers alongside the shards]
mongod --configsvr
Bring up Mongos
[Diagram: a mongos in front of the config servers and shards]
mongos --configdb <list of configdb uris>
Connect to Mongos + Add Shards
> use admin
> db.runCommand({ "addShard": <shard uri> })
Enable Sharding
> db.runCommand({ enablesharding: "<dbname>" });
Shard a Collection
> db.runCommand({ shardcollection: "<namespace>", key: <key> });