Introduction to MongoDB
Harun Yardımcı Software Architect @ eBay
@nosqlcozumler
What is MongoDB?
MongoDB is a scalable, high-performance, open-source, schema-free, document-oriented database developed by 10gen.
- The name MongoDB comes from "huMONGOus", meaning "extremely large".
Features
• Dynamic schemas
• Full, flexible index support and rich queries
• Sharding for horizontal scalability
• Replication for high availability
• Text search
• Advanced security
• Aggregation Framework and Map-Reduce
• GridFS
• Geospatial
Terminology
RDBMS         MongoDB
Table         Collection
Row           Document
Index         Index
Join          Embedded Document
Foreign Key   Reference
Partition     Shard
Database      Database
NoSQL Databases
Document Database
• What is a document?
o PDF, Word document, text file?
• A document is an associative array.
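To make the "associative array" idea concrete, here is a minimal sketch of a document written as a plain JavaScript object. The field names and values are hypothetical and chosen only for illustration; the embedded document and the reference correspond to the Join and Foreign Key rows of the terminology table.

```javascript
// A document is a set of field/value pairs, much like a JavaScript object.
// All field names and values below are made up for illustration.
const user = {
  _id: 1,                                       // unique identifier of the document
  name: "Ada",                                  // simple string value
  tags: ["admin", "dev"],                       // values can be arrays
  address: { city: "Istanbul", zip: "34000" },  // embedded document (used instead of a join)
  companyId: 42                                 // reference to a document in another collection
};

// Fields are looked up by key, as in any associative array.
console.log(user.address.city);
console.log(Object.keys(user));
```

Because there is no fixed schema, another document in the same collection could carry a different set of fields.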
Deployment Architecture
• Standalone
• Replica Set
• Sharded Cluster
Standalone < Replica Set < Sharded Cluster*
* In production, a typical sharded cluster consists of replica sets.
CAP Theorem
Consistency (C), Availability (A), Partition Tolerance (P)
• CA: Oracle, MySQL, Postgres
• CP: MongoDB, BigTable, HBase, Redis, Memcache
• AP: CouchDB, Cassandra, Riak, Dynamo, SimpleDB
• C, A, and P together: N/A
Standalone
$ mongod \
    --dbpath /data/db \
    --fork \
    --logappend \
    --logpath /var/log/mongod.log \
    --journal \
    --port 27017
Configuration File
$ vim /etc/mongod.conf
dbpath = /data/db
fork = true
logappend = true
logpath = /var/log/mongod.log
journal = true
port = 27017
$ mongod --config /etc/mongod.conf
Replication
Redundancy, Backup, and Automatic Failover
• Master / Slave (Deprecated since 1.6)
• Replica Set
Replica Set
$ mongod \
    --dbpath /data/db \
    --fork \
    --logappend \
    --logpath /var/log/mongod.log \
    --journal \
    --port 27017 \
    --replSet set01
Replica Set Initialization
Start the mongod processes, then connect to one of them:
> var cnf = {
    _id : 'set01',
    members : [
      { _id : 0, host : 'myhost1.net:27017' },
      { _id : 1, host : 'myhost2.net:27017' },
      { _id : 2, host : 'myhost3.net:27017' }
    ]
  }
> rs.initiate(cnf)
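For sets with more members, the configuration document can be generated from a host list rather than typed by hand. The helper below is a hypothetical convenience function, not part of the mongo shell; it only builds an object of the same shape as `cnf` above.

```javascript
// Hypothetical helper: builds a replica set configuration document
// with the same shape as the hand-written `cnf` object.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    // Each member gets a numeric _id based on its position in the list.
    members: hosts.map((host, i) => ({ _id: i, host: host })),
  };
}

const generatedCnf = buildReplSetConfig("set01", [
  "myhost1.net:27017",
  "myhost2.net:27017",
  "myhost3.net:27017",
]);

console.log(JSON.stringify(generatedCnf, null, 2));
```

In the mongo shell, the resulting object could then be passed to `rs.initiate()` in the same way as the hand-written one.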
Replica Set
• Majority
• Voting
• Arbiters
• Hidden Members
• Delayed Members
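The majority rule behind elections can be worked out with a little arithmetic. The sketch below assumes every member counted has one vote; the function names are made up for this example.

```javascript
// Votes needed to elect a primary when `votingMembers` members can vote.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be lost while a majority still remains.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Print: members, majority, fault tolerance
for (const n of [1, 2, 3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
```

Going from three to four voting members raises the required majority without raising the fault tolerance, which is one reason an arbiter is often added to bring an even-sized set up to an odd number of voters.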
Application Concerns
• Write Concern
o Errors Ignored
o Unacknowledged
o Acknowledged (default)
o Journaled
o Replica Acknowledged
• Read Preferences
o primary
o primaryPreferred
o secondary
o secondaryPreferred
o nearest
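The write concern levels above roughly correspond to small option documents built from `w` (how many members must acknowledge) and `j` (wait for the journal). The sketch below lists them as plain objects. The object names and the helper are hypothetical, `w: 2` is just one possible "replica acknowledged" setting, and "Errors Ignored" is left out on the assumption that it is a legacy, driver-specific mode.

```javascript
// Write concern levels expressed as option documents (names are illustrative).
const writeConcerns = {
  unacknowledged:      { w: 0 },           // do not wait for any acknowledgment
  acknowledged:        { w: 1 },           // primary acknowledges the write
  journaled:           { w: 1, j: true },  // primary has written to its journal
  replicaAcknowledged: { w: 2 },           // primary plus at least one secondary
};

// Read preference modes, as listed above.
const readPreferences = [
  "primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest",
];

// Hypothetical helper: does this write concern wait for more than the primary?
function waitsForReplicas(wc) {
  return wc.w === "majority" || (typeof wc.w === "number" && wc.w > 1);
}

console.log(waitsForReplicas(writeConcerns.acknowledged));
console.log(waitsForReplicas(writeConcerns.replicaAcknowledged));
```

How such a document is passed depends on the driver and server version; on shell versions that accept a `writeConcern` option it would look like `db.mycol.insert(doc, { writeConcern: { w: 1, j: true } })`, and a read preference can be set in the shell with `db.getMongo().setReadPref("secondaryPreferred")`.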
Sharding
• Why?
• Scale-out
• Store different parts of the data on different hosts
• Auto Balancing
Sharding Components
• Shards
o Standalone
o Replica Set
• Config Servers
• Mongos
Replica Set as a Shard
Add each replica set as a shard:
> sh.addShard( "set01/myhost1.net:27017,myhost2.net:27017,myhost3.net:27017" )
A standalone server can also be added as a shard, but DON'T DO IT!
> sh.addShard( "myhost9.net:27017" )
Selecting Shard Key
• Distribution of Data
• Hashed Shard Key
• Chunks and Balancing
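The effect of the shard key on data distribution can be illustrated with a toy simulation. This is only a sketch under stated assumptions: three shards, fixed key ranges with no chunk splitting or balancing, and an FNV-1a hash standing in for MongoDB's real hash function. All names are hypothetical.

```javascript
// Toy 32-bit FNV-1a hash. This is NOT the hash function MongoDB uses;
// it only stands in for "some function that scatters nearby keys".
function toyHash(n) {
  let h = 0x811c9dc5;
  for (const ch of String(n)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

const SHARDS = 3;

// Ranged key: each shard owns a contiguous range of 1000 ids,
// and the last shard owns everything above that.
function rangedShard(userId) {
  return Math.min(Math.floor(userId / 1000), SHARDS - 1);
}

// Hashed key: the shard is chosen from the hash of the id.
function hashedShard(userId) {
  return toyHash(userId) % SHARDS;
}

// Count how many ids land on each shard.
function distribute(ids, pickShard) {
  const counts = new Array(SHARDS).fill(0);
  for (const id of ids) counts[pickShard(id)] += 1;
  return counts;
}

// New users receive monotonically increasing ids.
const newIds = Array.from({ length: 3000 }, (_, i) => 5000 + i);
const rangedCounts = distribute(newIds, rangedShard);
const hashedCounts = distribute(newIds, hashedShard);
console.log("ranged:", rangedCounts);
console.log("hashed:", hashedCounts);
```

With the ranged key every new id falls into the last range, so a single shard receives all the inserts, while the hashed key spreads them across the shards. In the mongo shell, a hashed shard key is requested with `sh.shardCollection( "mydatabase.mycol", { "userId" : "hashed" } )`.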
Enable Sharding
You must enable sharding on each database separately:
> sh.enableSharding("mydatabase")
And for each collection:
> sh.shardCollection( "mydatabase.mycol", { "userId" : 1 } )
Thanks
If you have any questions, please feel free to ask