Talk by Murad Kamalov from DoThinger Team at the Codecraft meeting (Dunedin, New Zealand).
Starting with MongoDB
Murad Kamalov
Estonia
Tallinn Tartu
Finland
Helsinki
Dunedin
DoThinger.com
• Do things together with people nearby
@DoThinger
Intro to MongoDB
• Schemaless
• Supports replication and sharding
• Indexing
• File storage
• Aggregation (Map/Reduce)
• Geospatial queries
Benefits of MongoDB
• Great choice for agile projects
– no schema = no limitations
– ease of use
• Scalability & high availability
– thanks to sharding and replication
• Failover and automatic recovery
Main Concepts
• A database is a set of collections
• A collection is like a table in MySQL (e.g. an events collection)
• A document belongs to a collection, like a row in MySQL; unlike in MySQL, documents in the same collection can have different structures
• Documents store fields as key-value pairs, where a value can be a basic data type, another document or an array
• Data is represented in JSON format (serialized and stored as BSON)
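The document model above can be sketched with plain Python dictionaries and the standard json module. This is only an illustration that does not talk to MongoDB; the "venue" and "tags" fields are made up for the example.

```python
import json

# A "collection" modelled as a plain list of documents (dicts).
events = [
    {"title": "CodeCraft Meeting", "category": "social"},
    # Same collection, different structure: extra field, nested document, array.
    {
        "title": "CodeCraft Meeting 2",
        "category": "social",
        "participants": 50,
        "venue": {"city": "Dunedin", "country": "New Zealand"},  # hypothetical field
        "tags": ["mongodb", "talk"],  # hypothetical field
    },
]

# Each document serializes to JSON text; MongoDB itself stores the binary BSON form.
as_json = [json.dumps(doc, sort_keys=True) for doc in events]
field_sets = [sorted(doc) for doc in events]
print(field_sets)
```

Both documents sit in the same collection even though their field sets differ, which is the point of the schemaless model.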
Example
• Inserting document into events collection:
> db.events.insert({
    "title" : "CodeCraft Meeting",
    "description" : "CodeCraft meeting in Dunedin on Tuesday evening",
    "start_datetime" : ISODate("2012-04-03T17:30:00"),
    "category" : "social"
  })
> db.events.insert({
    "title" : "CodeCraft Meeting 2",
    "description" : "CodeCraft meeting in Dunedin on Tuesday evening",
    "start_datetime" : ISODate("2012-05-03T17:30:00"),
    "category" : "social",
    "participants" : 50
  })
• When inserting, an indexed _id field is created automatically for each inserted document that does not already have one
Embedded Documents
• Linking versus embedding
– e.g. instead of creating a separate collection for comments, just embed the comments in the documents of the events collection
– MongoDB doesn’t support JOINs, so use embedding
> db.events.insert({
    "title" : "Title",
    "description" : "Description",
    "start_date" : ISODate("2012-04-03T17:30:00"),
    "category" : "social",
    "comments" : [
      {"author" : <user_id>, "comment" : "My First Comment"},
      {"author" : <user_id>, "comment" : "My Second Comment"}]
  })
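To make the linking versus embedding trade-off concrete, here is a minimal sketch in plain Python (no MongoDB involved). The helper names and the integer event id are hypothetical and exist only for this illustration.

```python
# Linking: comments live in their own collection and point back to the event.
events_linked = {1: {"title": "Title"}}
comments_linked = [
    {"event_id": 1, "comment": "My First Comment"},
    {"event_id": 1, "comment": "My Second Comment"},
]


def comments_via_link(event_id):
    """A second lookup is needed: scan the comments collection for this event."""
    return [c["comment"] for c in comments_linked if c["event_id"] == event_id]


# Embedding: the comments travel inside the event document itself.
event_embedded = {
    "title": "Title",
    "comments": [
        {"comment": "My First Comment"},
        {"comment": "My Second Comment"},
    ],
}


def comments_via_embedding(event):
    """Reading the one document is enough; no join-like step is required."""
    return [c["comment"] for c in event["comments"]]


print(comments_via_link(1))
print(comments_via_embedding(event_embedded))
```

With linking, the application has to do the "join" itself with an extra query; with embedding, the event and its comments arrive together.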
Querying
• Queries can be expressed in many ways in MongoDB, e.g. as an exact match on a field or with operators such as $gte
> db.events.find({"title" : "CodeCraft Meeting"})
{"_id" : ObjectId(41234351451345), "title" : "CodeCraft Meeting", "description" : "Description", "start_datetime" : ISODate("2012-04-03T17:30:00"), "category" : "social"}
> db.events.find({"start_datetime" : {$gte : ISODate("2012-04-01T00:00:00")}})
{"_id" : ObjectId(41234351451345), "title" : "CodeCraft Meeting", "description" : "Description", "start_datetime" : ISODate("2012-04-03T17:30:00"), "category" : "social"}
{"_id" : ObjectId(41226542345234), "title" : "CodeCraft Meeting 2", "description" : "Description", "start_datetime" : ISODate("2012-05-03T17:30:00"), "category" : "social", "participants" : 50}
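The two query shapes above (exact match and $gte) can be modelled with a small matcher in plain Python. This is a simplified sketch of the matching semantics only, not the MongoDB implementation; `matches` and `find` are hypothetical helpers.

```python
from datetime import datetime


def matches(doc, query):
    """Return True if doc satisfies a query using equality or {"$gte": value}."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict) and "$gte" in condition:
            if not value >= condition["$gte"]:
                return False
        elif value != condition:
            return False
    return True


def find(collection, query):
    """Return every document in the list that matches the query."""
    return [doc for doc in collection if matches(doc, query)]


events = [
    {"title": "CodeCraft Meeting", "start_datetime": datetime(2012, 4, 3, 17, 30)},
    {"title": "CodeCraft Meeting 2", "start_datetime": datetime(2012, 5, 3, 17, 30)},
]

by_title = find(events, {"title": "CodeCraft Meeting"})
since_april = find(events, {"start_datetime": {"$gte": datetime(2012, 4, 1)}})
print(len(by_title), len(since_april))
```

As in the shell output above, the title query returns one event and the date query returns both.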
Updates
• Replacing the entire document
> db.events.update({"_id" : ObjectId(41234351451345)},
    {"title" : "New Title",
     "description" : "Description",
     "start_date" : ISODate("2012-04-03T17:30:00"),
     "category" : "social",
     "comments" : [
       {"author" : <user_id>, "comment" : "My First Comment"},
       {"author" : <user_id>, "comment" : "My Second Comment"}]
    })
• Atomic updates
> db.events.update({"_id" : ObjectId(41234351451345)},
    {$set : {"title" : "New Title"}}
  )
Updates (2)
• Pushing elements to arrays
> db.events.update({"_id" : ObjectId(41234351451345)},
    {$push : {"comments" : {"author" : <user_id>, "comment" : "My Third Comment"}}})
{"title" : "Title",
 "description" : "Description",
 "start_date" : ISODate("2012-04-03T17:30:00"),
 "category" : "social",
 "comments" : [
   {"author" : <user_id>, "comment" : "My First Comment"},
   {"author" : <user_id>, "comment" : "My Second Comment"},
   {"author" : <user_id>, "comment" : "My Third Comment"}]
}
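The difference between the three update forms (full replacement, $set and $push) is easy to miss, so here is a rough model of them in plain Python. `apply_update` is a hypothetical helper that mimics the behaviour on a dictionary; it is not a driver API.

```python
import copy


def apply_update(doc, update):
    """Return a new document after a replacement, $set or $push update."""
    operators = [key for key in update if key.startswith("$")]
    if not operators:
        # Replacement: every field except _id comes from the new document.
        replaced = copy.deepcopy(update)
        replaced["_id"] = doc["_id"]
        return replaced
    result = copy.deepcopy(doc)
    for field, value in update.get("$set", {}).items():
        result[field] = value  # only the named fields change
    for field, value in update.get("$push", {}).items():
        result.setdefault(field, []).append(value)  # append to the array
    return result


event = {"_id": 1, "title": "Title", "category": "social", "comments": []}

replaced = apply_update(event, {"title": "New Title"})
renamed = apply_update(event, {"$set": {"title": "New Title"}})
pushed = apply_update(event, {"$push": {"comments": {"comment": "My Third Comment"}}})
print(replaced, renamed, pushed, sep="\n")
```

Note that the replacement drops "category" and "comments" because they were not part of the new document, while $set and $push leave all other fields untouched.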
Geospatial queries
• MongoDB has built-in support for geospatial queries
> db.events.insert({
    "title" : "Title",
    "description" : "Description",
    "start_datetime" : ISODate("2012-04-03T17:30:00"),
    "category" : "social",
    "location" : [30, 30]
  })
> db.events.ensureIndex({"location" : "2d"})
> db.events.find({"location" : {$near : [28, 28],
                               $maxDistance : 5}})
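As a rough mental model of the $near query above, the sketch below assumes flat (planar) geometry with $maxDistance expressed in the same units as the coordinates, which is how a plain "2d" index is generally described. `near` is a hypothetical helper and the "Far" and "Close" events are invented for the example.

```python
import math


def near(collection, point, max_distance):
    """Return documents within max_distance of point, nearest first."""

    def distance(doc):
        x, y = doc["location"]
        return math.hypot(x - point[0], y - point[1])

    in_range = [doc for doc in collection if distance(doc) <= max_distance]
    return sorted(in_range, key=distance)


events = [
    {"title": "Far", "location": [40, 40]},
    {"title": "Title", "location": [30, 30]},
    {"title": "Close", "location": [28, 29]},
]

nearby = near(events, [28, 28], 5)
titles = [doc["title"] for doc in nearby]
print(titles)
```

The event at [30, 30] is about 2.83 units from [28, 28], so it falls inside the maximum distance of 5, while the one at [40, 40] does not.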
Files (GridFS)
• In MySQL it’s a bad practice to save large binary files into the database
• In MongoDB it’s a bad practice NOT TO save large files into the database.
• MongoDB splits saved files into chunks, which allows querying of only necessary parts of the binary files
• Example (pymongo library)
>>> import gridfs
>>> fs = gridfs.GridFS(db)  # db is a pymongo Database object
>>> file_id = fs.put(b"Example file data")  # put() returns the _id of the stored file
>>> file_data = fs.get(file_id).read()
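The chunking idea can be sketched without a database. The code below splits a byte string into numbered chunks and then reads a byte range by touching only the chunks that overlap it. The helper names and the tiny chunk size of 4 bytes are assumptions made for illustration; real GridFS chunks are far larger.

```python
def split_into_chunks(data, chunk_size):
    """Split bytes into numbered chunks, similar in spirit to GridFS chunks."""
    return [
        {"n": index, "data": data[start:start + chunk_size]}
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]


def read_range(chunks, chunk_size, offset, length):
    """Read length bytes from offset, using only the chunks that overlap."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    needed = [c for c in chunks if first <= c["n"] <= last]
    joined = b"".join(c["data"] for c in needed)
    start = offset - first * chunk_size
    return joined[start:start + length], [c["n"] for c in needed]


data = b"Example file data"
chunks = split_into_chunks(data, chunk_size=4)
part, used = read_range(chunks, 4, offset=8, length=4)
print(part, used)
```

Here the word "file" is recovered from a single chunk, without loading the other four.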
Map/Reduce
> db.events.mapReduce(map_func, reduce_func,
    {out : "out_collection"})
• map_func and reduce_func are written in JavaScript
• Output of reduce_func is stored in output collection
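The map/reduce flow can be imitated in plain Python to show what the two functions are responsible for. In MongoDB they would be written in JavaScript; the Python versions here, the `map_reduce` driver function and the sample events are stand-ins for illustration only.

```python
from collections import defaultdict


def map_func(doc):
    """Emit one (key, value) pair per event: its category and a count of 1."""
    yield doc["category"], 1


def reduce_func(key, values):
    """Combine all values emitted for one key into a single result."""
    return sum(values)


def map_reduce(collection, mapper, reducer):
    """Group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in collection:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


events = [
    {"title": "CodeCraft Meeting", "category": "social"},
    {"title": "CodeCraft Meeting 2", "category": "social"},
    {"title": "Hike", "category": "sport"},
]

out_collection = map_reduce(events, map_func, reduce_func)
print(out_collection)
```

The result plays the role of the output collection: one entry per key with the reduced value, in this case the number of events per category.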
Language Drivers
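• Different language drivers for MongoDB provide different levels of abstraction
– PyMongo (similar to the mongo shell) vs MongoEngine (ORM-like)
from datetime import datetime
from mongoengine import (DateTimeField, Document, EmbeddedDocument,
                         EmbeddedDocumentField, ListField, StringField)

class Comment(EmbeddedDocument):  # assumed shape of an embedded comment
    author = StringField()
    comment = StringField()

class Event(Document):
    title = StringField()
    description = StringField()
    start_date = DateTimeField(default=datetime.now)
    category = StringField()
    comments = ListField(EmbeddedDocumentField(Comment))

# querying
event = Event.objects(title="Title")
# adding comments (comment is a Comment instance)
event.update(push__comments=comment)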
Thank You!
Questions?