MongoDB and Indexes Doug Duncan [email protected] @dugdun

MongoDB and Indexes - MUG Denver - 20160329


Page 1: MongoDB and Indexes - MUG Denver - 20160329

MongoDB and Indexes

Doug Duncan

[email protected] @dugdun

What we’ll cover

• What are indexes?
    • Types
    • Properties
• Why use indexes?
• How to create indexes.
• Commands to check indexes and plans.

What are indexes?

What are indexes?

Indexes are special data structures that store a subset of your data in an easily traversable format.

MongoDB stores indexes in a B-tree format, which allows for efficient access to the index content.

Proper index use helps a system run optimally. Improper index use can bring a system to a grinding halt.

What are indexes?

If there were an index on Origin, its entries would be stored in a format similar to the following:

[ABE] -> 0xa193b48c
[ABE] -> 0x8e8b242a
[ABE] -> 0x0928cdc1
…
[DEN] -> 0x24aa4ecd
[DEN] -> 0x87396a3c
[DEN] -> 0x9392ab2f
…
[LAX] -> 0x89ccede0
…

Types of indexes

• _id
• Simple
• Compound
• Multikey
• Full-Text
• Geo-spatial
• Hashed

The _id index

• The _id index is automatically created and cannot be removed.
• This is similar to a primary key in a traditional RDBMS.
• The default value is a 12-byte ObjectId:
    • 4-byte time stamp
    • 3-byte machine id
    • 2-byte process id
    • 3-byte counter
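As a rough illustration of the layout above, the following sketch splits a 24-character ObjectId hex string into the four parts listed on the slide. The helper name `decodeObjectId` and the sample value are made up for this example, and the code assumes the byte layout given above. Later MongoDB releases describe the middle five bytes differently, so only the timestamp portion is generally safe to rely on.

```javascript
// Hypothetical helper: split an ObjectId hex string using the
// 4 / 3 / 2 / 3 byte layout described on the slide.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  return {
    // 4 bytes: seconds since the Unix epoch
    timestamp: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    // 3 bytes: machine id (kept as hex text)
    machineId: hex.slice(8, 14),
    // 2 bytes: process id
    processId: parseInt(hex.slice(14, 18), 16),
    // 3 bytes: counter
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Made-up sample value, not taken from a real collection.
const parts = decodeObjectId("56fa1b2c0123456789abcdef");
console.log(parts.timestamp.toISOString(), parts.counter);
```

Because the timestamp comes first, ObjectId values created later generally sort after earlier ones, which is one reason the default _id works well as an index key.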

Simple index

• A simple index is an index on a single key.
• This is similar to a book’s index, where you look up a word to find the pages it’s referenced on.

Compound index

• A compound index is created over two or more fields in a document

• This is similar to a phone book where you can find the phone number of a person given their first and last names.

Multikey index

• A multikey index is an index that’s created on a field that contains an array.
• If used in a compound index, only a single field in a given document can be an array.
• You will get one entry in the index for every item in the array for the given document. This means if you have an array with 100 items, that document will have 100 index entries.
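The "one entry per array item" rule can be pictured with a small model. This is only a sketch in plain JavaScript, not how the server stores anything: the function `multikeyEntries` and the sample document are hypothetical.

```javascript
// Hypothetical model of a multikey index: emit one (key, docId)
// entry for every element of the indexed array field.
function multikeyEntries(doc, field) {
  const value = doc[field];
  // A non-array value yields a single entry, as in a simple index.
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, docId: doc._id }));
}

// Sample document with a 100-item array (made up for illustration).
const doc = {
  _id: 1,
  tags: Array.from({ length: 100 }, (_, i) => "tag" + i),
};
const entries = multikeyEntries(doc, "tags");
console.log(entries.length);
```

The practical consequence is index size: large arrays multiply the number of entries that must be stored and maintained on every write.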

Full-text index

• This is an index over a text based field, similar to how Google indexes web pages.

Geo-spatial index

• A geo-spatial index will allow you to determine distance from a given point.

• Works on both planar and spherical geometries.

Hashed indexes

• A hashed index is used in hash-based sharding, and allows for a more randomized distribution.
• Hashed indexes cannot contain compound keys or be unique.
• The same key can have both a hashed and a non-hashed index. The non-hashed index will allow for range-based queries.

Index properties

• Unique
• Sparse
• TTL
• Partial (new in 3.2)

Unique

• The unique property allows each value of the indexed field, or each combination of values for a compound index, to appear only once.
    db.collection.createIndex({"email": 1}, {"unique": true})
• With a unique index, only one document in the collection can have a null or missing value for the indexed field.

Sparse

• The sparse property allows you to index only documents that contain a value for the given field.
    db.collection.createIndex({"kids": 1}, {"sparse": true})
• A sparse index will not be used if it would result in an incomplete result set, unless specifically hinted.
    db.collection.find({"kids": {"$gte": 5}})

TTL

• The TTL property allows for the automatic removal of documents after a given time period.
    db.collection.createIndex({"accessTime": 1}, {"expireAfterSeconds": 1200})
• The indexed field should contain a date value such as ISODate(). If any other type is used the document will not be removed.
• The TTL removal process runs once every 60 seconds, so you might still see a document shortly after its time has expired.
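The expiry rule above can be sketched as a small predicate. This is a simplified model in plain JavaScript, assuming the behaviour described on the slide; the function name `isExpired` and the sample documents are hypothetical, and the real removal is done by a background task inside the server.

```javascript
// Simplified model of the TTL rule: a document is eligible for removal
// once expireAfterSeconds have passed since the date in the indexed field.
function isExpired(doc, field, expireAfterSeconds, now = new Date()) {
  const value = doc[field];
  // Values that are not dates never expire in this model.
  if (!(value instanceof Date)) return false;
  return now.getTime() >= value.getTime() + expireAfterSeconds * 1000;
}

// Made-up sample data: "now" is 30 minutes after the access time.
const now = new Date("2016-03-29T18:30:00Z");
const session = { accessTime: new Date("2016-03-29T18:00:00Z") };
const wrongType = { accessTime: "2016-03-29T18:00:00Z" }; // a string, not a date

console.log(isExpired(session, "accessTime", 1200, now));
console.log(isExpired(wrongType, "accessTime", 1200, now));
```

The second document shows why the field type matters: a date stored as a string looks right to a human but is never picked up for removal.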

Partial

• The partial property allows you to index a subset of your data.
    db.collection.createIndex({"movie": 1, "reviews": 1},
                              {"partialFilterExpression": {"rating": {"$gte": 4}}})
• The index will not be used if it would provide an incomplete result set (similar to the sparse index).
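To make the "incomplete result set" point concrete, here is a minimal sketch that models a partial index with a rating >= 4 filter. It only handles this one kind of filter, and the names `PARTIAL_MIN_RATING`, `isIndexed` and `canUsePartialIndex` are invented for the example rather than taken from MongoDB.

```javascript
// Mirrors a filter of {"rating": {"$gte": 4}} (assumption for this sketch).
const PARTIAL_MIN_RATING = 4;

// Would this document get an entry in the partial index?
function isIndexed(doc) {
  return typeof doc.rating === "number" && doc.rating >= PARTIAL_MIN_RATING;
}

// A query on rating >= queryMin can rely on the index only if every
// document it could match is guaranteed to be in the index.
function canUsePartialIndex(queryMin) {
  return queryMin >= PARTIAL_MIN_RATING;
}

console.log(isIndexed({ movie: "A", rating: 5 }));
console.log(isIndexed({ movie: "B", rating: 2 }));
console.log(canUsePartialIndex(5));
console.log(canUsePartialIndex(3));
```

A query asking for ratings of 3 and above could match documents that were never indexed, which is why the index has to be skipped for it.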

Why use indexes?

Why use indexes?

• Efficiently retrieving document matches
    • Equality matching
    • Inequality or range matching
• Sorting
• Lack of a usable index will cause MongoDB to scan the entire collection.

How to create indexes.

Before creating indexes

• Think about the queries you will be running and try to create as few indexes as possible to support those queries. Similar query patterns could use the same (or very similar) indexes.
• Think about the data that you will query and put your highly selective fields first in the index if possible.
• Check your current indexes before creating new ones. MongoDB will allow you to create indexes with the same fields in different orders.

Simple indexes

• When creating a simple index, the sort order of the values, ascending (1) or descending (-1), doesn’t matter much, because MongoDB can walk the index forwards and backwards.
• Simple index creation:
    db.flights.createIndex({"Origin": 1})

Compound indexes

• When creating a compound index, the sort order of the values, ascending (1) or descending (-1), starts to matter, especially if the index is used to sort on multiple keys.
• When creating compound indexes you want to add keys to the index in the following order:
    • Equality matches
    • Sort fields
    • Inequality matches
• A compound index will also help any queries that use a leftmost subset (prefix) of its keys.

Compound indexes

• Compound index creation:
    db.flights.createIndex({"Origin": 1, "Dest": 1, "FlightDate": -1})
• Queries supported:
    db.flights.find({"Origin": "DEN"})
    db.flights.find({"Origin": "DEN", "Dest": "JFK"})
    db.flights.find({"Origin": "DEN", "Dest": "JFK"}).sort({"FlightDate": -1})
    db.flights.find({"Origin": "DEN", "Dest": "JFK"}).sort({"FlightDate": 1})
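The leftmost-prefix idea behind the supported queries can be checked with a few lines of plain JavaScript. This is a sketch only: `isLeftmostPrefix` is a hypothetical helper, and it tests the strict case where the queried fields are exactly a prefix of the index keys. A query on other field combinations may still use part of the index.

```javascript
// Hypothetical helper: do the queried fields form a leftmost prefix
// of the compound index keys (in index order)?
function isLeftmostPrefix(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  let matched = 0;
  for (const key of indexKeys) {
    if (!wanted.has(key)) break; // the prefix ends at the first gap
    matched++;
  }
  return matched > 0 && matched === wanted.size;
}

const flightIndex = ["Origin", "Dest", "FlightDate"];

console.log(isLeftmostPrefix(flightIndex, ["Origin"]));
console.log(isLeftmostPrefix(flightIndex, ["Origin", "Dest"]));
console.log(isLeftmostPrefix(flightIndex, ["Dest"]));
```

A query on Dest alone does not start at the left of the index, so it cannot use the index the way the queries above do.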

Compound indexes

• An index created as follows:
    db.flights.createIndex({"Origin": 1, "Dest": -1})
• could be used with either of the following queries, since MongoDB can walk the index in either direction:
    db.flights.find().sort({"Origin": 1, "Dest": -1})
    db.flights.find().sort({"Origin": -1, "Dest": 1})
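The rule behind these two sorts is that a sort can be served by the index when its directions either all match the index or are all reversed. A minimal sketch of that check follows; `indexSupportsSort` is a hypothetical helper written for this example, not a MongoDB function.

```javascript
// Hypothetical helper: can an index with the given key spec return
// documents in the requested sort order by walking forwards or backwards?
function indexSupportsSort(indexSpec, sortSpec) {
  const idx = Object.entries(indexSpec);
  const sort = Object.entries(sortSpec);
  if (sort.length === 0 || sort.length > idx.length) return false;
  // The sort fields must follow the index fields in the same order.
  if (!sort.every(([field], i) => field === idx[i][0])) return false;
  const forward = sort.every(([, dir], i) => dir === idx[i][1]);
  const backward = sort.every(([, dir], i) => dir === -idx[i][1]);
  return forward || backward;
}

const spec = { Origin: 1, Dest: -1 };

console.log(indexSupportsSort(spec, { Origin: 1, Dest: -1 }));
console.log(indexSupportsSort(spec, { Origin: -1, Dest: 1 }));
console.log(indexSupportsSort(spec, { Origin: 1, Dest: 1 }));
```

The last sort mixes one matching and one reversed direction, so neither a forward nor a backward walk of this index produces it.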

Full-text indexes

• Full-text index creation:
    db.messages.createIndex({"body": "text"})
• To search using the index, finding any of the words:
    db.messages.find({"$text": {"$search": "some text"}})
• To search using the index, finding a phrase:
    db.messages.find({"$text": {"$search": "\"some text\""}})

Covering indexes

• Covering indexes are indexes that will answer a query without going back to the data. For example:
    db.flights.createIndex({"Origin": 1, "Dest": 1, "ArrDelay": 1, "UniqueCarrier": 1})
• The following query would be covered, as all fields are in the index:
    db.flights.find({"Origin": "DEN", "Dest": "JFK"},
                    {"UniqueCarrier": 1, "ArrDelay": 1, "_id": 0}).sort({"ArrDelay": -1})
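The covering condition can be approximated with a small check: every filtered field and every returned field must be in the index, and _id must be excluded unless it is indexed too. The sketch below is a simplified model that ignores sort fields, arrays and embedded fields; `isCovered` is a hypothetical name.

```javascript
// Simplified model of "is this query covered by this index?".
function isCovered(indexSpec, filter, projection) {
  const indexed = new Set(Object.keys(indexSpec));
  const filterOk = Object.keys(filter).every((f) => indexed.has(f));
  // Fields explicitly included by the projection.
  const returned = Object.keys(projection).filter((f) => projection[f]);
  const projectionOk = returned.length > 0 && returned.every((f) => indexed.has(f));
  // _id is returned by default, so it must be switched off or indexed.
  const idOk = projection._id === 0 || indexed.has("_id");
  return filterOk && projectionOk && idOk;
}

const covering = { Origin: 1, Dest: 1, ArrDelay: 1, UniqueCarrier: 1 };
const filter = { Origin: "DEN", Dest: "JFK" };

console.log(isCovered(covering, filter, { UniqueCarrier: 1, ArrDelay: 1, _id: 0 }));
console.log(isCovered(covering, filter, { UniqueCarrier: 1, ArrDelay: 1 }));
```

The second call leaves _id in the result, which forces a trip back to the documents and means the query is no longer covered.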

Indexing nested fields/documents

• Let’s say you have documents with nested documents in them like the following:

db.locations.findOne()
{
    "_id": ObjectId(…),
    …,
    "location": {
        "state": "Colorado",
        "city": "Lyons"
    }
}

Indexing nested fields/documents

• You can index on embedded fields by using dot notation:

db.locations.createIndex({"location.state": 1})

Indexing nested fields/documents

• You can also index embedded documents:
    db.locations.createIndex({"location": 1})
• If you do this, the query must match the document exactly (keys in the same order). That means that this will return the document:
    db.locations.find({"location": {"state": "Colorado", "city": "Lyons"}})
• But this won’t:
    db.locations.find({"location": {"city": "Lyons", "state": "Colorado"}})
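The key-order requirement can be illustrated with an order-sensitive comparison in plain JavaScript. This is a simplified model for flat documents with scalar values only, and `embeddedDocEquals` is a hypothetical helper name.

```javascript
// Simplified model of an exact match on an embedded document:
// same keys, same values, and the same key order.
function embeddedDocEquals(a, b) {
  const ea = Object.entries(a);
  const eb = Object.entries(b);
  return (
    ea.length === eb.length &&
    ea.every(([key, value], i) => key === eb[i][0] && value === eb[i][1])
  );
}

const stored = { state: "Colorado", city: "Lyons" };

console.log(embeddedDocEquals(stored, { state: "Colorado", city: "Lyons" }));
console.log(embeddedDocEquals(stored, { city: "Lyons", state: "Colorado" }));
```

If queries cannot guarantee the key order, indexing the individual fields with dot notation is usually the more forgiving choice.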

Index Intersection

• Index intersection is when MongoDB uses two or more indexes to satisfy a query.
• Given the following two indexes:
    db.orders.createIndex({"qty": 1})
    db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following could use both indexes in parallel, with the results being merged together to satisfy the query:
    db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})
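Conceptually, merging the results of two index scans means keeping only the documents that both scans found. The sketch below shows that idea with a two-pointer intersection over sorted lists of record ids. It is an illustration of the concept, not MongoDB's actual implementation, and the id lists are made up.

```javascript
// Intersect two ascending lists of record ids with a two-pointer walk.
function intersectSorted(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]); // found by both index scans
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return out;
}

// Hypothetical record ids returned by each single-field index scan.
const fromItemIndex = [2, 5, 8, 13];    // item = "ABC123"
const fromQtyIndex = [1, 5, 9, 13, 21]; // qty >= 15

const both = intersectSorted(fromItemIndex, fromQtyIndex);
console.log(both); // [ 5, 13 ]
```

Only the ids present in both lists survive, which is the set of documents that satisfies both conditions of the query.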

Indexing arrays

• You can index fields that contain arrays as well.
• A compound index, however, can have only one array-valued field in any given document. If a document has two indexed fields that are arrays, you will get an error.

db.arrtest.createIndex({"a": 1, "b": 1})

db.arrtest.insert({"b": [1,2,3], "a": [1,2,3]})

cannot index parallel arrays [b] [a]

WriteResult({

"nInserted": 0,

"writeError": {

"code": 10088,

"errmsg": "cannot index parallel arrays [b] [a]"

}

})

Index Intersection

• Index intersection is when MongoDB uses two or more indexes to satisfy a query.
• Given the following two indexes:
    db.orders.createIndex({"qty": 1})
    db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following could, in theory, use both indexes in parallel, with the results being merged together to satisfy the query:
    db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})

Removing indexes

• The command to remove an index is similar to the one used to create it:
    db.flights.dropIndex({"Origin": 1, "Dest": -1})

Commands to check indexes and index usage

View all indexes in a database

• To view all indexes in a database use the following command (direct access to system.indexes is deprecated as of MongoDB 3.0):
    db.system.indexes.find()
• For each index you’ll see the fields the index was created with, the name of the index and the namespace (db.collection) that the index was built on.

View indexes for a given collection

• To view all indexes for a given collection use the following command:
    db.collection.getIndexes()

• This returns the same information as the previous command, but is limited to the given collection.

View index sizes

• To view the size of all indexes in a collection:
    db.collection.stats()
• You will see the size of all indexes and the size of each individual index in the results. The sizes are in bytes.

How to see if an index is used

• If you want to see if an index is used, append the .explain() method to your query:
    db.flights.find({"Origin": "DEN"}).explain()
• explain() has three levels of verbosity:
    • queryPlanner - this is the default, and it returns the winning query plan
    • executionStats - adds execution stats for the plan
    • allPlansExecution - adds stats for the other candidate plans
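When reading explain output, the quickest check is whether the winning plan contains an index scan (IXSCAN) or a full collection scan (COLLSCAN). The sketch below walks a plan tree and collects its stage names. The sample object is hand-written to resemble explain output and was not captured from a real server; `planStages` and the index name in it are hypothetical.

```javascript
// Collect the stage names of a winning plan, outermost stage first.
function planStages(plan) {
  const stages = [plan.stage];
  // A stage has either one child (inputStage) or several (inputStages).
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  for (const child of children) {
    stages.push(...planStages(child));
  }
  return stages;
}

// Hand-written sample shaped like explain() output (illustrative only).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "Origin_1" },
    },
  },
};

const stages = planStages(sampleExplain.queryPlanner.winningPlan);
const usesIndex = stages.includes("IXSCAN");
console.log(stages, usesIndex);
```

A plan whose only stage is COLLSCAN means no usable index was found and the whole collection is being scanned.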

Notes on indexes.

• When creating an index you need to know your data and the queries that will run against it.
• Don’t build indexes in isolation!
• While indexes can improve performance, be careful not to over-index, as every index gets updated every time you write to the collection.

Q & A

End Notes

• User group discounts
    • Manning publications: www.manning.com
        • Code ‘ug367’ to save 36% off order
    • APress publications: www.appress.com
        • Code ‘UserGroup’ to save 10% off order
    • O’Reilly publications: www.oreilly.com
        • Still waiting to get information

End Notes

• Communication
    • Twitter: @MUGDenver and #MUGDenver
    • Email: [email protected]
    • Slack: ???

End Notes

• MongoDB World
    • When: June 28th and 29th
    • Where: NYC
    • Save 25% by using code ‘DDuncan’