Indexing Strategies to Help You Scale

Dmitry Baev, Senior Solutions Architect, MongoDB

DESCRIPTION

Learn all about Indexing Strategies for MongoDB.

Page 1: Indexing Strategies to Help You Scale

Indexing Strategies to Help You Scale

Senior Solutions Architect, MongoDB

Dmitry Baev

Page 2: Indexing Strategies to Help You Scale

Agenda

• What are indexes?

• Indexing Basics

• Evaluation / Tuning

• Geospatial

• Text Search

• Scaling

Page 3: Indexing Strategies to Help You Scale

What Are Indexes?

Page 4: Indexing Strategies to Help You Scale

What Are Indexes?

Imagine you're looking for a recipe in a cookbook ordered by recipe name. Looking up a recipe by name is quick and easy.

Page 5: Indexing Strategies to Help You Scale

Consult the Index

Page 6: Indexing Strategies to Help You Scale

Linked List

Page 7: Indexing Strategies to Help You Scale

Finding 7 in a Linked List

Page 8: Indexing Strategies to Help You Scale

Finding 7 In a Tree

Page 9: Indexing Strategies to Help You Scale

Indexes in MongoDB are B-Trees

Page 10: Indexing Strategies to Help You Scale

Queries, inserts and deletes: O(log(n)) time
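
The difference between the linked-list and tree slides can be sketched by counting comparisons. This is a minimal illustration only: a binary search over a sorted array stands in for a balanced tree, it is not MongoDB's B-tree implementation, and the function names and data are made up for the example.

```javascript
// Count comparisons when walking a list from the front (linked-list style).
function linearSearchSteps(sortedValues, target) {
  let steps = 0;
  for (const value of sortedValues) {
    steps += 1;
    if (value === target) return steps;
  }
  return steps; // target absent: every element was inspected
}

// Count comparisons when each step discards half of the remaining values,
// which is how a balanced search tree narrows down a lookup.
function treeSearchSteps(sortedValues, target) {
  let low = 0;
  let high = sortedValues.length - 1;
  let steps = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    steps += 1;
    if (sortedValues[mid] === target) return steps;
    if (sortedValues[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return steps; // target absent
}

// Hypothetical data set: the integers 1..1,000,000.
const values = Array.from({ length: 1000000 }, (_, i) => i + 1);
console.log(linearSearchSteps(values, 1000000)); // one step per element
console.log(treeSearchSteps(values, 1000000));   // grows with log2(n)
```

The list walk grows linearly with the number of values, while the halving search grows logarithmically, which is the O(log(n)) behaviour quoted above.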

Page 11: Indexing Strategies to Help You Scale

Indexes are the single biggest tunable performance factor in MongoDB

Page 12: Indexing Strategies to Help You Scale

Indexing Basics

Page 13: Indexing Strategies to Help You Scale

Indexing Basics

• Single biggest tunable performance factor in the DB.
  – Index efficiency should be reviewed early
  – Avoid duplicates

// index on author (ascending)
> db.articles.ensureIndex( { author : 1 } )

// index on author (descending)
> db.articles.ensureIndex( { author : -1 } )

// index on arrays of values – multi key index.
> db.articles.ensureIndex( { tags : 1 } )

Page 14: Indexing Strategies to Help You Scale

Sub-document indexes

• Index on sub-documents
  – Using dot notation

{
  '_id' : ObjectId(..),
  'article_id' : ObjectId(..),
  'section' : 'schema',
  'date' : ISODate(..),
  'daily' : { 'views' : 45, 'comments' : 150 },
  'hours' : {
    0 : { 'views' : 10 },
    1 : { 'views' : 2 },
    …
    23 : { 'views' : 14, 'comments' : 10 }
  }
}

> db.interactions.ensureIndex( { "daily.comments" : 1 } )

> db.interactions.find(
    { "daily.comments" : { $gte : 150 } },
    { _id : 0, "daily.comments" : 1 } )

Page 15: Indexing Strategies to Help You Scale

Compound indexes

• Indexes that use multiple values

// compound index on author, then tags
> db.articles.ensureIndex( { author : 1, tags : 1 } )

// the compound index supports both of these queries
> db.articles.find( { author : 'Joe D', tags : 'MongoDB' } )
// and
> db.articles.find( { author : 'Joe D' } )

// you don't need this
> db.articles.ensureIndex( { author : 1 } )
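
Why the single-field index on author is redundant can be sketched with a prefix check. This is a simplified model, not the query planner: it only asks whether the queried fields form a leading prefix of the index keys, and the function name is hypothetical.

```javascript
// Returns true when the queried fields are exactly a leading prefix of the
// index keys. Simplification: ignores sorts, ranges and index intersection.
function queryUsesIndexPrefix(indexSpec, queryFields) {
  const indexKeys = Object.keys(indexSpec);
  if (queryFields.length === 0 || queryFields.length > indexKeys.length) {
    return false;
  }
  const prefix = new Set(indexKeys.slice(0, queryFields.length));
  return queryFields.every((field) => prefix.has(field));
}

const authorTags = { author: 1, tags: 1 };
console.log(queryUsesIndexPrefix(authorTags, ['author', 'tags'])); // true
console.log(queryUsesIndexPrefix(authorTags, ['author']));         // true
console.log(queryUsesIndexPrefix(authorTags, ['tags']));           // false
```

A query on tags alone is not a prefix of { author, tags }, so it would still need its own index.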

Page 16: Indexing Strategies to Help You Scale

Sort order

• Sort order doesn't matter on single-field indexes
  – We can read from either side of the btree
  – { attribute: 1 } or { attribute: -1 }

• Sort order matters on compound indexes
  – We'll want to query on author and sort by date in the application

// index on author ascending but date descending
> db.articles.ensureIndex( { 'author' : 1, 'date' : -1 } )
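
The compound-index rule can be sketched as follows: a multi-field sort can be served from the index when it matches the index key directions one by one, or is the exact inverse (the index is walked backwards). This is an illustrative model with a hypothetical function name; it does not model equality predicates on leading fields.

```javascript
// True when sortSpec follows the leading index keys in order and every
// direction either matches the index or is its exact inverse.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.keys(indexSpec);
  const sortKeys = Object.keys(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;
  // Sort keys must appear in the same order as the leading index keys.
  if (!sortKeys.every((key, i) => key === indexKeys[i])) return false;
  const forward = sortKeys.every((key) => sortSpec[key] === indexSpec[key]);
  const backward = sortKeys.every((key) => sortSpec[key] === -indexSpec[key]);
  return forward || backward;
}

const authorDate = { author: 1, date: -1 };
console.log(indexSupportsSort(authorDate, { author: 1, date: -1 })); // true
console.log(indexSupportsSort(authorDate, { author: -1, date: 1 })); // true
console.log(indexSupportsSort(authorDate, { author: 1, date: 1 }));  // false
```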

Page 17: Indexing Strategies to Help You Scale

Covered or Index-only Queries

• Returns data from the index
  – Rather than the database files
  – Performance optimization
  – Works with compound indexes

• Invoke with a projection

> db.users.ensureIndex( { user : 1, password : 1 } )

> db.users.find( { user : "joe" }, { _id : 0, password : 1 } )

Tip: use projections anyway to reduce data sent back to the client
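
The conditions for a covered query can be written down as a small check. This is a sketch with a hypothetical helper, not a MongoDB API: it assumes an inclusion-style projection and ignores multikey indexes.

```javascript
// A query is treated as covered when every queried field and every returned
// field is part of the index, and _id is excluded (or is itself indexed).
function isCoveredQuery(indexSpec, queryFields, projection) {
  const indexKeys = new Set(Object.keys(indexSpec));
  const queryCovered = queryFields.every((field) => indexKeys.has(field));
  // _id is returned by default, so it has to be switched off explicitly.
  const idExcluded = projection._id === 0 || indexKeys.has('_id');
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  const projectionCovered =
    returned.length > 0 && returned.every((field) => indexKeys.has(field));
  return queryCovered && idExcluded && projectionCovered;
}

const userPassword = { user: 1, password: 1 };
console.log(isCoveredQuery(userPassword, ['user'], { _id: 0, password: 1 })); // true
console.log(isCoveredQuery(userPassword, ['user'], { password: 1 }));         // false
```

Leaving out `_id: 0` is the usual reason a query that looks covered still has to read the documents.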

Page 18: Indexing Strategies to Help You Scale

Options

• Uniqueness constraints (unique, dropDups)

• Sparse Indexes

// index on author must be unique
> db.articles.ensureIndex( { 'author' : 1 }, { unique : true } )

// allow multiple documents to not have likes field
> db.articles.ensureIndex( { 'author' : 1, 'likes' : 1 }, { sparse : true } )

* Missing fields are stored as null(s) in the index
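
Which documents end up in a sparse index can be sketched as below. The assumption here, stated loudly, is that a sparse compound index of plain ascending/descending keys indexes a document as long as it contains at least one of the indexed fields. The helper name is hypothetical and this is a model, not server code.

```javascript
// Decide whether a document would get an entry in the index.
function hasIndexEntry(doc, indexSpec, options = {}) {
  if (!options.sparse) return true; // non-sparse: every document is indexed
  // Sparse: skip documents that contain none of the indexed fields.
  return Object.keys(indexSpec).some((field) => field in doc);
}

const authorLikes = { author: 1, likes: 1 };
console.log(hasIndexEntry({ author: 'Joe D' }, authorLikes, { sparse: true })); // true
console.log(hasIndexEntry({ title: 'Draft' }, authorLikes, { sparse: true }));  // false
console.log(hasIndexEntry({ title: 'Draft' }, authorLikes));                    // true
```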

Page 19: Indexing Strategies to Help You Scale

Background Index Builds

• Index creation is a blocking operation that can take a long time

• Background creation yields to other operations

• Build more than one index in background concurrently

• Restart secondaries in standalone mode to build indexes

// To build in the background
> db.articles.ensureIndex(
    { 'author' : 1, 'date' : -1 },
    { background : true }
  )

Page 20: Indexing Strategies to Help You Scale

Explain plan

• Use to evaluate operations and indexes
  – Which indexes have been used, if any
  – How many documents / objects have been scanned
  – View via the console or via code

// To view via the console
> db.articles.find( { author : 'Joe D' } ).explain()

Page 21: Indexing Strategies to Help You Scale

Explain plan output (no index)

{
  "cursor" : "BasicCursor",
  "isMultiKey" : false,
  "n" : 12,
  "nscannedObjects" : 25820,
  "nscanned" : 25820,
  …
  "indexOnly" : false,
  …
  "millis" : 27,
  …
}

Other Types:

• BasicCursor
  – Full collection scan
• BtreeCursor
• GeoSearchCursor
• Complex Plan
• TextCursor

Page 22: Indexing Strategies to Help You Scale

Explain plan output

{
  "cursor" : "BtreeCursor author_1_date_-1",
  "isMultiKey" : false,
  "n" : 12,
  "nscannedObjects" : 12,
  "nscanned" : 12,
  …
  "indexOnly" : false,
  …
  "millis" : 0,
  …
}

Other Types:

• BasicCursor
  – Full collection scan
• BtreeCursor
• GeoSearchCursor
• Complex Plan
• TextCursor
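
One way to read the two explain outputs side by side is to compare how many index entries or documents were scanned with how many were returned. The sketch below only uses the fields shown on the slides; the helper name and the summary shape are assumptions made for illustration.

```javascript
// Summarise the explain fields shown on the slides.
function summarizeExplain(plan) {
  return {
    collectionScan: plan.cursor === 'BasicCursor',
    // Entries scanned per document returned; 1 means nothing was wasted.
    scanRatio: plan.n === 0 ? Infinity : plan.nscanned / plan.n,
    covered: plan.indexOnly === true,
  };
}

const withoutIndex = {
  cursor: 'BasicCursor', n: 12, nscannedObjects: 25820, nscanned: 25820,
  indexOnly: false, millis: 27,
};
const withIndex = {
  cursor: 'BtreeCursor author_1_date_-1', n: 12, nscannedObjects: 12,
  nscanned: 12, indexOnly: false, millis: 0,
};

console.log(summarizeExplain(withoutIndex)); // collection scan, high ratio
console.log(summarizeExplain(withIndex));    // index scan, ratio of 1
```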

Page 23: Indexing Strategies to Help You Scale

Database profiler

• Enable to see slow queries
  – (or all queries)
  – Default 100ms

// Enable database profiler on the console, 0=off 1=slow 2=all
> db.setProfilingLevel(1, 100)
{ "was" : 0, "slowms" : 100, "ok" : 1 }

// View profile with
> show profile

// or
> db.system.profile.find().pretty()

Page 24: Indexing Strategies to Help You Scale

The Query Optimizer

• For each "type" of query, MongoDB periodically tries all useful indexes

• Aborts the rest as soon as one plan wins

• The winning plan is temporarily cached for each “type” of query (used for next 1,000 times)

• MongoDB 2.6 can use the intersection of multiple indexes to fulfill queries

Page 25: Indexing Strategies to Help You Scale

Other Index Types

• Geospatial Indexes (2dsphere)

• Text Indexes

• TTL Collections (expireAfterSeconds)

• Hashed Indexes for sharding

Page 26: Indexing Strategies to Help You Scale

Geospatial Indexes

Page 27: Indexing Strategies to Help You Scale

2dSphere

• Indexes on geospatial fields
  – Using GeoJSON objects
  – Geometries on spheres

// GeoJSON object structure for indexing (coordinates are longitude, latitude)
{
  name : 'MongoDB Palo Alto',
  location : { type : "Point", coordinates : [ -122.158574, 37.449157 ] }
}

// Index on GeoJSON objects
> db.articles.ensureIndex( { location : "2dsphere" } )

Supported GeoJSON objects:

• Point
• LineString
• Polygon
• MultiPoint
• MultiLineString
• MultiPolygon
• GeometryCollection

Page 28: Indexing Strategies to Help You Scale

Extended Articles document

• Store the location the article was posted from

• Geo location from browser

// Articles collection
> db.articles.insert({
    'text' : 'Article content…',
    'date' : ISODate(...),
    'title' : 'Intro to MongoDB',
    'author' : 'Joe D',
    'tags' : ['mongodb', 'database', 'nosql'],
    'location' : {
      'type' : 'Point',
      'coordinates' : [-122.158, 37.449]  // GeoJSON order: longitude, latitude
    }
  });

// JavaScript function to get geolocation; pass it a success callback
navigator.geolocation.getCurrentPosition(callback);

// You will need to translate into GeoJSON
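
The translation step might look like the sketch below. It assumes the browser's position object exposes `coords.latitude` and `coords.longitude`; the helper name `toGeoJSONPoint` is hypothetical. Note that GeoJSON puts longitude first.

```javascript
// Convert a browser geolocation position into a GeoJSON Point.
function toGeoJSONPoint(position) {
  const { latitude, longitude } = position.coords;
  // GeoJSON coordinate order is [longitude, latitude].
  return { type: 'Point', coordinates: [longitude, latitude] };
}

// Stand-in for the object a browser would pass to the success callback.
const samplePosition = { coords: { latitude: 37.449, longitude: -122.158 } };
console.log(toGeoJSONPoint(samplePosition));

// In a browser (not run here):
// navigator.geolocation.getCurrentPosition((position) => {
//   const location = toGeoJSONPoint(position);
//   // attach `location` to the article before inserting it
// });
```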

Page 29: Indexing Strategies to Help You Scale

Example

– Query for locations 'near' a particular coordinate

> db.articles.find( {
    location : {
      $near : {
        $geometry : { type : "Point", coordinates : [-122.158, 37.449] },
        $maxDistance : 5000
      }
    }
  } )

Page 30: Indexing Strategies to Help You Scale

Text Search

Page 31: Indexing Strategies to Help You Scale

Text Indexes

• Use text indexes to support text search of string content in documents of a collection.

• Text indexes can include any field whose value is a string or an array of string elements.

• To perform queries that access the text index, use the $text query operator.

Page 32: Indexing Strategies to Help You Scale

Text Search

• Only one text index per collection

• "$**" wildcard specifier to index all text fields in the collection

• Use weights to change importance of fields

> db.articles.ensureIndex( { title : "text", content : "text" } )

> db.articles.ensureIndex( { "$**" : "text" }, { name : "MyTextIndex" } )

> db.articles.ensureIndex( { "$**" : "text" },
    { weights : { "title" : 10, "content" : 5 }, name : "MyTextIndex" } )

Operators: $text, $search, $language, $meta

Page 33: Indexing Strategies to Help You Scale

Search

• Use the $text and $search operators to query

• Now returns a cursor

• $meta for scoring results

// Search articles collection
> db.articles.find( { $text : { $search : "MongoDB" } } )

> db.articles.find(
    { $text : { $search : "MongoDB" } },
    { score : { $meta : "textScore" }, _id : 0, title : 1 } )

{ "title" : "Intro to MongoDB", "score" : 0.75 }

Page 34: Indexing Strategies to Help You Scale

Scaling

Page 35: Indexing Strategies to Help You Scale

Working Set Exceeds Physical Memory

Page 36: Indexing Strategies to Help You Scale

When to consider Scaling?

• When a specific resource becomes a bottleneck on a machine or replica set
  – RAM
  – Disk IO
  – Storage
  – Concurrency

Page 37: Indexing Strategies to Help You Scale

Vertical Scalability (Scale Up)

Page 38: Indexing Strategies to Help You Scale

Horizontal Scalability (Scale Out)

Page 39: Indexing Strategies to Help You Scale

Sharding

• User defines shard key

• Shard key defines range of data

• Data is partitioned into shards according to shard key
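
Range-based partitioning can be sketched as a lookup from shard key value to the shard that owns its range. The chunk table, shard names and numeric keys below are invented for illustration; this is not how a cluster stores or balances its metadata.

```javascript
// Hypothetical chunk table: each shard owns shard-key values in [min, max).
const chunks = [
  { shard: 'shard0', min: -Infinity, max: 1000 },
  { shard: 'shard1', min: 1000, max: 2000 },
  { shard: 'shard2', min: 2000, max: Infinity },
];

// Find the shard whose range contains the given shard key value.
function shardFor(chunkTable, key) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  return chunk ? chunk.shard : null;
}

console.log(shardFor(chunks, 42));   // shard0
console.log(shardFor(chunks, 1000)); // shard1
console.log(shardFor(chunks, 5000)); // shard2
```

Because a query that includes the shard key maps to a single range, it can be routed to one shard instead of being sent to all of them.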

Page 40: Indexing Strategies to Help You Scale

Scalability

Auto-Sharding

• Increase capacity as you go

• Commodity and cloud architectures

• Improved operational simplicity and cost visibility

Page 41: Indexing Strategies to Help You Scale

Thank You