MongoDB in Anger (Boston.rb, 2011)


Using MongoDB in Anger

Techniques and Considerations

Kyle Banker, kyle@10gen.com, @hwaet

Four topics:

Schema design

Indexing

Concurrency

Durability

I. Schema design

Document size

Keys are stored in the documents themselves.

For large data sets, you should use small key names.

> doc = { _id: ObjectId("4e94886ebd15f15834ff63c4"),
          username: 'Kyle',
          date_of_birth: new Date(1970, 1, 1),
          site_visits: 1027 }

> Object.bsonsize( doc );
85

> doc = { _id: ObjectId("4e94886ebd15f15834ff63c4"),
          name: 'Kyle',
          dob: new Date(1970, 1, 1),
          v: 1027 }

> Object.bsonsize( doc );
61  // 28% smaller!
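The entire saving above comes from the shorter key names: BSON stores every key name inside every document, so renaming username to name, date_of_birth to dob, and site_visits to v saves 4 + 10 + 10 = 24 bytes per document, exactly the 85 - 61 difference. A quick sanity check in plain JavaScript (the helper function is ours; no MongoDB required):

```javascript
// Bytes saved per document = total reduction in key-name length,
// since BSON embeds every key name in every document it stores.
function keyBytesSaved(renames) {
  return renames.reduce(
    (sum, [oldKey, newKey]) => sum + (oldKey.length - newKey.length),
    0
  );
}

const saved = keyBytesSaved([
  ['username', 'name'],       // 8 -> 4 chars: saves 4 bytes
  ['date_of_birth', 'dob'],   // 13 -> 3 chars: saves 10 bytes
  ['site_visits', 'v'],       // 11 -> 1 char: saves 10 bytes
]);

console.log(saved); // 24, matching the 85 - 61 bsonsize difference
```

Multiply by millions of documents (and by every index that includes these keys) and the saving becomes significant.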

Document growth

Certain schema designs require documents to grow significantly.

This can be expensive.

// Sample: user with followers
{ _id: ObjectId("4e94886ebd15f15834ff63c4"),
  name: 'Kyle',
  followers: [
    { user_id: ObjectId("4e94875fbd15f15834ff63c3"), name: 'arussell' },
    { user_id: ObjectId("4e94875fbd15f15834ff63c4"), name: 'bsmith' }
  ]
}

An initial design:

// Updating with $push will grow the document
new_follower = { user_id: ObjectId("4e94875fbd15f15834ff63c5"),
                 name: 'jcampbell' }
db.users.update({name: 'Kyle'}, {$push: {followers: new_follower}})

Let's break this down...

At first, documents are inserted with no extra space.

But updates that change the size of the documents will alter the padding factor.

Even with a large padding factor, documents that grow unbounded will still eventually have to be moved.
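As a rough intuition for why that is, here's a toy model in plain JavaScript. The byte counts and the padding-factor update rule are invented for illustration (mongod's real allocator differs); the point is simply that unbounded growth keeps forcing relocations no matter how much the padding grows:

```javascript
// Toy model: each document lives in a record sized docSize * paddingFactor.
// When an update outgrows the record, the document must be moved
// (rewritten elsewhere, index entries updated) and the padding factor
// creeps upward, capped at 2.0.
function simulateGrowth(updates) {
  let paddingFactor = 1.0;
  let docSize = 100;                        // starting size in bytes (made up)
  let recordSize = docSize * paddingFactor;
  let moves = 0;
  for (let i = 0; i < updates; i++) {
    docSize += 20;                          // e.g. each $push adds ~20 bytes
    if (docSize > recordSize) {
      moves++;                              // relocation: the expensive part
      paddingFactor = Math.min(2.0, paddingFactor + 0.1);
      recordSize = docSize * paddingFactor;
    }
  }
  return { moves, paddingFactor };
}

const result = simulateGrowth(50);
console.log(result.moves); // several relocations despite growing padding
```

Padding buys longer and longer runs between moves, but with unbounded growth the moves never stop entirely.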

Relocation is expensive:

All index entry pointers must be updated.

The entire document must be rewritten in a new place on disk (possibly not in RAM).

May cause fragmentation. Increases the number of entries in the free list.

A better design:

// User collection
{ _id: ObjectId("4e94886ebd15f15834ff63c4"),
  name: 'Kyle'
}

// Followers collection
{ friend_id: ObjectId("4e94875fbd15f15834ff63c3"), name: 'arussell' },
{ friend_id: ObjectId("4e94875fbd15f15834ff63c4"), name: 'bsmith' }

The upshot?

Rich documents are still useful. They simplify the representation of objects and can increase query performance because of their pre-joined structure.

However, if your documents are going to grow unbounded, it's best to separate them into multiple collections.

Pre-aggregation

Aggregation

Map-reduce and group are adequate, but may not be fast enough for large data sets.

MongoDB 2.2 has a new, fast aggregation framework!

Still, pre-aggregation will be faster than post-aggregation in a lot of cases. For real-time apps, it's almost a necessity.

Example: a counter cache.

// User collection
{ _id: ObjectId("4e94886ebd15f15834ff63c4"),
  name: 'Kyle',
  follower_ct: 4
}

Using the $inc operator:

// This increment is in-place
// (i.e., no rewriting of the document).
db.users.update({name: 'Kyle'}, {$inc: {follower_ct: 1}})

Need a real-world example?

A sophisticated example of pre-aggregation.

{ _id: { uri: BinData("0beec7b5ea3f0fdbc95d0dd47f35"),
         day: '2011-5-1' },
  total: 2820,
  hrs: { 0: 500,
         1: 700,
         2: 450,
         3: 343
         // ... 4-23 go here
  },
  // Minutes are rolling. This gives real-time
  // numbers for the last hour. So when you increment
  // minute n, you need to $set minute n-1 to 0.
  mins: { 1: 12,
          2: 10,
          3: 5,
          4: 34
          // ... 5-60 go here
  }
}
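The rolling-minute bookkeeping can be sketched as a toy in-memory model (plain JavaScript; the function names are ours, and in MongoDB the equivalent would be a single update combining $inc and $set). It assumes hits arrive in time order, so a bucket still holding last hour's count is reset the first time the current minute touches it:

```javascript
// Toy model of the rolling minute buckets: 60 slots, reused each hour.
function makeCounter() {
  return { total: 0, mins: new Array(60).fill(0), lastMinute: null };
}

function recordHit(c, minuteOfHour) {
  if (c.lastMinute !== minuteOfHour) {
    c.mins[minuteOfHour] = 0;   // bucket holds data from an hour ago: reset
    c.lastMinute = minuteOfHour;
  }
  c.mins[minuteOfHour] += 1;    // in MongoDB: $inc on "mins.<m>"
  c.total += 1;
}

const c = makeCounter();
recordHit(c, 5);
recordHit(c, 5);
recordHit(c, 6);
console.log(c.mins[5], c.mins[6], c.total); // 2 1 3
```

Summing the 60 buckets at any moment gives a real-time count for (roughly) the last hour, with no aggregation query at read time.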

Schema design summary

Think hard about the size of your documents. Optimize keys and data types (not discussed).

If your documents are growing unbounded, you may have the wrong schema design.

Consider operations that rewrite documents (and individual values) in-place. $inc and (sometimes) $set are great examples of this.

II. Indexing

It's all about efficiency:

Fundamental, but widely misunderstood.

The right indexes give you the most efficient use of your hardware (RAM, disk, and CPU).

The wrong indexes, or no indexes at all, make trivial workloads impossible to run, even on high-end hardware.

The Basics

Every query should use an index. Use the MongoDB log or the query profiler to identify queries not using an index. The value of nscanned should be low.
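To see why nscanned matters, here's a toy comparison in plain JavaScript (our own simulation, not MongoDB code): an unindexed query examines every document, while a lookup against a sorted "index" examines only about log2(n) entries:

```javascript
// Without an index: a collection scan touches documents one by one
// until it finds the target (worst case: all of them).
function collectionScan(docs, target) {
  let nscanned = 0;
  for (const d of docs) {
    nscanned++;
    if (d === target) break;
  }
  return nscanned;
}

// With a sorted "index": binary search touches ~log2(n) entries.
function indexedLookup(sortedDocs, target) {
  let lo = 0, hi = sortedDocs.length - 1, nscanned = 0;
  while (lo <= hi) {
    nscanned++;
    const mid = (lo + hi) >> 1;
    if (sortedDocs[mid] === target) return nscanned;
    if (sortedDocs[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return nscanned;
}

const docs = Array.from({ length: 1000000 }, (_, i) => i);
console.log(collectionScan(docs, 999999)); // 1000000
console.log(indexedLookup(docs, 999999));  // ~20
```

That gap (a million touches versus roughly twenty) is what the log and the profiler are surfacing when nscanned is high.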

Know about compound-key indexes. Know which indexes can be utilized for sorts, ranges, etc. Learn to use explain().

Good resources on indexing: MongoDB in Action and High Performance MySQL.

For the best performance, you should have enough RAM to contain your indexes and working set.

Working set

Working set is the portion of your total data size that's regularly used by the application. For some applications, working set might be 50% of data size. For others, it's close to 100%.

For example, think about Foursquare's checkins database. Because checkins are constantly queried to calculate badges, checkins must live in RAM. So working set on this database is 100%.

Working set (cont.)

On the other end of the spectrum, Craigslist uses MongoDB as a listing archive. This archive is rarely queried. Therefore, it doesn't matter if data size is much larger than RAM, since the working set is small.

Special indexing features...

Sparse indexes

Use a sparse index to reduce index size. A sparse index will include only those documents having the indexed key.

For example, suppose you have 10 million users, of which only 100K are paying subscribers. You can index only those fields relevant to paid subscriptions with a sparse index.

A sparse index:

db.users.ensureIndex({expiration: 1}, {sparse: true})

// All users whose accounts expire next month
db.users.find({expiration: {$lte: new Date(2011, 11, 30),
                            $gte: new Date(2011, 11, 1)}})

Index-only queries

If you only need a few values, you can return those values directly from the index. This eliminates the indirection from index to data files on the server.

Specify the fields you want, and exclude the _id field.

The explain() method will display {indexOnly: true}.

An index-only query:

db.users.ensureIndex({follower_ct: 1, name: 1})

// This will be index-only.
db.users.find({}, {follower_ct: 1, name: 1, _id: 0}).sort({follower_ct: -1})
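As a toy model of what "index-only" means (plain JavaScript, our own simulation): treat the index as a sorted array of [follower_ct, name] pairs and answer the query entirely from it, never touching the documents themselves:

```javascript
// The "index" on {follower_ct: 1, name: 1}: entries kept in key order,
// holding only the indexed values, not whole documents.
const index = [
  [4, 'Kyle'],
  [2, 'arussell'],
  [7, 'bsmith'],
].sort((a, b) => a[0] - b[0]);

// Equivalent of find({}, {follower_ct: 1, name: 1, _id: 0})
//                  .sort({follower_ct: -1})
// Every requested field lives in the index, so the documents (and the
// data files behind them) are never read.
const results = [...index]
  .reverse()
  .map(([follower_ct, name]) => ({ follower_ct, name }));

console.log(results[0]); // { follower_ct: 7, name: 'bsmith' }
```

Because both projected fields and the sort key are in the index, the server can walk the index backward and stream results straight out of it.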

Indexing summary

Learn about indexing.

Ensure that your queries are using the most efficient index.

Investigate sparse indexes and index-only queries for performance-intensive apps.

III. Concurrency

Current implementation:

Concurrency is still somewhat coarse-grained. For any given mongod, there's a server-wide reader-writer lock, with a variety of yielding optimizations.

For example, in MongoDB 2.0, the server won't hold a write lock around a page fault.

On the roadmap are database-level locking, collection-level locking, and extent-based locking.

To avoid concurrency-related bottlenecks:

Separate orthogonal concerns into multiple smaller deployments. For example, one for analytics and another for the rest of the app.

Ensure that your indexes and working set fit in RAM.

Do not attempt to scale reads with secondary nodes unless your application is mostly read-heavy.


IV. Durability

Four topics:

Storage

Journaling

Write concern

Replication

Storage

Each file is mapped to virtual memory.

All writes to data files go to a virtual memory address.

Sync to disk is handled by the OS, with a forced flush every 60 seconds.

[Diagram: data files on disk, mapped into per-process virtual memory, backed by physical memory]

Journaling

Data is written to an append-only log and synced every 100ms.

This imposes a write penalty, especially on slow drives.

If you use journaling, you may want to mount a separate drive for the journal directory.

Enabled by default in MongoDB 2.0.

Replication

Fast, automatic failover.

Simplifies backups.

If you don't want to use journaling, you can use replication instead. Recovery can be trickier, but writes will be faster.

Write concern

A default, fire-and-forget write:

@users.insert( {'name' => 'Kyle'} )

Write with a round trip:

@users.insert( {'name' => 'Kyle'}, :safe => true )

Write to two nodes with a 1000ms timeout:

@users.insert( {'name' => 'Kyle'}, :safe => {:w => 2, :wtimeout => 1000})

Write concern advice:

Use a level of write concern appropriate to the data you're writing.

By default, use {:safe => true}. That is, ensure a single round trip.

For especially sensitive data, use replication acknowledgment.

For analytics, clicks, logging, etc., use fire-and-forget.

Durability in anger

Use replication for durability. You can, optionally, keep a single, passive replica with durability enabled.

Use write concern judiciously.

Topics we didn't cover:

Hardware and deployment practices.

Sharding and schema design at scale.

(Lots of videos on these at 10gen.com!)

Announcements, Questions, and Credits

http://www.flickr.com/photos/foamcow/34055184/

http://www.flickr.com/photos/reedinglessons/2239767394

http://www.flickr.com/photos/edelman/6031599707

http://www.flickr.com/photos/curtisperry/5386879526/

http://www.flickr.com/photos/ryanspalding/4756905846

Thank you
