


Performance analysis of joins in reactive queries
Mitar Milutinović, Amy Pavel

Motivation
NoSQL databases like MongoDB remove relations between documents and leave it to users to resolve relations on their own. Resolving relations on the client side results in round trip delays.

Modern web frameworks like Meteor provide an API to create reactive queries against MongoDB: after the initial query results are returned, the query is kept open and any update to the result set is streamed to the client. But if a query contains joins, constructing reactive queries by hand and resolving relations becomes hard. Moreover, the delays of multiple round trips accumulate.

PeerDB offers a way to embed useful fields from related documents and keep them updated, reducing read time on queries with joins while also allowing easy construction of reactive queries.

MongoDB:

{
  "_id": "k7cgWtxQpPQ3gLgxa",
  "body": "I'm a post about rabbits!",
  "author": { "_id": "frqejWeGWjDTPMj7P" },
  "tags": [
    { "_id": "KMYNwr7TsZvEboXCw" },
    { "_id": "tMgj8mF2zF3gjCftS" }
  ]
}

MongoDB with PeerDB:

{
  "_id": "k7cgWtxQpPQ3gLgxa",
  "body": "I'm a post about rabbits!",
  "author": {
    "_id": "frqejWeGWjDTPMj7P",
    "name": "Henry",
    "picture": "http://img.com/img.jpg"
  },
  "tags": [
    { "_id": "KMYNwr7TsZvEboXCw", "name": "animals" },
    { "_id": "tMgj8mF2zF3gjCftS", "name": "information" }
  ]
}
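The difference between the two documents above can be sketched by counting round trips. This is a minimal illustration, not PeerDB's implementation: `FakeDatabase`, `find_one`, the collection names, and the trimmed sample data are all hypothetical stand-ins, and each lookup is assumed to cost one client/database round trip.

```python
class FakeDatabase:
    """Hypothetical in-memory stand-in for a database server."""

    def __init__(self, collections):
        self.collections = collections
        self.round_trips = 0  # incremented once per find_one call

    def find_one(self, collection, doc_id):
        self.round_trips += 1
        return self.collections[collection][doc_id]


def read_post_by_reference(db, post_id):
    """Plain MongoDB style: fetch the post, then each related document."""
    post = dict(db.find_one("posts", post_id))
    post["author"] = db.find_one("persons", post["author"]["_id"])
    post["tags"] = [db.find_one("tags", tag["_id"]) for tag in post["tags"]]
    return post


def read_post_embedded(db, post_id):
    """Embedded style: the needed fields already live inside the post."""
    return db.find_one("posts_embedded", post_id)


# Sample data taken from the two documents shown above.
PERSONS = {"frqejWeGWjDTPMj7P": {"_id": "frqejWeGWjDTPMj7P", "name": "Henry",
                                 "picture": "http://img.com/img.jpg"}}
TAGS = {"KMYNwr7TsZvEboXCw": {"_id": "KMYNwr7TsZvEboXCw", "name": "animals"},
        "tMgj8mF2zF3gjCftS": {"_id": "tMgj8mF2zF3gjCftS", "name": "information"}}
POST = {"_id": "k7cgWtxQpPQ3gLgxa", "body": "I'm a post about rabbits!",
        "author": {"_id": "frqejWeGWjDTPMj7P"},
        "tags": [{"_id": "KMYNwr7TsZvEboXCw"}, {"_id": "tMgj8mF2zF3gjCftS"}]}

db = FakeDatabase({"persons": PERSONS, "tags": TAGS,
                   "posts": {POST["_id"]: POST}, "posts_embedded": {}})

joined = read_post_by_reference(db, POST["_id"])
trips_by_reference = db.round_trips  # 1 post + 1 author + 2 tags

# Store the joined document, standing in for what an embedding layer maintains.
db.collections["posts_embedded"][POST["_id"]] = joined
db.round_trips = 0
embedded = read_post_embedded(db, POST["_id"])
trips_embedded = db.round_trips

print(trips_by_reference, trips_embedded)
```

A real client could batch the tag lookups into one query, so the per-document count here is the naive upper bound; the embedded read still needs only a single round trip.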

[Figure: timelines of client/database round trips for SQL, PeerDB, and MongoDB. Labels: Q1, R1 for SQL and PeerDB in both panels; Q1, R1, Q2, R2 for MongoDB in the first panel and Q1 to Q7, R1 to R7 in the second.]

Goals
Evaluate PeerDB and recommend design guidelines for embedding fields. In particular, we will compare performance of the following variations using a blog application:
∙ PeerDB (high-level Meteor and low-level queries)
∙ MongoDB (high-level Meteor and low-level queries)
∙ Postgres (high-level Django and low-level queries)

Blog application schema:
∙ Person: Name, Bio, Picture
∙ Tag: Name, Description
∙ Post: Body, Author (person_id), Tags (tag ids)
∙ Comment: Body, Post (post_id)
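As a rough sketch, the blog schema can be written down as plain Python dataclasses. The field names follow the schema above; the `id` fields, the types, and the `resolve_post` helper are assumptions added for illustration and are not taken from the benchmark code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    id: str
    name: str
    bio: str
    picture: str


@dataclass
class Tag:
    id: str
    name: str
    description: str


@dataclass
class Post:
    id: str
    body: str
    author_id: str  # Author (person_id)
    tag_ids: List[str] = field(default_factory=list)  # Tags (tag ids)


@dataclass
class Comment:
    id: str
    body: str
    post_id: str  # Post (post_id)


def resolve_post(post: Post, persons: Dict[str, Person], tags: Dict[str, Tag]) -> dict:
    """Join a post with its author and tags by following the references."""
    return {
        "body": post.body,
        "author": persons[post.author_id].name,
        "tags": [tags[tag_id].name for tag_id in post.tag_ids],
    }
```

Every reference followed in `resolve_post` corresponds to a join in SQL or to an extra lookup in plain MongoDB.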

We will offer design guidelines on when to use PeerDB embedded fields versus other methods by varying these parameters:
∙ Size of fields in related objects
∙ Number of objects
∙ Number of database instances

Comparison results
We found that the high-performance reads of PeerDB come at the expense of slow writes (figure to the right), since embedded copies of a field must be kept updated. Based on our evaluation we recommend using Postgres for write-heavy applications.
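One way to see why writes get slower is to count the documents touched by a single update. The sketch below is an assumption-laden illustration, not the benchmark: it uses plain dictionaries in place of collections, and the function names are hypothetical.

```python
def rename_person_normalized(persons, person_id, new_name):
    """References only: the name lives in one place, so one document is written."""
    persons[person_id]["name"] = new_name
    return 1


def rename_person_embedded(persons, posts, person_id, new_name):
    """Embedded fields: every post that embeds the author must be rewritten too."""
    persons[person_id]["name"] = new_name
    writes = 1
    for post in posts.values():
        if post["author"]["_id"] == person_id:
            post["author"]["name"] = new_name
            writes += 1
    return writes


persons = {"a1": {"_id": "a1", "name": "Henry"},
           "a2": {"_id": "a2", "name": "Amy"}}
posts = {
    "p1": {"_id": "p1", "author": {"_id": "a1", "name": "Henry"}},
    "p2": {"_id": "p2", "author": {"_id": "a1", "name": "Henry"}},
    "p3": {"_id": "p3", "author": {"_id": "a1", "name": "Henry"}},
    "p4": {"_id": "p4", "author": {"_id": "a2", "name": "Amy"}},
}

writes_embedded = rename_person_embedded(persons, posts, "a1", "Henrietta")
print(writes_embedded)  # the person document plus each of their three posts
```

The write cost of the embedded variant grows with the number of documents that embed the changed field, while the normalized variant always writes one document.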

Design guidelines
Based on our evaluation and use of PeerDB in PeerLibrary, we came up with the following design guidelines for picking fields to embed. These design considerations can be automated or handled during schema specification.
∙ Fields to embed are small in size
∙ Values of all embedded fields are shared across all queries
∙ Reads on documents using embedded fields are much more common than writes
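The three guidelines above could be automated along these lines. This is only a sketch: `FieldStats`, `should_embed`, and both thresholds (256 bytes, a 10:1 read/write ratio) are hypothetical choices made for the example, not values from our evaluation.

```python
from dataclasses import dataclass


@dataclass
class FieldStats:
    size_bytes: int              # typical size of the field's value
    reads: int                   # observed reads of documents using the field
    writes: int                  # observed writes to the field
    shared_across_queries: bool  # same value is useful to every query


def should_embed(stats, max_size_bytes=256, min_read_write_ratio=10.0):
    """Return True when a field satisfies all three guidelines.

    Both thresholds are illustrative assumptions and would need tuning.
    """
    small = stats.size_bytes <= max_size_bytes
    read_heavy = stats.reads >= min_read_write_ratio * max(stats.writes, 1)
    return small and stats.shared_across_queries and read_heavy


name_field = FieldStats(size_bytes=5, reads=1000, writes=2, shared_across_queries=True)
bio_field = FieldStats(size_bytes=4096, reads=1000, writes=2, shared_across_queries=True)
counter_field = FieldStats(size_bytes=8, reads=100, writes=90, shared_across_queries=True)

print(should_embed(name_field), should_embed(bio_field), should_embed(counter_field))
```

A short author name passes all three checks, a large biography fails the size check, and a frequently updated counter fails the read/write check.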

Reactive queries with joins
In addition to making static queries to evaluate PeerDB performance, we evaluated reactive queries with joins as well.
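The reactive behavior described in the motivation (initial results first, then a stream of updates) can be sketched with a small observer. `ReactiveCollection`, `observe`, and `upsert` are hypothetical names and do not mirror the Meteor or PeerDB APIs; the sketch also ignores removals and documents that stop matching.

```python
class ReactiveCollection:
    """Hypothetical in-memory collection that keeps queries open."""

    def __init__(self):
        self._docs = {}
        self._observers = []

    def observe(self, predicate, callback):
        """Send the current matches, then keep the query open for later changes."""
        for doc in self._docs.values():
            if predicate(doc):
                callback("added", doc)
        self._observers.append((predicate, callback))

    def upsert(self, doc):
        """Insert or replace a document and notify matching open queries."""
        event = "changed" if doc["_id"] in self._docs else "added"
        self._docs[doc["_id"]] = doc
        for predicate, callback in self._observers:
            if predicate(doc):
                callback(event, doc)


events = []
posts = ReactiveCollection()
posts.upsert({"_id": "p1", "author": {"_id": "a1", "name": "Henry"}})

# With the author name embedded in the post, one query on one collection
# is enough to see both the initial result and later changes to the name.
posts.observe(lambda doc: doc["author"]["_id"] == "a1",
              lambda event, doc: events.append((event, doc["author"]["name"])))

posts.upsert({"_id": "p1", "author": {"_id": "a1", "name": "Henrietta"}})
print(events)
```

Without embedding, the client would have to open and coordinate separate reactive queries for posts, persons, and tags to get the same effect.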

Related work
∙ Materialized views
∙ Column stores and database cracking
∙ Customizing NoSQL databases

GitHub links
PeerDB: github.com/peerlibrary/meteor-peerdb
Benchmarking: github.com/mitar/peerdb-benchmark