CMPSC431W: Database Management Systems
Instructor: Yu-San Lin [email protected]
Course Website: http://www.cse.psu.edu/~yul189/cmpsc431w
Lecture 23 10/21/15
What is NoSQL?
• NoSQL stands for ___________
• It is a non-relational database management system
• Designed for distributed data stores that must hold very large volumes of data
– Data stored today may not require a fixed schema; NoSQL systems avoid join operations and typically scale horizontally
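
Horizontal scaling can be pictured as spreading keys across several machines. The following is a minimal sketch, assuming hash-based partitioning; the names `shard_for`, `put`, and `get` are hypothetical, and each "machine" is just an in-memory dictionary.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Three hypothetical servers, each modeled as a plain dictionary.
shards = [dict() for _ in range(3)]

def put(key, value):
    """Store the value on whichever shard owns the key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key):
    """Read the value back from the shard that owns the key."""
    return shards[shard_for(key, len(shards))].get(key)

put("user:1", "Ann")
put("user:2", "Bob")
```

Because each key lives on exactly one shard, no join across machines is needed to read it back. One limitation of this simple modulo scheme is that changing the number of shards remaps most keys.
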
Why NoSQL?
• NoSQL data modeling often starts from the application-specific queries:
– Relational modeling: “What answers do I have?”
– NoSQL data modeling: “What questions do I have?”
• NoSQL data modeling often requires a deeper understanding of data structures and algorithms than relational database modeling does
• ____________ and ________________ are first-class citizens
• NoSQL solutions are surprisingly strong for hierarchical or graph-like data modeling and processing.
RDBMS vs. NoSQL
RDBMS
• Structured and organized data
• Structured Query Language (SQL)
• Data and its relationships are stored in separate tables
• Data Manipulation Language, Data Definition Language
• Tight consistency
• ACID transactions
NoSQL
• Stands for Not Only SQL
• No declarative query language
• No predefined schema
• Key-value pair storage, column store, document store, graph databases
• Eventual consistency rather than ACID properties
• BASE transactions
• Unstructured and unpredictable data
• _____ Theorem
• Prioritizes high performance, high availability, and scalability
CAP Theorem
• _____________: data in the database remains consistent after the execution of an operation, e.g., after an update operation, all clients see the same data
• _____________: system is always on, no downtime
• __________________: system continues to function even when communication among the servers is unreliable
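
The tension between these three properties can be illustrated with a toy model. This is a minimal sketch, assuming a hypothetical two-replica cluster (`TinyCluster` and `Replica` are made-up names, not a real system): during a partition the cluster must either reject writes, keeping replicas identical, or accept them and let one replica serve stale data until the partition heals.

```python
class Replica:
    """One copy of the data (a plain key-value mapping)."""
    def __init__(self, name):
        self.name = name
        self.data = {}


class TinyCluster:
    """Hypothetical two-replica cluster, for illustration only."""
    def __init__(self, prefer_availability):
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False
        self.prefer_availability = prefer_availability

    def write(self, key, value):
        """Write through replica a and copy to replica b when reachable."""
        if self.partitioned and not self.prefer_availability:
            # Consistency first: refuse the write so replicas never diverge.
            raise RuntimeError("write rejected during partition")
        self.a.data[key] = value
        if not self.partitioned:
            self.b.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)

    def heal(self):
        """End the partition and copy a's state to b (naive reconciliation)."""
        self.partitioned = False
        self.b.data.update(self.a.data)


# A cluster that stays available during a partition.
ap = TinyCluster(prefer_availability=True)
ap.write("color", "blue")
ap.partitioned = True
ap.write("color", "red")          # accepted, but only replica a sees it
stale = ap.read(ap.b, "color")    # replica b still returns the old value
ap.heal()
fresh = ap.read(ap.b, "color")    # replicas agree again after healing
```

With `prefer_availability=False` the same write during a partition raises an error instead, which is the consistency-first choice.
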
NoSQL Pros & Cons
Pros
• High scalability
• Distributed computing
• Lower cost
• Schema flexibility, semi-structured data
• No complicated relationships
Cons
• No _____________
• Limited query capabilities (so far)
Types of NoSQL Databases
• ____________ database
• ____________ database
• ____________ databases
• ____________ databases
* There is not a single solution that is better than the others.
Key-Value Database
• The most basic type of NoSQL database
• Designed to handle huge amounts of data
• Based on Amazon’s Dynamo paper
• Allows developers to store __________ data
• Database stores data as __________ where each key is unique and the value can be a string, JSON, BLOB (binary large object), etc.
Key-Value Database (cont.)
• A key may be strings, hashes, lists, sets, sorted sets
• Key-value stores can be used as collections, dictionaries, associative arrays, etc.
• Key-value stores follow the ___________ and __________ aspects of CAP theorem
• Suitable for: shopping cart contents, individual values like color schemes
• Suitable for: shopping cart contents, individual values like color schemes
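
The shopping cart and color scheme use cases can be sketched with a dictionary-backed store. This is a minimal illustration, not a real database client; `KeyValueStore` and the key names such as `cart:42` are hypothetical.

```python
import json

class KeyValueStore:
    """In-memory stand-in for a key-value database (illustrative only)."""
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # Each key is unique: writing again overwrites the old value.
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)


store = KeyValueStore()

# The store does not interpret values: one is a JSON string, one a plain string.
store.put("cart:42", json.dumps({"items": [{"sku": "A1", "qty": 2}]}))
store.put("theme:42", "dark")

cart = json.loads(store.get("cart:42"))
theme = store.get("theme:42")
```

Lookups go through the key only. Finding "all carts containing sku A1" would require scanning every value, which is the kind of query this model does not support well.
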
Column Family Database
• Primarily works on columns, and every column is treated individually
• Values of a single column are stored contiguously
• Column stores keep data in column-specific files
• In column stores, query processors work on columns too
Column Family Database (cont.)
• All data within each column data file has the same type, which makes it ideal for compression
• Column stores can improve query performance because a query reads only the columns it needs
• High performance on ___________ queries, e.g., COUNT, SUM, AVG, MIN, MAX
• Suitable for: customer relationship management (CRM), library card catalogs
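
The difference between the two layouts can be sketched in a few lines. This example assumes a small made-up customer table; it pivots row-oriented records into one list per column and then computes SUM and AVG by touching only the `balance` column.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Ann", "balance": 120.0},
    {"id": 2, "name": "Bob", "balance": 80.0},
    {"id": 3, "name": "Cy", "balance": 100.0},
]

# Column-oriented layout: values of one column are stored contiguously.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate reads a single column and never looks at "id" or "name".
balances = columns["balance"]
total = sum(balances)
average = total / len(balances)
highest = max(balances)
```

In a real column store each list would live in its own file, so the query would skip the other columns on disk as well.
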
Graph Database
• Stores data in _______
• Capable of elegantly representing any kind of data in a highly accessible way
• Is a collection of _______ and ________
• Each node represents an ________, and each edge represents a connection or relationship between two nodes
Graph Database (cont.)
• Every node and edge is defined by a unique identifier
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of a local step (or hop) remains the same
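
These properties can be sketched with an adjacency-list structure. This is a minimal illustration, assuming undirected edges and made-up node names; `Graph` is a hypothetical class, not a graph database API. Each node stores its own neighbor list, so one hop costs a single lookup no matter how many nodes exist.

```python
from collections import deque

class Graph:
    """Toy property graph with adjacency lists (illustrative only)."""
    def __init__(self):
        self.nodes = {}       # node id -> properties
        self.adjacent = {}    # node id -> list of (edge id, label, neighbor id)
        self._next_edge_id = 0

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.adjacent.setdefault(node_id, [])

    def add_edge(self, src, dst, label):
        """Add an undirected edge with its own unique identifier."""
        edge_id = self._next_edge_id
        self._next_edge_id += 1
        self.adjacent[src].append((edge_id, label, dst))
        self.adjacent[dst].append((edge_id, label, src))
        return edge_id

    def neighbors(self, node_id):
        # A local step: one lookup, independent of total graph size.
        return [neighbor for _, _, neighbor in self.adjacent[node_id]]

    def hops(self, start, goal):
        """Breadth-first search; returns the hop count or None if unreachable."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if node == goal:
                return dist
            for nxt in self.neighbors(node):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None


g = Graph()
for person in ("alice", "bob", "carol", "dave"):
    g.add_node(person, kind="person")
g.add_edge("alice", "bob", "knows")
g.add_edge("bob", "carol", "knows")
```

Here `g.hops("alice", "carol")` follows two edges, where a relational design would typically need a self-join on a relationship table for each hop.
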
Relational Model vs. Graph Model
Relational Model    Graph Model
Tables              Vertices and edges set
Rows                Vertices
Columns             Key-value pairs
Joins               Edges
Document Database
• A collection of ___________
• Data in this model is stored inside documents
• A document is a key-value collection where the key allows access to its value
• Documents are not typically forced to have a schema and therefore are flexible and easy to change
Document Database (cont.)
• Documents are stored in collections in order to group different kinds of data
• Documents can contain many different key value pairs, or key array pairs, or even nested documents
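
A collection of schema-free documents can be sketched with plain dictionaries. This is a minimal illustration with made-up data; the helper `find` and its dotted-path syntax are assumptions for this sketch, not the API of any particular document database.

```python
# Two documents in one collection; they do not share the same fields.
collection = [
    {
        "_id": 1,
        "name": "Ann",
        "tags": ["admin", "staff"],                          # key-array pair
        "address": {"city": "State College", "zip": "16801"},  # nested document
    },
    {"_id": 2, "name": "Bob", "email": "bob@example.com"},
]

_MISSING = object()  # sentinel marking "this document has no such field"

def find(docs, path, value):
    """Return documents whose dotted `path` (e.g. 'address.city') equals `value`."""
    matches = []
    for doc in docs:
        current = doc
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                current = _MISSING
                break
        if current is not _MISSING and current == value:
            matches.append(doc)
    return matches


in_state_college = find(collection, "address.city", "State College")
```

Documents that lack the requested field are simply skipped, which is how the flexible schema shows up at query time.
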
Relational Model vs. Document Model
Relational Model    Document Model
Tables              Collections
Rows                Documents
Columns             Key-value pairs
Joins               N/A
References
• Ilya Katsov, NoSQL Data Modeling Techniques, https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
• Pramod Sadalage, NoSQL Databases: An Overview, http://www.thoughtworks.com/insights/blog/nosql-databases-overview
• NoSQL tutorial, http://www.w3resource.com/mongodb/nosql.php