NoSQL Databases.
By Nimat Ullah Khattak, 11-MS(IT)-27
& Majid Hussain, 11-MS(IT)-02
Overview of The Presentation
NoSQL
Why NoSQL
Categories of NoSQL databases
Comparison of different NoSQL databases
NoSQL
A term which stands for
NoSQL (Beginning)
First used by Carlo Strozzi in 1998.
Reintroduced by Eric Evans in 2009.
NoSQL (What is NoSQL?)
NoSQL doesn’t mean that we stop using SQL, or that SQL won’t be used.
The term refers to those databases that differ from relational databases.
Simply Non-relational databases.
Some terms we must know…
ACID (Atomicity, Consistency, Isolation, Durability).
CAP (Consistency, Availability, Partition Tolerance).
ACID
Atomicity. All of the operations in the transaction will complete, or none will.
Consistency. The database will be in a consistent state when the transaction begins and ends.
Isolation. The transaction will behave as if it is the only operation being performed upon the database. (No interference of transaction)
Durability. Upon completion of the transaction, the operation will not be reversed.
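To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the transfer helper are assumptions made up for illustration; they do not come from the presentation. The sketch only shows that a failed transaction leaves no partial change behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return all balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling transfer(conn, "alice", "bob", 500) returns False and leaves both balances unchanged, because the first update is rolled back together with the rest of the transaction.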
CAP
Consistency. No contradiction between data.
Availability. Every operation must terminate in an intended response.
Partition tolerance. Operations will complete, even if individual components are unavailable.
NoSQL databases are based on CAP…
CAP (For NoSQL)
Why NoSQL?
NoSQL did not emerge because of shortcomings in SQL…
Why NoSQL (Features)
It provides:
Horizontal scalability
Open source
Schema freeness
Easy replication support
Simple API
Why NoSQL (Features)
Most NoSQL databases are eventually consistent: they follow the trade-offs of CAP rather than offering full ACID guarantees.
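Eventual consistency can be illustrated with a small sketch. The Replica class, the integer timestamps and the last-write-wins rule below are assumptions chosen for illustration; real NoSQL systems use a variety of conflict-resolution schemes.

```python
class Replica:
    """One copy of the data; each key maps to a (timestamp, value) pair."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Last write wins: keep the entry with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries so both replicas end up with the newest version of every key."""
    for key, (timestamp, value) in list(a.data.items()):
        b.write(key, value, timestamp)
    for key, (timestamp, value) in list(b.data.items()):
        a.write(key, value, timestamp)
```

A write accepted by one replica is not visible on the other until sync runs, so reads may be briefly stale; after the sync, both replicas return the same value.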
Scalability
To maintain performance.
Horizontal scalability: to increase the number of machines while maintaining proportional performance.
Vertical scalability: to add more resources to a single machine to improve performance.
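One common way to scale horizontally is to spread keys over several machines by hashing them. The sketch below is an illustration under stated assumptions: each "machine" is just an in-memory dictionary, and shard_for and ShardedStore are hypothetical names, not part of any particular database.

```python
import hashlib


def shard_for(key, num_machines):
    """Pick a machine index for `key` using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_machines


class ShardedStore:
    """Key-value store whose data is split across several machines."""

    def __init__(self, num_machines):
        # Each dictionary stands in for one machine in the cluster.
        self.machines = [dict() for _ in range(num_machines)]

    def put(self, key, value):
        self.machines[shard_for(key, len(self.machines))][key] = value

    def get(self, key):
        return self.machines[shard_for(key, len(self.machines))].get(key)
```

Simple modulo placement moves most keys when the number of machines changes, which is why distributed stores often use consistent hashing instead.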
Open source
Most NoSQL projects are open source, so anyone can use and modify them: Cassandra by Facebook, CouchDB, MongoDB, Neo4j, etc.
Bigtable by Google is an exception; it is available only for Google applications.
Schema freeness
NoSQL databases don’t use any fixed schema the way relational databases do (internal schema, external schema, etc.).
The original intention of NoSQL was to serve modern web-scale databases.
Easy replication support
The use of redundant resources to improve:
Reliability
Fault tolerance
Performance
Why NoSQL (Benefits)
1. Scaling
Relational databases (RDBs) aren’t easy to scale out.
NoSQL databases, on the other hand, are specifically designed to scale out.
Why NoSQL (Benefits)
2. Big data
A single RDBMS is almost unable to handle today’s huge amounts of data and the transactions on that data.
But
Non-relational databases are specifically designed to handle big data.
Why NoSQL (Benefits)
3. Needs no Expert DBAs
Although RDBMS vendors claim that an RDBMS provides management facilities, it still needs an expert DBA to operate it.
In contrast, NoSQL databases don’t need expert DBAs, as they provide automatic repair, data distribution, and simpler data models, which lead to lower administration overhead.
Why NoSQL (Benefits)
4. Economics
An RDBMS requires expensive components to provide efficient service.
NoSQL uses cheap commodity servers to manage the same amount of data for which an RDBMS needs expensive servers, so NoSQL is economical as well.
Why NoSQL (Benefits)
5. Flexibility of data models
The requirements of an organization can change with the passage of time…
Changing an RDBMS after its deployment creates many problems and affects its services; sometimes it is almost impossible to make changes.
A NoSQL database can be changed at any time, i.e. existing columns can be altered and new ones can be added.
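The flexibility described above can be sketched with plain Python dictionaries standing in for stored records. The field names and the helper functions are hypothetical, chosen only to show records with different sets of fields living side by side.

```python
# Records in the same collection do not need identical fields.
users = [
    {"id": 1, "name": "Amina"},
    {"id": 2, "name": "Bilal", "email": "bilal@example.com"},
]


def add_field(records, field, default):
    """Add `field` to every record that lacks it; existing values are kept."""
    for record in records:
        record.setdefault(field, default)


def display_email(record):
    """Read a field that older records may not have."""
    return record.get("email", "(no email)")
```

A new field can be introduced for new records first and filled in for older ones later, with readers falling back to a default in the meantime.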
Categories of NoSQL databases
1) Key-value stores
Don’t have any schema
Fast lookup facility
Can be used in forum software where users’ statistics and messages are recorded. The user’s ID serves as a key and retrieves a string that represents all the relevant info about the user. A background process recalculates the information and writes it to the store independently at fixed intervals.
Example
Redis: Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can hold several data types: strings, hashes, lists, sets and sorted sets.
Categories (cont…)
API: many languages, Written in: C, Concurrency: in memory, with asynchronous saves to disk after a defined time.
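The forum use case for key-value stores can be sketched as follows. This is an illustration under assumptions: a plain dictionary stands in for the key-value store, the message list is made-up sample data, and recalculate_stats plays the role of the background process that would normally run on a timer.

```python
import json

# Stand-in for a key-value store: user id -> serialized string.
store = {}

# Hypothetical forum messages.
messages = [
    {"user": "u1", "text": "Hello"},
    {"user": "u2", "text": "Hi there"},
    {"user": "u1", "text": "Any news?"},
]


def recalculate_stats(messages, store):
    """Background job: rebuild per-user statistics and write them to the store."""
    counts = {}
    for message in messages:
        counts[message["user"]] = counts.get(message["user"], 0) + 1
    for user_id, count in counts.items():
        store[user_id] = json.dumps({"user": user_id, "message_count": count})


def lookup(store, user_id):
    """Fetch the string stored under the user's id and decode it."""
    raw = store.get(user_id)
    return None if raw is None else json.loads(raw)
```

Reading a user's information is a single lookup by key; all the counting work is done ahead of time by the background job.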
2) Document databases
Web applications
Tolerance of incomplete data
Low query performance
No standard query syntax
Used when we don’t have complete data about all the entities of the database but still need to create the database.
Example
CouchDB: API: JSON, Protocol: REST, Query Method: MapReduce of JavaScript functions, Replication: Master-Master, Written in: Erlang.
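The MapReduce query style listed for CouchDB can be sketched in Python. CouchDB itself runs JavaScript functions; the documents, map_doc, reduce_values and run_view below are hypothetical stand-ins that only imitate the idea of emitting key/value pairs and then reducing them per key.

```python
from collections import defaultdict

# Hypothetical documents; the last one is incomplete (no author).
docs = [
    {"_id": "1", "type": "post", "author": "amina"},
    {"_id": "2", "type": "post", "author": "bilal"},
    {"_id": "3", "type": "comment", "author": "amina"},
    {"_id": "4", "type": "post"},
]


def map_doc(doc):
    """Emit (author, 1) for every post that has an author."""
    if doc.get("type") == "post" and "author" in doc:
        yield doc["author"], 1


def reduce_values(values):
    """Combine all values emitted for one key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply the map function to every document, then reduce per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}
```

The document without an author is simply skipped by the map function, which is one way the tolerance of incomplete data shows up in practice.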
Categories (cont…)
MongoDB: API: BSON, Protocol: lots of langs, Query Method: dynamic object-based language, Replication: Master Slave, Written in: C++.
3) Graph databases
Social networking
Graph algorithms, connectedness, degree of relationships
Has to traverse the entire graph to get a definitive answer.
Not easy to cluster.
Used in a situation where we want to analyze the ongoing trends and take decisions on the basis of those trends.
Categories (cont…)
Example:
Neo4j: API: lots of langs, Protocol: Java embedded / REST, Query Method: SparQL, native Java API, JRuby, Replication: typical MySQL style master/slave, Written in: Java, Concurrency: non-block reads, writes lock involved nodes/relationships until commit, Misc: ACID possible
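The "degree of relationships" question that graph databases answer can be sketched as a breadth-first search. The friends graph and the function name are assumptions for illustration; a graph database such as Neo4j would run this kind of traversal over its own storage and query interface.

```python
from collections import deque

# Hypothetical social network: person -> list of direct friends.
friends = {
    "amina": ["bilal", "chand"],
    "bilal": ["amina", "daud"],
    "chand": ["amina"],
    "daud": ["bilal", "erum"],
    "erum": ["daud"],
}


def degree_of_relationship(graph, start, target):
    """Return the number of hops on the shortest path, or None if unreachable."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for neighbour in graph.get(person, []):
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None
```

When the target is not connected to the start, the search has to visit every reachable node before it can answer, which matches the note about traversing the entire graph to get a definitive answer.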
4) XML databases
Publishing
Mature search technologies
Schema validation
Re-writing is easier than updating
Categories (cont…)
Used in a situation where one wants to produce documents (articles, etc.) from a huge collection of documents, but the format of those articles doesn’t allow the publisher to search them. The articles are converted into an XML database, which is wrapped in a readable-URL web service for the document production systems.
5) Distributed Peer Stores
Distributed file systems
Fast lookups
Good distributed storage of data
Very low-level API
Best for a voting system. In such a situation, one store per user and one store per piece of content are created. The user store holds all the votes the user has ever cast, and the content store holds a copy of the content on which the vote was cast.
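The voting layout described above can be sketched with two sets of in-memory dictionaries. The VotingSystem class and its methods are hypothetical, and keeping the votes alongside the content copy is an added assumption that makes scoring a single lookup; a real distributed peer store would place each store on some node of the cluster.

```python
class VotingSystem:
    """One store per user and one store per piece of content."""

    def __init__(self):
        self.user_stores = {}     # user id -> {content id: vote}
        self.content_stores = {}  # content id -> {"content": ..., "votes": {...}}

    def add_content(self, content_id, content):
        # The content store keeps a copy of the content being voted on.
        self.content_stores[content_id] = {"content": content, "votes": {}}

    def cast_vote(self, user_id, content_id, vote):
        # Record the vote in the user's store and next to the content.
        self.user_stores.setdefault(user_id, {})[content_id] = vote
        self.content_stores[content_id]["votes"][user_id] = vote

    def score(self, content_id):
        """Sum of all votes cast on one piece of content."""
        return sum(self.content_stores[content_id]["votes"].values())
```

Each vote touches exactly two small stores, so lookups stay fast and the data distributes naturally by user and by content.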
Categories (cont…)
Example
Cassandra: API: many Thrift languages, Query Method: MapReduce, Written in: Java, Concurrency: eventually consistent, Misc: like "Big-Table on Amazon Dynamo alike", initiated by Facebook.
Tabular comparison:
The End…
Any Confusion…???