athira-mukundan
Databases in 30 minutes. It might just end up taking 40!
Whoami
Athira Mukundan
Solution Consultant @[email protected]
Let’s start at the beginning - RDBMS
SQL has ruled for 2 decades!
1) Store persistent data, allowing applications to grab the bits they need through queries
2) Application integration, ensuring all apps have consistent, up-to-date data
3) Mostly standard: they all use SQL
4) Transactions
5) Reporting
The power of RDBMS - ACID Transactions
Transaction: a reliable unit of work that allows correct recovery from failures and keeps the database consistent even when the system fails, execution stops (completely or partially), and many operations on the database are left uncompleted with unclear status.
1) Atomicity: all or none. Either every operation in the transaction is applied or none is.
2) Consistency: any transaction will bring the database from one valid state to another.
3) Isolation: ensures that the concurrent execution of transactions results in a system state that would be obtained if the transactions were executed sequentially.
4) Durability: ensures that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors.
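To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the names in it are assumptions made up for this illustration. The transfer credits first and debits second, so when the debit breaks the `CHECK` constraint the already-applied credit has to be rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # credit is undone with the debit
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, bob's balance does not include the 500 credit: the half-finished unit of work left no trace.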
DB-Engines Ranking
Where would RDBMS fail -
1) Rise of unstructured data
2) Difficult to scale horizontally
3) Sharding causes operational problems
4) Satisfying ACID is a hindrance for scaling
5) Impedance mismatch
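One of the operational problems with sharding can be sketched in a few lines. The example assumes a naive "hash the key, take it modulo the shard count" placement rule; `shard_for` and the `user:N` keys are invented for the illustration and are not taken from any particular database.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard by hashing the key (naive modulo placement, an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

keys = [f"user:{i}" for i in range(1000)]

# Placement with 4 shards, then again after adding a fifth shard.
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}

# Every key whose shard changed has to be physically moved.
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys change shard when going from 4 to 5 shards")
```

With modulo placement most keys land on a different shard after the resize, so adding one machine means migrating most of the data. Schemes such as consistent hashing exist to reduce that movement.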
Enter NoSQL
A scale-out, shared-nothing architecture designed to run on a cluster
A non-locking concurrency control mechanism so real-time reads won’t conflict with writes
Scalable replication and distribution
Don’t have a fixed schema, allowing you to store any data in any record
Reduce Development Drag
CAP Theorem
When a partition occurs -
1) CA: Single site cluster, therefore all nodes are always in contact. When a partition occurs, the system blocks.
2) CP: Some data is not accessible but the rest is still consistent and accurate.
3) AP: System is available under partitioning but some of the data returned may be inaccurate
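The CP and AP choices can be illustrated with a toy model. Everything here (`ToyCluster`, `Replica`, the `partitioned` flag) is hypothetical and far simpler than a real distributed store: two replicas, a link that can be cut, and a write path that either refuses work or accepts it locally. For brevity the CP mode only rejects writes; a real CP system may also refuse reads on the side that lost contact.

```python
class Replica:
    def __init__(self):
        self.data = {}

class ToyCluster:
    """Two replicas joined by a link that can be cut (illustrative model only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write rather than diverge.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Availability first: accept locally, the replicas now disagree.
            replica.data[key] = value
            return
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)

# CP: the write is rejected during the partition, both replicas still agree.
cp = ToyCluster("CP")
cp.write(cp.a, "x", 1)
cp.partitioned = True
try:
    cp.write(cp.a, "x", 2)
    cp_rejected = False
except RuntimeError:
    cp_rejected = True
cp_reads = (cp.read(cp.a, "x"), cp.read(cp.b, "x"))

# AP: the write succeeds, but replica b keeps serving the old value.
ap = ToyCluster("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap.write(ap.a, "x", 2)
ap_reads = (ap.read(ap.a, "x"), ap.read(ap.b, "x"))

print(cp_rejected, cp_reads, ap_reads)
```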
Data Model Based Categories in NoSQL
1) Key-Value
2) Document
3) Columnar
4) Graph
Key-Value Databases
Designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash
Advantages: fast lookups by key
Disadvantages: schemaless, so the store knows nothing about what is inside a value and cannot query on it
e.g. DynamoDB, Redis
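A rough in-memory sketch of the key-value model, assuming values are opaque bytes. `KeyValueStore` and the `session:` keys are invented for the illustration and do not mirror the API of DynamoDB or Redis.

```python
import json

class KeyValueStore:
    """A dictionary with put/get/delete; values are opaque bytes to the store."""

    def __init__(self):
        self._items = {}

    def put(self, key, value: bytes):
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)

    def keys(self):
        return list(self._items)

store = KeyValueStore()
store.put("session:42", json.dumps({"user": "ann", "cart": [1, 2]}).encode())
store.put("session:43", json.dumps({"user": "raj", "cart": []}).encode())

# Fast path: one lookup by key.
session = json.loads(store.get("session:42"))

# Slow path: "which sessions belong to raj?" is not a question the store
# can answer, so the application must fetch and decode every value.
raj_sessions = [
    key for key in store.keys()
    if json.loads(store.get(key))["user"] == "raj"
]
print(session, raj_sessions)
```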
Document Databases
Designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.
Advantages: tolerant of incomplete data
Disadvantages: query performance, no standard query syntax
e.g. MongoDB, CouchDB
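What "tolerant of incomplete data" means can be shown with plain Python dictionaries standing in for documents. The `find` helper and the sample documents are assumptions for the sketch, not the query API of MongoDB or CouchDB.

```python
def find(collection, **criteria):
    """Return documents whose fields equal every criterion.

    A document that lacks one of the fields simply does not match;
    nothing fails because a field is missing.
    """
    return [
        doc for doc in collection
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]

# Documents in one collection do not have to share the same fields.
people = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Linus", "langs": ["C"]},
    {"_id": 3, "name": "Grace", "email": "grace@example.com", "langs": ["COBOL"]},
]

linus = find(people, name="Linus")
with_email = [doc["name"] for doc in people if "email" in doc]
print(linus, with_email)
```

Note that `find` walks the whole collection. Without an index, that scan is where the query-performance cost shows up.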
Columnar Databases
Stores data tables as columns rather than as rows
Advantages: a query reads only the columns it needs instead of scanning rows and discarding the unwanted data. Query performance often improves as a result, particularly on very large data sets.
Disadvantages: inserting an individual record is slow, and updating one is slower still, because each record is spread across every column
e.g. Cassandra, HBase (strictly speaking these are wide-column, or column-family, stores)
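The row-versus-column trade-off can be sketched with ordinary lists. The `orders` data and the helper names are made up for the example, and real column stores add compression and on-disk layout that this ignores.

```python
# Row layout: each record is stored together.
orders = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": "Kochi", "amount": 80},
    {"id": 3, "city": "Pune", "amount": 200},
]

def to_columns(rows):
    """Pivot a list of records into one list per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns

def append_record(columns, record):
    """Inserting one record touches every column."""
    for name in columns:
        columns[name].append(record[name])

columns = to_columns(orders)

# Analytical query: only the "amount" column is read.
total = sum(columns["amount"])

# Single-record insert: three separate lists are modified.
append_record(columns, {"id": 4, "city": "Delhi", "amount": 50})
print(total, columns)
```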
Graph Databases
Uses graph structures for semantic queries, with nodes, edges and properties to represent and store data.
The relationships allow data in the store to be linked together directly, and in many cases retrieved with one operation.
Advantage: graph algorithms such as shortest path, connectedness etc.
Disadvantage: not easy to cluster; some queries have to traverse the whole graph to get the result
e.g. Neo4j
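As an illustration of the shortest-path style of query, here is a breadth-first search over an adjacency list. The `follows` graph and the `shortest_path` function are assumptions for the sketch. A graph database such as Neo4j exposes this kind of traversal through its own query language rather than through code like this.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the back-pointers from goal to start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None

# Nodes are people, edges are directed "follows" relationships.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

path = shortest_path(follows, "alice", "erin")
no_path = shortest_path(follows, "erin", "alice")
print(path, no_path)
```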
How to choose a DB
Think about -
the CAP theorem.
the data model
ACID/BASE (Basically Available, Soft state, Eventual consistency)
Development effort
What and When.
Source: https://martinfowler.com/articles/nosql-intro-original.pdf