Architecting Database

Jony Sugianto, Research Engineer, Detikcom

Designing a Database System

● Complexity

- Database model

● Volume

- Sharding

● Read/write traffic

- Replication

Database Model

● Key-Value Stores

● Document Databases

● Relational Databases

● Graph Databases

Key-Value Stores

● A key-value store is a simple hash table

● All access to the store is via primary keys

● A client can:

- Get the value for a key

- Put a value for a key

- Delete a key from the store

● Keys and values can be complex compound objects, and sometimes lists, maps, or other data structures

● Key-value data access enables high performance

● Easy to distribute across a cluster

Key-Value Example
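
The three client operations can be sketched with a plain Python dictionary, which is itself a hash table. The `KeyValueStore` class and the `user:1` key format are hypothetical names used only for illustration; this is a minimal in-memory sketch, not how any particular product is implemented.

```python
from typing import Any, Hashable, Optional


class KeyValueStore:
    """Hypothetical in-memory key-value store backed by a hash table (dict)."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: Hashable) -> Optional[Any]:
        # Look up the value by its primary key; None if the key is absent.
        return self._data.get(key)

    def put(self, key: Hashable, value: Any) -> None:
        # Insert or overwrite the value stored under the key.
        self._data[key] = value

    def delete(self, key: Hashable) -> bool:
        # Remove the key; report whether it existed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
# Values may be compound objects such as maps and lists.
store.put("user:1", {"name": "ade", "tags": ["reader", "editor"]})
profile = store.get("user:1")
removed = store.delete("user:1")
```

Every operation goes through the key, which is why this model is fast and easy to spread across machines, and also why it offers no query filters on the values.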

Key-Value Stores: Cons

● No complex query filters

● All joins must be done in code

● No foreign key constraints

● Poor for interconnected data
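
To make "joins must be done in code" concrete, here is a minimal sketch that joins two key-value collections by hand. The `users` and `orders` data, the `user_key` field, and the function name are all hypothetical. Note the dangling reference in `order:12`: with no foreign key constraints, nothing stops it from being stored, so the application has to handle it.

```python
# Two hypothetical key-value collections, modelled as plain dicts.
users = {
    "user:1": {"name": "ade"},
    "user:2": {"name": "wahyu"},
}
orders = {
    "order:10": {"user_key": "user:1", "item": "book"},
    "order:11": {"user_key": "user:2", "item": "pen"},
    "order:12": {"user_key": "user:9", "item": "lamp"},  # dangling reference
}


def orders_with_user_names(orders: dict, users: dict) -> list:
    """Join orders to users in application code, one key lookup per order."""
    joined = []
    for order_key, order in sorted(orders.items()):
        user = users.get(order["user_key"])
        if user is None:
            # No foreign key constraint: the store accepted an order
            # pointing at a user that does not exist, so skip it here.
            continue
        joined.append(
            {"order": order_key, "item": order["item"], "name": user["name"]}
        )
    return joined


result = orders_with_user_names(orders, users)
```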

Key-Value Store Implementations

● Memory-based

- Memcached

- Redis

● Memory- and disk-based

- MapDB

- Diskv

● Database systems

- DynamoDB

- Aerospike

Document Databases

● Documents are the main concept

● Documents are:

- self-describing

- hierarchical tree data structures (maps, lists, scalar values)

{ name: ade, age: 20, address: depok }

{ name: wahyu, age: 30, occupation: lecturer }
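
A short sketch of what "self-describing" and "hierarchical" mean in practice: each document carries its own field names, documents in the same collection need not share fields, and values can nest. The nested `address` map, the `skills` list, and the helper names `get_path` and `find` are assumptions added for illustration; they do not come from any specific document database.

```python
from typing import Any

# Two documents in one collection; they do not share the same fields.
documents = [
    {"name": "ade", "age": 20, "address": {"city": "depok"}},
    {
        "name": "wahyu",
        "age": 30,
        "occupation": "lecturer",
        "skills": ["databases", "teaching"],
    },
]


def get_path(document: dict, path: str, default: Any = None) -> Any:
    """Walk a dotted path such as 'address.city' down the document tree."""
    current: Any = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def find(docs: list, path: str, value: Any) -> list:
    """Return documents whose field at `path` equals `value`."""
    return [doc for doc in docs if get_path(doc, path) == value]


city = get_path(documents[0], "address.city")
lecturers = find(documents, "occupation", "lecturer")
```

A missing field is simply absent rather than an error, which is what lets the collection hold differently shaped documents side by side.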

Document Databases: Pros and Cons

● Pros:

- Simple model

- Built-in MapReduce?

- Scalable

● Cons:

- Poor for interconnected data

Document Database Implementation

● MongoDB

● CouchDB

● Etc.

Relational Database

● The most popular kind of database system

● The model is based on tables, rows, and columns, and on the manipulation of the data stored within them

● A relational database is a collection of these tables
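
The table model can be tried out with Python's built-in `sqlite3` module and an in-memory database. The `author` and `article` tables and their contents are hypothetical; the sketch only shows rows and columns in two tables and one query that combines them.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each table has a fixed set of columns; every row follows that structure.
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE article ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER NOT NULL REFERENCES author(id))"
)

conn.executemany(
    "INSERT INTO author (id, name) VALUES (?, ?)",
    [(1, "ade"), (2, "wahyu")],
)
conn.executemany(
    "INSERT INTO article (id, title, author_id) VALUES (?, ?, ?)",
    [
        (1, "Sharding basics", 1),
        (2, "Replication basics", 1),
        (3, "Graph models", 2),
    ],
)

# The database, not the application, combines the two tables.
rows = conn.execute(
    "SELECT article.title, author.name"
    " FROM article JOIN author ON author.id = article.author_id"
    " ORDER BY article.id"
).fetchall()
conn.close()
```

Writing the query requires knowing the table and column names up front, which is the flip side of having a consistent structure.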

Relational Model

Relational Database Pros/Cons

● Pros

- simple, well-established, standard approach

- maps well to data with consistent structure

- has extensive join capabilities

● Cons

- hard to scale

- does not map well to semi-structured data

- knowledge of the database structure is required to create queries

Relational Database Implementation

● PostgreSQL

● MySQL

● SQLite

Graph Database

● Relationships are first-class citizens

● Built on a very old, fundamental theory (graph theory, from the 1700s)

● A huge number of graph algorithms already exist
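
As an illustration of relationships being stored directly and of reusing a classic graph algorithm, the sketch below keeps labelled, directed edges per node and finds a shortest path with breadth-first search. The `Graph` class, the relation labels `FOLLOWS` and `LIKES`, and the node names are hypothetical; real graph databases expose their own query languages and APIs.

```python
from collections import deque
from typing import Optional


class Graph:
    """Hypothetical directed graph with labelled relationships."""

    def __init__(self) -> None:
        # node -> list of (relation, neighbour) pairs
        self._edges = {}

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self._edges.setdefault(source, []).append((relation, target))
        self._edges.setdefault(target, [])  # register the target node too

    def neighbors(self, node: str, relation: Optional[str] = None) -> list:
        # Follow relationships directly from the node, optionally by label.
        return [
            target
            for label, target in self._edges.get(node, [])
            if relation is None or label == relation
        ]

    def shortest_path(self, start: str, goal: str) -> Optional[list]:
        """Breadth-first search; returns the node list or None."""
        if start not in self._edges or goal not in self._edges:
            return None
        previous = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = previous[node]
                return path[::-1]
            for neighbor in self.neighbors(node):
                if neighbor not in previous:
                    previous[neighbor] = node
                    queue.append(neighbor)
        return None


graph = Graph()
graph.add_edge("ade", "FOLLOWS", "wahyu")
graph.add_edge("wahyu", "FOLLOWS", "budi")
graph.add_edge("ade", "LIKES", "article:1")
path = graph.shortest_path("ade", "budi")
```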

Graph Model

Key Value to Graph

Document to Graph

Relational to Graph

Graph Database Pros/Cons

● Pros

- powerful data model

- easy to query (a relationship is a pointer to an object)

- maps well to semi-structured data

- can easily evolve schema

● Cons

- hard to scale

- lacks tool and framework support

- requires a new style of problem solving

Graph Database Implementation

● JGraphT (library)

● JUNG (library)

● Neo4j

● HyperGraphDB

● OrientDB

Sharding and Replication

● Handle huge amounts of data

● Handle high read/write traffic

What is Sharding?

● Sharding is NOT Master/Slave Database

● Sharding is NOT Replication

● Sharding is NOT Clustering

● Sharding is splitting data across databases

● The split data shares nothing

● Important issue: the choice of sharding key
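
One common way to route data by sharding key is to hash the key and take the result modulo the number of shards. The sketch below assumes that approach; the `ShardedStore` class is hypothetical and each shard is just a separate dictionary standing in for a separate database. It uses `hashlib` rather than Python's built-in `hash()`, because string hashes from `hash()` change between processes and would route the same key to different shards after a restart.

```python
import hashlib
from typing import Any, Optional


class ShardedStore:
    """Hypothetical store that splits data across independent shards."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each dict stands in for a separate database that shares nothing.
        self.shards = [{} for _ in range(shard_count)]

    def shard_index(self, sharding_key: str) -> int:
        # Stable hash of the key, reduced to a shard number.
        digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value: Any) -> None:
        self.shards[self.shard_index(key)][key] = value

    def get(self, key: str) -> Optional[Any]:
        return self.shards[self.shard_index(key)].get(key)


store = ShardedStore(shard_count=3)
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})
sizes = [len(shard) for shard in store.shards]
```

With plain modulo routing, changing the number of shards moves most keys to a different shard, which is one reason the sharding key and scheme deserve care.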

Sharding

[Diagram labels: Database, three Shards, Aggregation]
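
Because shards share nothing, a query that spans all the data has to be sent to every shard and the partial answers combined. A minimal sketch of such an aggregation step, assuming three hypothetical shards held as dictionaries and a made-up "count users per city" query:

```python
from collections import Counter

# Three hypothetical shards; each holds a disjoint part of the data.
shards = [
    {"user:1": {"city": "depok"}, "user:4": {"city": "jakarta"}},
    {"user:2": {"city": "depok"}},
    {"user:3": {"city": "bandung"}, "user:5": {"city": "depok"}},
]


def count_by_city_on_shard(shard: dict) -> Counter:
    """Partial result computed locally on a single shard."""
    return Counter(record["city"] for record in shard.values())


def count_by_city(all_shards: list) -> Counter:
    """Ask every shard, then merge the partial counts."""
    total = Counter()
    for shard in all_shards:
        total.update(count_by_city_on_shard(shard))
    return total


totals = count_by_city(shards)
```

Counts merge cleanly because they can simply be added; other queries, such as sorting or averages, need a merge step designed for them.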

Sharding Implementation

● Application side

● Middleware side

- Vitess

- Gizzard

● Server side

- MongoDB

Replication

● Creating and maintaining multiple copies of the same database

● Improves:

- reliability

- fault-tolerance

- accessibility

● Important issue: the strategy for synchronizing data between database replicas
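
The synchronization strategy determines what a reader can see on a replica. The sketch below contrasts two simple strategies under stated assumptions: one primary copy takes all writes, and replicas are updated either immediately (synchronous) or only when `sync()` is called (asynchronous). The `Replica` and `ReplicatedStore` classes are hypothetical and ignore failures, conflicts, and networking.

```python
from typing import Any, Optional


class Replica:
    """One copy of the database."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Hypothetical primary with replicas that receive copies of writes."""

    def __init__(self, replica_names: list, synchronous: bool = True) -> None:
        self.primary = Replica("primary")
        self.replicas = [Replica(name) for name in replica_names]
        self.synchronous = synchronous
        self._pending = []  # writes not yet applied to the replicas

    def write(self, key: str, value: Any) -> None:
        self.primary.data[key] = value
        self._pending.append((key, value))
        if self.synchronous:
            # Synchronous: replicas are updated before write() returns.
            self.sync()

    def sync(self) -> None:
        # Apply pending writes, in order, to every replica.
        for key, value in self._pending:
            for replica in self.replicas:
                replica.data[key] = value
        self._pending.clear()

    def read(self, key: str, replica_index: Optional[int] = None) -> Any:
        # Read from the primary by default, or from a chosen replica.
        source = (
            self.primary if replica_index is None
            else self.replicas[replica_index]
        )
        return source.data.get(key)


lazy = ReplicatedStore(["replica-a", "replica-b"], synchronous=False)
lazy.write("article:1", "draft")
stale = lazy.read("article:1", replica_index=0)   # replica not updated yet
lazy.sync()
fresh = lazy.read("article:1", replica_index=0)   # replica has caught up
```

Reads can be spread over the replicas to absorb traffic, at the cost of possibly stale results when synchronization is asynchronous.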

Scalability, Complexity and Database Model

[Chart: Key-Value Stores, Document Databases, Relational Databases, and Graph Databases plotted on Scalability and Complexity axes]

Questions?