OrientDB - the 2nd generation of (Multi-Model) NoSQL - Codemotion Warsaw 2016

And why Graph Databases are the starting point of this revolution

#OrientDB - @ldellaquila

Luigi Dell’Aquila
Core Developer and Director of Consulting, OrientDB LTD
Twitter: @ldellaquila
http://www.orientdb.com

“90% of the data in the world today has been created in the last two years alone.”

- IBM

Order #134 (Order)
John (Provider)
Commodore Amiga 1200 (Product)
Frank (Customer)
Monitor 40” (Product)
Mouse (Product)
Bruno (Provider)

Data by itself has little value; it is the relationships between data that give it incredible value.

[Figure: the same seven records, now connected by edges labeled Makes (Customer to Order), Has (Order to Product) and Sells (Provider to Product)]

Key/Value Databases
Document Databases
Graph Databases
Column Databases

Why do most NoSQL products avoid managing relationships?

Is this familiar?

Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow

What’s wrong with JOIN?

Joins are executed every time you cross relationships.

Querying millions of records while joining 3-4 tables could generate billions of combinations.

This is why database query performance suffers as the database increases in size: every index lookup behind a join costs O(log N).
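To make the O(log N) cost concrete, here is a minimal sketch that resolves a key through a sorted index with binary search and counts the comparisons. The `lookup` and `comparisons_for_last_key` helpers and the synthetic integer IDs are hypothetical; a real RDBMS typically uses B-tree indexes, but the logarithmic growth follows the same idea.

```python
import math


def lookup(sorted_ids, target):
    """Binary search over a sorted index; returns (position, comparisons)."""
    lo, hi = 0, len(sorted_ids) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_ids[mid] == target:
            return mid, comparisons
        if sorted_ids[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons  # key not present in the index


def comparisons_for_last_key(n):
    """Comparisons needed to find the last key in a synthetic index of size n."""
    index = list(range(n))  # stand-in for a sorted primary-key index
    return lookup(index, n - 1)[1]


# The cost of one lookup grows with the logarithm of the table size,
# and a join pays it for every record it has to match.
for size in (1_000, 1_000_000):
    print(size, comparisons_for_last_key(size), math.ceil(math.log2(size)))
```

Each additional table in a join repeats this lookup per matched record, which is why the cost compounds as the data grows.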

[Chart: RDBMS performance on traversal, plotting PERFORMANCE against DATABASE SIZE]

Solution: Graph Database!

Back to school: Graph Theory crash course

Basic Graph

Luigi --Visited--> Rome

Property Graph Model*

Luigi (company: OrientDB) --Visited (year: 2016)--> Rome (country: Italy)

• Vertices and Edges can have properties
• Edges are directed

*https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model

1-N and N-M Relationships

Vertices: Luigi, Rome
Edges: Visited (year: 2016), Worked (year: 2016)

• An Edge connects only 2 vertices
• Use multiple edges to represent 1-N and N-M relationships
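The property graph rules above can be sketched in a few lines of Python. This is only an illustration, not OrientDB's API: the `Vertex` and `Edge` classes, the `visited_by` helper and the extra `Warsaw` vertex are hypothetical, added to show how a 1-N relationship becomes several edges that each connect exactly two vertices.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_vertex: Vertex  # vertex the edge starts from
    in_vertex: Vertex   # vertex the edge points to
    properties: dict = field(default_factory=dict)


luigi = Vertex("Luigi", {"company": "OrientDB"})
rome = Vertex("Rome", {"country": "Italy"})
warsaw = Vertex("Warsaw")  # hypothetical second city, not in the slides

# One edge per relationship: a 1-N relationship is simply N edges.
edges = [
    Edge("Visited", luigi, rome, {"year": 2016}),
    Edge("Visited", luigi, warsaw, {"year": 2016}),
]


def visited_by(person, all_edges):
    """Names of the vertices reached from `person` through Visited edges."""
    return [
        e.in_vertex.name
        for e in all_edges
        if e.label == "Visited" and e.out_vertex is person
    ]


print(visited_by(luigi, edges))
```

Because edges are directed, asking the same question starting from `rome` returns nothing: Rome is the target of the edge, not its source.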

Congrats! This is your diploma in «Graph Theory»

How does a true* Graph Database manage relationships?

*a “Graph” layer on top of a DBMS doesn’t qualify as a true GraphDB

Each element in the Graph has its own immutable Record ID

Luigi (Vertex): #13:55
Visited, year: 2015 (Edge): #22:11
Rome (Vertex): #15:99

Connections use persistent pointers

Luigi (Vertex) #13:55: out = #22:11
Visited, on: 2015 (Edge) #22:11: out = #13:55, in = #15:99
Rome (Vertex) #15:99: in = #22:11
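A minimal sketch of how such pointers can be followed, assuming an in-memory dictionary as a stand-in for the storage layer. The `records` layout and the `neighbours` helper are hypothetical; only the Record IDs and the out/in values come from the slide.

```python
# Stand-in for the storage layer: Record ID -> record.
records = {
    "#13:55": {"type": "Vertex", "name": "Luigi", "out": ["#22:11"], "in": []},
    "#15:99": {"type": "Vertex", "name": "Rome", "out": [], "in": ["#22:11"]},
    "#22:11": {"type": "Edge", "label": "Visited", "on": 2015,
               "out": "#13:55", "in": "#15:99"},
}


def neighbours(rid, label):
    """Follow outgoing edges with the given label from the vertex `rid`."""
    result = []
    for edge_rid in records[rid]["out"]:   # pointers stored on the vertex itself
        edge = records[edge_rid]           # direct fetch by Record ID, no search
        if edge["label"] == label:
            target = records[edge["in"]]   # second direct fetch
            result.append(target["name"])
    return result


print(neighbours("#13:55", "Visited"))
```

No index is consulted and no join is computed: each hop is a fetch by Record ID, so its cost does not depend on how many other records exist.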

A Graph Database creates the relationship just once (when the edge is created)

VS

An RDBMS computes the relationship every time you query the database

When you move from an RDBMS to a Graph Database, crossing a relationship goes from O(log N) to near O(1).

With a Graph Database, traversal time is not affected by database size!

This is huge in the Big Data age.

No cost to traverse relationships:
• Recommendation engines
• Social Applications
• Spatial Apps
• Master Data Management
• Information Clustering

[Figure: example graph with vertices John, Thriller, Comedy, Pulp Fiction, Mr Bean, Theater A, Theater B, Theater C, NYC and San José, and edges labeled Lives in and Likes]

So the Graph Model is the only solution to efficiently manage relationships.

But what about data complexity? And data consistency? And scaling?

First Generation NoSQL

[Chart: Relational, Key Value, Column, Graph and Document databases plotted by Relationship Complexity and Data Complexity]

[Diagram: Application, ETL, Primary DB, RDBMS, Key/Value Store, Document Database, Graph Database]

- No standard between NoSQL products
- Multiple vendors = multiple skills
- ETL + synchronization code is costly to write and maintain
- Performance and Reliability are hard to predict

2nd Generation NoSQL is Multi-Model

What’s a Multi-Model DBMS?

Graph, Document, Object, Key/Value

Multi-Model represents the intersection of multiple models in just one product

- Just one product to learn and maintain
- Just one vendor relationship to manage
- No ETL, no synchronization required
- Performance and Reliability are easy to test from the beginning

{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}

Frank --Makes--> Order

General purpose solution:
• JSON
• Schema-less
• Schema-full
• Schema-hybrid
• Nested documents
• Rich indexing and querying
• Developer friendly
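As a small illustration of the document side, the sketch below parses the Customer document from the slide with Python's standard `json` module. It does not use any OrientDB driver; the `newsletter` field is a hypothetical addition to show what schema-less means in practice.

```python
import json

raw = """
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {"city": "London", "tags": "millennial"}
}
"""

customer = json.loads(raw)

# Nested (embedded) documents become plain nested dictionaries.
city = customer["details"]["city"]

# Schema-less: a new field can be added without declaring it first.
customer["newsletter"] = True  # hypothetical field, not in the slide

# In this document, fields starting with "@" carry record metadata.
user_fields = sorted(k for k in customer if not k.startswith("@"))

print(city, user_fields)
```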

Second Generation NoSQL

[Chart: Relational, Key Value, Column, Graph, Document and Multi-Model databases plotted by Relationship Complexity and Data Complexity]

With a true Graph, Document, Key/Value and Object Oriented engine

API & Standards

• Support for the TinkerPop standard for Graph DBs: Gremlin language and Blueprints API
• SQL + extensions for graphs
• JDBC driver to connect any BI tool
• HTTP/JSON support
• Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++ and more

Security

• Basic HTTP authentication (+HTTPS/SSL)
• User/Role authentication system. One User can have multiple Roles
• Privileges are managed in Roles
• Roles can inherit from other Roles
• Record-level security: every record can specify which users/roles can create/read/update/delete it
• Auditing available in Enterprise Edition
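The User/Role rules listed on this slide can be sketched as follows. This is a conceptual model only, not OrientDB's security API: the `Role` and `User` classes, the role names and the privilege strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    name: str
    privileges: set = field(default_factory=set)  # e.g. {"read", "update"}
    parent: Optional["Role"] = None               # role this one inherits from

    def allows(self, operation: str) -> bool:
        role = self
        while role is not None:  # walk up the inheritance chain
            if operation in role.privileges:
                return True
            role = role.parent
        return False


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)  # one user, multiple roles

    def can(self, operation: str) -> bool:
        return any(role.allows(operation) for role in self.roles)


reader = Role("reader", {"read"})
writer = Role("writer", {"create", "update"}, parent=reader)
frank = User("frank", [writer])

print(frank.can("read"), frank.can("delete"))
```

Here `frank` can read only because `writer` inherits from `reader`; nothing grants him `delete`.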

Encryption

• HTTPS/SSL
• Starting from OrientDB v2.2:
  • Support for Kerberos
  • Encryption at rest using AES and DES, of the entire database or portions
  • PBKDF2 hash algorithm with a 24-bit length salt per user for a configurable number of iterations
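For readers unfamiliar with salted PBKDF2 hashing, here is a minimal sketch using Python's standard `hashlib`. It is not OrientDB's implementation: the helper names, the SHA-256 digest, the salt size used here and the default iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 65_536  # assumed default; the slide only says "configurable"
SALT_BYTES = 24      # assumed salt size for this sketch


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, digest) to store alongside the user."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, expected)
```

Storing the salt and iteration count next to the digest lets the iteration count be raised later without invalidating existing users.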

Administration

• Full Backup and Restore
• Delta Backup and Restore (v2.2), available in Enterprise Edition
• Studio web tool
• Command line Console

Data Extraction and Loading

• Import/Export in JSON
• Import from SQL script
• OrientDB ETL tool (http://orientdb.com/docs/last/ETL-Introduction.html)
• Teleporter (v2.2)

Scale out and HA

• Multi-Master architecture
• Tunable consistency through a quorum, configurable per database or per single class (table)
• Synchronous and Asynchronous replication
• Zero config: if multicast is enabled, the server attaches to the cluster automatically
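The idea of quorum-based tunable consistency can be sketched like this. The function names and the majority default are assumptions for illustration and do not reflect OrientDB's actual configuration keys or behavior.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    return nodes // 2 + 1


def write_succeeds(acks: int, nodes: int, write_quorum=None) -> bool:
    """Accept a write once enough replicas have acknowledged it.

    `write_quorum` overrides the majority default, which is how
    consistency is traded against availability and latency.
    """
    required = majority(nodes) if write_quorum is None else write_quorum
    return acks >= required


# 3-node cluster: two acknowledgements reach the majority, one does not.
print(write_succeeds(2, 3), write_succeeds(1, 3))
# Lowering the quorum to 1 accepts the write with a single acknowledgement.
print(write_succeeds(1, 3, write_quorum=1))
```

A lower quorum makes writes faster and more available, at the price of weaker consistency guarantees between replicas.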

Multi-master Replication

[Diagram: two Master Nodes with connected clients (C)]

Atomic, Consistent, Isolated and Durable (ACID) multi-statement transactions

[Diagram: two Master Nodes with connected clients (C), plus an Auto-Discovered Node]

Architectures

• Single, stand-alone node
• Embedded (in-process) DB
• Multi-Master Replica
• Mixed

[Diagrams: Application and DB arrangements for each architecture, up to replica N in the replicated setups]

Features compared across OrientDB, MongoDB, Neo4j and MySQL (RDBMS); each X marks one supporting product:

Operational Database: X X X
Graph Database: X X
Document Database: X X
Object-Oriented Concepts: X
Schema-full, Schema-less, Schema mix: X
User and Role & Record Level Security: X
Record Level Locking: X X X
SQL: X X
ACID Transaction: X X X
Relationships (Linked Documents): X X X
Custom Data Types: X X X
Embedded Documents: X X
Multi-Master Zero Configuration Replication: X
Sharding: X X
Server Side Functions: X X X
Native HTTP REST/JSON: X X
Embeddable with No Restrictions: X

Udemy Getting Started Training is Free

http://www.orientechnologies.com/getting-started

OrientDB Enterprise is Free for Development

OrientDB Community is FREE for any purpose (APACHE 2 license)

DEMO

Thank you!

Luigi Dell’Aquila
@ldellaquila
http://www.orientdb.com

Q/A
