#OrientDB - @ldellaquila
OrientDB - the 2nd generation of (Multi-Model) NoSQL
And why Graph Databases are the starting point of this revolution
Luigi Dell’Aquila
Director of Consulting
OrientDB LTD
Twitter: @ldellaquila
http://www.orientdb.com
“90% of the data in the world today has been created in the last two years alone.”
- IBM
Order #134 (Order)
John (Provider)
Commodore Amiga 1200 (Product)
Frank (Customer)
Monitor 40” (Product)
Mouse (Product)
Bruno (Provider)
Data by itself has little value; it’s the relationships between data that give it incredible value
[Diagram: the items connected by labeled edges. Frank (Customer) Makes Order #134 (Order); the Order Has the Products (Commodore Amiga 1200, Monitor 40”, Mouse); the Providers (John, Bruno) Sell the Products]
Key/Value Databases
Document Databases
Graph Databases
Column Databases
Why do most NoSQL products avoid
managing relationships?
Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow

Is this familiar?
What’s wrong with JOIN?
Joins are executed every time you cross relationships
Querying millions of records while joining 3-4 tables could generate billions of combinations
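To make the cost concrete, here is a minimal Python sketch that uses the three small tables from the slide. It assumes a naive nested-loop join purely for illustration (real RDBMS engines use indexes and smarter join strategies) and counts how many row combinations get examined to find the matching ones.

```python
from itertools import product

# Rows copied from the Customer, CustomerAddress and Address tables on the slide.
customers = [(10, "John"), (11, "John"), (24, "Mike"), (28, "Mike")]
customer_addresses = [(10, 24), (10, 33), (32, 44)]
addresses = [(24, "Milan"), (33, "London"), (18, "Paris"), (18, "Madrid"), (44, "Moscow")]


def naive_join(customers, customer_addresses, addresses):
    """Nested-loop join: examine every combination of rows, keep the matches."""
    examined = 0
    matches = []
    for (cust_id, name), (link_cust, link_addr), (addr_id, location) in product(
        customers, customer_addresses, addresses
    ):
        examined += 1
        if cust_id == link_cust and link_addr == addr_id:
            matches.append((name, location))
    return examined, matches


examined, matches = naive_join(customers, customer_addresses, addresses)
print(examined)  # 4 * 3 * 5 = 60 combinations examined
print(matches)
```

The number of combinations is the product of the table sizes, which is why it explodes as the tables grow.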
This is why query performance suffers as the database increases in size
O(log N) per index lookup
[Chart: RDBMS performance on traversal, plotting performance against database size]
Solution: Graph Database!
Back to school: Graph Theory crash course
Basic Graph
Luigi --Visited--> Rome
Property Graph Model*
Vertices and Edges can have properties
Edges are directed
Luigi (company: OrientDB) --Visited (year: 2016)--> Rome (country: Italy)
*https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
1-N and N-M Relationships
An Edge connects only 2 vertices
Use multiple edges to represent 1-N and N-M relationships
Luigi --Visited (year: 2016)--> Rome
Luigi --Worked (year: 2016)--> Rome
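The property graph model described so far can be sketched in a few lines of Python. This is an illustrative in-memory model, not OrientDB's API: the `Vertex` and `Edge` classes and their field names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_vertex: Vertex  # where the edge starts
    in_vertex: Vertex   # where the edge ends
    properties: dict = field(default_factory=dict)


luigi = Vertex("Luigi", {"company": "OrientDB"})
rome = Vertex("Rome", {"country": "Italy"})

# Each edge connects exactly two vertices; two relationships need two edges.
edges = [
    Edge("Visited", luigi, rome, {"year": 2016}),
    Edge("Worked", luigi, rome, {"year": 2016}),
]

labels_from_luigi = [e.label for e in edges if e.out_vertex is luigi]
print(labels_from_luigi)
```

Adding a second city or a second person only means adding more edges, which is how 1-N and N-M relationships are expressed.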
Congrats! This is your diploma in «Graph Theory»
How does a true* Graph Database
manage relationships?
*a “Graph” layer on top of a DBMS doesn’t qualify as a true GraphDB
Each element in the Graph has its own immutable Record ID
Luigi (Vertex): #13:55
Visited, year: 2015 (Edge): #22:11
Rome (Vertex): #15:99
Connections use persistent pointers
Luigi (Vertex) #13:55: out = #22:11
Visited, on: 2015 (Edge) #22:11: out = #13:55, in = #15:99
Rome (Vertex) #15:99: in = #22:11
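The pointer layout on this slide can be mimicked with a plain Python dictionary keyed by Record ID. This is only a sketch: the `records` dictionary and the `neighbors_out` helper are hypothetical stand-ins for the storage layer, not OrientDB code.

```python
# Hypothetical in-memory stand-in for record storage: Record ID -> record.
records = {
    "#13:55": {"type": "Vertex", "name": "Luigi", "out": ["#22:11"], "in": []},
    "#15:99": {"type": "Vertex", "name": "Rome", "out": [], "in": ["#22:11"]},
    "#22:11": {"type": "Edge", "label": "Visited", "on": 2015,
               "out": "#13:55", "in": "#15:99"},
}


def neighbors_out(rid):
    """Follow outgoing edges from a vertex by dereferencing stored Record IDs."""
    result = []
    for edge_rid in records[rid]["out"]:
        edge = records[edge_rid]      # direct lookup, no search or join
        target = records[edge["in"]]  # the edge's "in" points at the target vertex
        result.append((edge["label"], target["name"]))
    return result


print(neighbors_out("#13:55"))
```

Each hop is a direct lookup by Record ID, so the traversal never scans or joins other records.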
A Graph Database creates the relationship just once
(when the edge is created)
VS
An RDBMS computes the relationship every time you query the database
When you move from an RDBMS to a Graph Database, you go from O(log N) to near O(1) for each relationship you traverse
With a Graph Database, the traversal time is not affected by database size!
This is huge in the Big Data age
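A rough way to see the difference is to count steps. The sketch below assumes that an index lookup behaves like a binary search over sorted keys, and that following a stored pointer is a single step; both functions are illustrative and do not measure any real database.

```python
def binary_search_steps(sorted_keys, key):
    """Return how many probes a binary search needs to find key."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == key:
            return steps
        if sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


def pointer_steps(pointers, position):
    """Following a stored pointer is a single step, whatever the size."""
    _ = pointers[position]
    return 1


for n in (1_000, 1_000_000):
    keys = list(range(n))
    print(n, binary_search_steps(keys, n - 1), pointer_steps(keys, n - 1))
```

The search cost grows with the logarithm of the number of keys, while the pointer cost stays constant.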
No cost to traverse relationships:
• Recommendation engines
• Social Applications
• Spatial Apps
• Master Data Management
• Information Clustering
[Diagram: a graph of John, Thriller, Comedy, Pulp Fiction, Mr Bean, Theater A, Theater B, Theater C, NYC and San José, connected by "Lives in" and "Likes" edges]
So the Graph Model is the only solution to efficiently manage relationships
But what about data complexity? And data consistency? And scaling?
First Generation NoSQL
[Chart: Relational, Key Value, Column, Graph and Document plotted by Relationship Complexity against Data Complexity]
[Diagram: an Application, a Primary DB and ETL processes spanning four separate products: RDBMS, Key/Value Store, Document Database, Graph Database]
- No standard between NoSQL products
- Multiple vendors = multiple skills
- ETL + synchronization code is costly to write and maintain
- Performance and reliability are hard to predict
2nd Generation NoSQL is
Multi-Model
What’s a Multi-Model DBMS?
Multi-Model represents the intersection of multiple models in just one product
[Diagram: intersection of Graph, Document, Object and Key/Value]
- Just one product to learn and maintain
- Just one vendor relationship to manage
- No ETL, no synchronization required
- Performance and reliability are easy to test from the beginning
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}
Frank --Makes--> Order
General purpose solution:
• JSON
• Schema-less
• Schema-full
• Schema-hybrid
• Nested documents
• Rich indexing and querying
• Developer friendly
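As a small illustration of the document side, the Customer record from this slide can be loaded and navigated with Python's standard `json` module. This is plain Python working on the JSON text, not an OrientDB driver call; the `newsletter` field is a made-up example of a schema-less addition.

```python
import json

# The Customer document from the slide, with straight quotes so it parses as JSON.
raw = """
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {"city": "London", "tags": "millennial"}
}
"""

customer = json.loads(raw)

# Nested documents are reached by walking the embedded structure.
city = customer["details"]["city"]

# Schema-less: a new field can be added without declaring it first.
customer["newsletter"] = True

print(customer["@class"], customer["name"], city)
```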
Second Generation NoSQL
[Chart: Relational, Key Value, Column, Graph, Document and Multi-Model plotted by Relationship Complexity against Data Complexity]
With a true Graph, Document, Key/Value and Object Oriented engine
API & Standards
• Support for TinkerPop standard for Graph DB: Gremlin language and Blueprints API
• SQL + extensions for graphs
• JDBC driver to connect any BI tool
• HTTP/JSON support
• Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++ and more
Requirements and Dependencies
• OrientDB footprint is minimal and the embedded version can run with a few MB of RAM
• OrientDB requires a Java Runtime
• When run distributed, OrientDB uses the embedded Hazelcast library (Apache 2 licensed)
Security
• Basic HTTP authentication (+HTTPS/SSL)
• User/Role authentication system. One User can have multiple Roles
• Privileges are managed in Roles
• Roles can inherit from other Roles
• Record-level security: every record can specify which users/roles can create/read/update/delete it
• Auditing available in Enterprise Edition
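The user/role rules listed on this slide can be sketched as a tiny Python model. Everything here is hypothetical (the `Role` and `User` classes, the role names, the privilege strings); it only illustrates privileges living in roles, roles inheriting from other roles, and one user holding several roles.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    name: str
    privileges: set = field(default_factory=set)  # e.g. {"read", "update"}
    parent: Optional["Role"] = None               # role this one inherits from

    def allows(self, operation):
        """A role allows an operation if it, or any ancestor role, grants it."""
        role = self
        while role is not None:
            if operation in role.privileges:
                return True
            role = role.parent
        return False


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)     # one user, multiple roles

    def can(self, operation):
        return any(role.allows(operation) for role in self.roles)


reader = Role("reader", {"read"})
writer = Role("writer", {"create", "update"}, parent=reader)
frank = User("frank", [writer])

print(frank.can("read"), frank.can("delete"))
```

Here `frank` can read only because `writer` inherits from `reader`; nothing grants him delete.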
Encryption
• HTTPS/SSL
• Starting from OrientDB v2.2:
  • Support for Kerberos
  • Encryption at rest using AES and DES of the entire database or portions
  • PBKDF2 hash algorithm with a 24-bit length salt per user for a configurable number of iterations
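Salted PBKDF2 hashing can be sketched with Python's standard library. This is not OrientDB's implementation: the HMAC-SHA256 choice and the iteration count are assumptions, and the salt length simply follows the 24-bit figure quoted on the slide.

```python
import hashlib
import hmac
import os

SALT_BYTES = 3       # 24 bits, the salt length quoted on the slide
ITERATIONS = 65_536  # assumed value; the slide only says "configurable"


def hash_password(password, iterations=ITERATIONS):
    """Derive a hash with PBKDF2 using a fresh random salt for this user."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, iterations, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, iterations, stored))
print(verify_password("wrong", salt, iterations, stored))
```

Storing the salt and iteration count next to the hash lets the iteration count be raised later without invalidating existing users.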
Administration
• Full Backup and Restore
• Delta Backup and Restore (v2.2, Enterprise Edition)
• Studio web tool
• Command line Console
Data Extraction and Loading
• Import/Export in JSON
• Import from SQL script
• OrientDB ETL tool (http://orientdb.com/docs/last/ETL-Introduction.html)
• Teleporter (v2.2)
Scale out and HA
• Multi-Master architecture
• Tunable consistency through the use of a quorum, per database or per single class (table)
• Synchronous and Asynchronous replication
• Zero config: if multicast is enabled, the server is attached to the cluster automatically
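The idea of tuning consistency with a quorum can be sketched generically. The functions below are a hypothetical model of quorum-based writes, not OrientDB's configuration or replication code: a write is accepted once enough nodes acknowledge it, and the threshold is the tunable part.

```python
def majority(total_nodes):
    """Smallest number of nodes that forms a majority."""
    return total_nodes // 2 + 1


def write_succeeds(acks, total_nodes, quorum=None):
    """A write is accepted once at least `quorum` nodes acknowledge it.

    quorum=None means "majority"; a smaller value favors availability,
    a larger one favors consistency.
    """
    required = majority(total_nodes) if quorum is None else quorum
    return acks >= required


print(write_succeeds(acks=2, total_nodes=3))            # majority of 3 is 2
print(write_succeeds(acks=1, total_nodes=3))
print(write_succeeds(acks=1, total_nodes=3, quorum=1))  # relaxed quorum
```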
Multi-master Replication
[Diagram: two Master Nodes, each connected to several nodes labeled C]
Atomic, Consistent, Isolated and Durable (ACID) multi-statement transactions
[Diagram: two Master Nodes, several nodes labeled C, and an additional Auto-Discovered Node]
Architectures
• Single, stand-alone node
• Embedded (in-process) DB
• Multi-Master Replica
• Mixed
[Diagrams: Application and DB layouts for each architecture; the replicated layouts show several DB instances, up to replica N]
FEATURES  ORIENTDB  MONGODB  NEO4J  MYSQL (RDBMS)
Operational Database X X X
Graph Database X X
Document Database X X
Object-Oriented Concepts X
Schema-full, Schema-less, Schema mix X
User and Role & Record Level Security X
Record Level Locking X X X
SQL X X
ACID Transaction X X X
Relationships (Linked Documents) X X X
Custom Data Types X X X
Embedded Documents X X
Multi-Master Zero Configuration Replication X
Sharding X X
Server Side Functions X X X
Native HTTP REST/JSON X X
Embeddable with No Restrictions X
Udemy Getting Started Training is Free
http://www.orientechnologies.com/getting-started
OrientDB Enterprise is Free for Development
OrientDB Community is FREE for any purpose (APACHE 2 license)
DEMO
Thank you!
Luigi Dell’Aquila
@ldellaquila
http://www.orientdb.com
Q/A
50,000 Downloads per Month from 200+ countries
70+ Committers contributing to the product
1000s of Users from SMBs to Fortune 10 Companies
17+ Years of Research have been put into the product