Graph Database, a little connected tour - Castano

Slides from Francisco Fernández Castaño's talk at Codemotion Roma 2014

Graph Databases: A little connected tour

Francisco Fernández Castaño, @fcofdezc, Software Engineer @biicode

Beginning

The old town of Königsberg has seven bridges:

Can you take a walk through town, visiting each part of the town and crossing each bridge only once?

The origin

G = (V, E)
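The bridge question can be checked with a few lines of Python. This is a minimal sketch, assuming the classical layout of the seven bridges; the single-letter vertex names and the function names are labels chosen for this example, not something from the slides. It relies on the standard result that a connected multigraph has a walk using every edge exactly once only if zero or two vertices have odd degree.

```python
from collections import Counter

# The four land masses of Königsberg and its seven bridges, as a multigraph
# G = (V, E). The vertex names A-D are labels chosen for this sketch.
BRIDGES = [
    ("A", "C"), ("A", "C"),  # north bank <-> central island (two bridges)
    ("B", "C"), ("B", "C"),  # south bank <-> central island (two bridges)
    ("A", "D"), ("B", "D"),  # each bank <-> eastern island
    ("C", "D"),              # central island <-> eastern island
]


def odd_degree_vertices(edges):
    """Return the vertices touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """True if a connected multigraph can be walked using each edge once.

    Assumes the graph is connected; only the degree condition is checked.
    """
    return len(odd_degree_vertices(edges)) in (0, 2)
```

With the bridge list above, all four land masses have odd degree, so `has_euler_walk(BRIDGES)` is false: the walk is impossible.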

What is a Graph DB?

[Diagram: a Graph stores Nodes and Relationships; Relationships connect Nodes; Nodes and Relationships have Properties]
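The property-graph model described above can be sketched as plain Python data structures. This is an illustrative sketch only, not how Neo4j stores data internally; the class names, the `add_node` and `connect` helpers, and the `Artist` and `Band` labels in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(eq=False)
class Node:
    labels: List[str]
    properties: Dict[str, Any] = field(default_factory=dict)
    # Relationships touching this node are kept on the node itself.
    relationships: List["Relationship"] = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    type: str
    start: Node
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass(eq=False)
class Graph:
    nodes: List[Node] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def add_node(self, *labels, **properties):
        """Store a node with its labels and properties."""
        node = Node(list(labels), properties)
        self.nodes.append(node)
        return node

    def connect(self, start, rel_type, end, **properties):
        """Store a directed relationship and register it on both end nodes."""
        rel = Relationship(rel_type, start, end, properties)
        self.relationships.append(rel)
        start.relationships.append(rel)
        end.relationships.append(rel)
        return rel
```

For example, `g = Graph()`, `clapton = g.add_node("Artist", name="Eric Clapton")`, `cream = g.add_node("Band", name="Cream")` and `g.connect(clapton, "play_in", cream, date=1968)` build the same small graph used in the Cypher slides.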

Written in Java

ACID

REST interface

Cypher

Why NoSQL?

The value of Relational Databases

Advantages of relational DBs

Persistence

Concurrency

Integration

Standard

Drawbacks of relational DBs

Impedance mismatch

class Client < ActiveRecord::Base
  has_one :address
  has_many :orders
  has_and_belongs_to_many :roles
end

Disadvantages of relational DBs

Friction (impedance mismatch)

Interoperability

Adaptation to change

Scalability

Not intended for certain scenarios

Modelling connected data the traditional, relational way is artificial

MySQL vs Neo4j (source: Neo4j in Action)

Depth   MySQL time (s)   Neo4j time (s)   Results
2       0.016            0.01             ~2,500
3       30.267           0.168            ~110,000
4       1543.505         1.359            ~600,000
5       Did not finish   2.132            ~800,000

Person

Id   Person
1    Frank
2    John
..   …
99   Alice

PersonFriend

PersonID   FriendID
1          2
2          1
..         …
99         2

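Finding one person's friends: O(log n) with an index lookup on the join table, O(1) by following relationships in the graph

Traversing m friendships: O(m log n) with the join table, O(m) in the graph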
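The two lookup costs can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: a sorted list stands in for the index on the PersonFriend join table, a dictionary of neighbour lists stands in for the graph's direct references, and the row `(2, 99)` and all function names are hypothetical additions for the example.

```python
from bisect import bisect_left

# Friendship rows in the shape of the PersonFriend join table, kept sorted
# the way an index would keep them. The row (2, 99) is added for this sketch.
PERSON_FRIEND = sorted([(1, 2), (2, 1), (2, 99), (99, 2)])


def friends_via_index(person_id):
    """Relational style: binary-search the sorted rows, O(log n) per lookup."""
    i = bisect_left(PERSON_FRIEND, (person_id,))
    found = []
    while i < len(PERSON_FRIEND) and PERSON_FRIEND[i][0] == person_id:
        found.append(PERSON_FRIEND[i][1])
        i += 1
    return found


# Graph style: each person holds its own list of neighbours, so finding them
# does not depend on how many rows exist in total.
ADJACENCY = {}
for person_id, friend_id in PERSON_FRIEND:
    ADJACENCY.setdefault(person_id, []).append(friend_id)


def friends_via_adjacency(person_id):
    """Graph style: follow the stored references, O(1) on average."""
    return ADJACENCY.get(person_id, [])


def friends_at_depth(start, depth, neighbours):
    """People reachable in exactly `depth` hops, using either lookup."""
    frontier = {start}
    for _ in range(depth):
        frontier = {f for p in frontier for f in neighbours(p)}
    return frontier
```

Both lookups return the same people; only the cost per hop differs, which is what compounds as the traversal depth grows.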

We can transform our domain model in a natural way

Use cases

Social Networks

[Diagram: users John, Jeff and Douglas connected by Follow relationships]

Geospatial problems

Fraud detection

Authorization

Network management

Cypher

Declarative language

ASCII oriented

Pattern matching

[Diagram: Neo4j API layers, from top to bottom: Cypher, Traverser API, Core API, Kernel]

Cypher

[Diagram: node a pointing to node b]

(a)-->(b)

[Diagram: node clapton pointing to node cream through a play_in relationship]

(clapton)-[:play_in]->(cream)

Cypher

[Diagram: John follows Jeff; Douglas follows John]

(john:User)-[:FOLLOW]->(jeff:User)
(douglas:User)-[:FOLLOW]->(john:User)

Cypher

[Diagram: node clapton {name: Eric Clapton} linked to node cream by play_in {date: 1968}; node blues linked to cream by labeled]

(clapton)-[:play_in]->(cream)<-[:labeled]-(blues)

Cypher

MATCH (a)-->(b)
RETURN a, b;

MATCH (a)-[:PLAY_IN]->(b)
RETURN a, b;

Cypher

MATCH (a)-[t:PLAY_IN]->(g),
      (g)<-[:LABELED]-(e)
RETURN a.name, t.date, e.name;

MATCH (c {name: 'clapton'})-[t:PLAY_IN]->(g),
      (g)<-[:LABELED]-(e)
RETURN c.name, t.date, e.name;

Cypher

MATCH (c {name: 'clapton'})-[t:PLAY_IN]->(g),
      (g)<-[:LABELED]-(e {name: 'blues'})
RETURN c.name, e.name
ORDER BY t.date

MATCH (c {name: 'clapton'})-[r:PLAY_IN|PRODUCE]->(g),
      (g)<-[:LABELED]-(e {name: 'blues'})
WHERE r.date > 1968
RETURN c.name, e.name
ORDER BY r.date

Cypher

MATCH (carlo)-[:KNOW*5]->(john)

MATCH p = (startNode:Station {name: 'Sol'})-[rels:CONNECTED_TO*]->(endNode:Station {name: 'Retiro'})
RETURN p AS shortestPath,
       reduce(weight = 0, r IN rels | weight + r.weight) AS tWeight
ORDER BY tWeight ASC
LIMIT 1
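The station query above enumerates every path between the two stations, sums the relationship weights, and keeps the cheapest one. The same result can be sketched in Python with Dijkstra's algorithm, which is a different technique from the query's enumerate-and-sort approach and assumes non-negative weights. Only the station names 'Sol' and 'Retiro' come from the slide; the intermediate stations, the weights, and the function name are invented for this example.

```python
import heapq

# Hypothetical directed CONNECTED_TO edges, each with a `weight`.
# Only 'Sol' and 'Retiro' come from the slide; the rest is made up.
CONNECTED_TO = {
    "Sol": [("Sevilla", 2), ("Opera", 1)],
    "Sevilla": [("Banco de España", 2)],
    "Opera": [("Banco de España", 6)],
    "Banco de España": [("Retiro", 1)],
    "Retiro": [],
}


def cheapest_path(graph, start, end):
    """Return (total_weight, [stations]) for the lowest-weight path, or None."""
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        weight, station, path = heapq.heappop(queue)
        if station == end:
            return weight, path
        if station in settled:
            continue
        settled.add(station)
        for neighbour, edge_weight in graph.get(station, []):
            if neighbour not in settled:
                heapq.heappush(
                    queue,
                    (weight + edge_weight, neighbour, path + [neighbour]),
                )
    return None
```

On this made-up network, `cheapest_path(CONNECTED_TO, "Sol", "Retiro")` goes through Sevilla and Banco de España with a total weight of 5.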

Recommendation System

Social network

Movies social network

Users rate movies

People act in movies

People direct movies

Users follow other users

Movies social network

How do we model it?

[Diagram: a User follows another User (Follow); a User rates a Film (Rate {stars}); an Actor acts in a Film (Act in); a Director directs a Film (Direct)]

Movies social network

MATCH (fran:User {name: 'Fran'})-[own:Rate]->(pf:Film {title: 'Pulp Fiction'}),
      (pf)<-[:Rate]-(other_users)-[:Rate]->(other_films)
RETURN DISTINCT other_films.title;

[Diagram: Fran, User 1 and User 2 each rate the film Pulp Fiction (Rate {stars}); User 1 and User 2 each also rate another Film (Rate {stars})]

Movies social network

MATCH (fran:User {name: 'Fran'})-[own:Rate]->(pf:Film {title: 'Pulp Fiction'}),
      (pf)<-[:Rate]-(other_users)-[r:Rate]->(other_films)
WHERE own.stars = r.stars
RETURN DISTINCT other_films.title;
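The logic of the query above can be restated in Python to make the filter explicit: take the stars Fran gave Pulp Fiction, find other users who also rated Pulp Fiction, and keep the other films they rated with that same number of stars. This is a sketch on invented data; only 'Fran' and 'Pulp Fiction' appear in the slides, and the other users, films, ratings and the function name `recommend` are hypothetical.

```python
# Hypothetical ratings: user -> {film title: stars}. Only 'Fran' and
# 'Pulp Fiction' appear in the slides; everything else is invented.
RATINGS = {
    "Fran": {"Pulp Fiction": 5},
    "Ana": {"Pulp Fiction": 5, "Reservoir Dogs": 5, "Titanic": 2},
    "Luis": {"Pulp Fiction": 3, "Jackie Brown": 4},
}


def recommend(ratings, user, seed_film):
    """Films that co-raters of `seed_film` gave the stars `user` gave it."""
    own_stars = ratings[user][seed_film]
    titles = set()
    for other, films in ratings.items():
        # Only other users who also rated the seed film are considered.
        if other == user or seed_film not in films:
            continue
        for title, stars in films.items():
            if title != seed_film and stars == own_stars:
                titles.add(title)
    return sorted(titles)
```

As in the query, the co-rater's own rating of the seed film is not constrained; only their rating of the recommended film has to match.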

Movies social network

MATCH (fran:User {name: 'Fran'})-[own:Rate]->(pf:Film {title: 'Pulp Fiction'}),
      (pf)<-[:Rate]-(other_users)-[r:Rate]->(other_films),
      (other_users)-[:FOLLOW]-(fran)
WHERE own.stars = r.stars
RETURN DISTINCT other_films.title;

[Diagram: Fran and User 1 both rate the film Pulp Fiction (Rate {stars}); User 1 also rates another Film; Fran and User 1 are linked by Follow]

Movies social network

MATCH (tarantino:User {name: 'Quentin Tarantino'}),
      (tarantino)-[:DIRECT]->(movie)<-[:ACT_IN]-(tarantino)
RETURN movie.title

[Diagram: an Actor is linked to a Film by Act_in; a Director is linked to the same Film by Direct]

Movies social network

Now we should be able to categorize the movies

[Diagram: a Film belongs to two SubGenres (Belongs_to); each SubGenre belongs to a Genre (Belongs_to)]

Movies social network

MATCH (fran:User {name: 'Fran'})-[own:Rate]->(pf:Film {title: 'Pulp Fiction'}),
      (pf)<-[:Rate]-(other_users)-[r:Rate]->(other_films),
      (pf)-[:BELONGS_TO*3]->(genre)<-[:BELONGS_TO]-(other_films),
      (other_users)-[:FOLLOW]-(fran)
WHERE own.stars = r.stars
RETURN DISTINCT other_films.title;

Neo4j extensions

Managed

Unmanaged

Drivers/Clients

"Instead of just picking a relational database because everyone does, we need to understand the nature of the data we're storing and how we want to manipulate it."

Martin Fowler

References

Neo4j as a service

http://www.graphenedb.com

Thank you
