Eifrem neo4j

presentation on Nodes and relationships

Page 1: Eifrem neo4j
Page 2: Eifrem neo4j

Neo4j: the benefits of graph databases

Page 3: Eifrem neo4j

What's the plan?

Why now? – Four trends

NoSQL overview

Graph databases && Neo4j

Conclusions

Food

Page 4: Eifrem neo4j

Trend 1: data set size

[Chart: data set size over time. Recoverable labels: 2007 and 2010; values: 40 and 988.]

Page 5: Eifrem neo4j

Trend 2: connectedness

[Chart: information connectivity (y-axis) over time (x-axis: 1990, 2000, 2010, 2020; eras: web 1.0, web 2.0, “web 3.0”). Labels on the curve: text documents, hypertext, blogs, RSS, wikis, user-generated content, tagging, folksonomies, ontologies, RDF, Giant Global Graph (GGG).]

Page 6: Eifrem neo4j

Trend 3: semi-structure

Individualization of content!

In the salary lists of the 1970s, all elements had exactly one job

In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?

Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)

Page 7: Eifrem neo4j

Aside: RDBMS performance

[Chart: relational database performance (y-axis) against data complexity (x-axis). Labeled points: salary list, majority of webapps, social network, semantic trading. A bracket labeled “custom” marks part of the range.]

Page 8: Eifrem neo4j

Trend 4: architecture

1990s: Database as integration hub

Page 9: Eifrem neo4j

Trend 4: architecture

2000s: (Slowly towards...)

Decoupled services with own backend

Page 10: Eifrem neo4j

Why NoSQL 2009?

Trend 1: Size.

Trend 2: Connectivity.

Trend 3: Semi-structure.

Trend 4: Architecture.

Page 11: Eifrem neo4j

NoSQL overview

Page 12: Eifrem neo4j

First off: the damn name

NoSQL is NOT “Never SQL”

NoSQL is NOT “No To SQL”

NoSQL is NOT “WE HATE CHRIS' DOG”

Page 13: Eifrem neo4j

NoSQL is simply Not Only SQL!

Page 14: Eifrem neo4j

Four (emerging) NoSQL categories

Key-value stores

Based on Amazon's Dynamo paper

Data model: (global) collection of K-V pairs

Example: Dynomite, Voldemort, Tokyo

BigTable clones

Based on Google's BigTable paper

Data model: big table, column families

Example: HBase, Hypertable

Page 15: Eifrem neo4j

Four (emerging) NoSQL categories

Document databases

Inspired by Lotus Notes

Data model: collections of K-V collections

Example: CouchDB, MongoDB

Graph databases

Inspired by Euler & graph theory

Data model: nodes, rels, K-V on both

Example: AllegroGraph, VertexDB, Neo4j
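
The four data models above can be sketched with plain Java collections. This is only an illustration of the shapes involved: the class and method names (DataModelSketch, keyValueStore, and so on) and the sample values are assumptions made for this sketch, and none of them come from the products listed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java collections standing in for the four data models.
// None of these types come from the products named on the slides.
class DataModelSketch {

    // Key-value store: one (global) collection of key-value pairs.
    static Map<String, String> keyValueStore() {
        Map<String, String> store = new HashMap<>();
        store.put("person:1", "Emil");
        return store;
    }

    // BigTable clone: row key -> column family -> column -> value.
    static Map<String, Map<String, Map<String, String>>> bigTable() {
        Map<String, String> profileFamily = new HashMap<>();
        profileFamily.put("name", "Emil");
        Map<String, Map<String, String>> row = new HashMap<>();
        row.put("profile", profileFamily);
        Map<String, Map<String, Map<String, String>>> table = new HashMap<>();
        table.put("person:1", row);
        return table;
    }

    // Document database: a collection of key-value collections.
    static List<Map<String, Object>> documentCollection() {
        Map<String, Object> document = new HashMap<>();
        document.put("name", "Emil");
        document.put("age", 29);
        List<Map<String, Object>> collection = new ArrayList<>();
        collection.add(document);
        return collection;
    }

    // Graph database: nodes and relationships, with key-value properties on both.
    static final class Node {
        final Map<String, Object> properties = new HashMap<>();
    }

    static final class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Builds two nodes joined by one relationship; the second person is hypothetical.
    static Relationship graph() {
        Node emil = new Node();
        emil.properties.put("name", "Emil");
        Node friend = new Node();
        friend.properties.put("name", "Friend");
        Relationship knows = new Relationship(emil, friend, "KNOWS");
        knows.properties.put("time", "4 years");
        return knows;
    }
}
```

The nesting depth is the point of the comparison: one flat map, a map of maps of maps, a list of maps, and finally first-class relationships that carry their own properties.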

Page 16: Eifrem neo4j

NoSQL data models

[Chart: NoSQL data models plotted by size (y-axis) against complexity (x-axis). Labeled: key-value stores, Bigtable clones, document databases, graph databases.]

Page 17: Eifrem neo4j

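NoSQL data models

[Chart: NoSQL data models plotted by size (y-axis) against complexity (x-axis). Labeled: key-value stores, Bigtable clones, document databases, graph databases. A marker labeled “90% of use cases” carries the note “(This is still […] of nodes & relationships)”.]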

Page 18: Eifrem neo4j
Page 19: Eifrem neo4j

Graph DBs & Neo4j intro

Page 20: Eifrem neo4j

The Graph DB model: representation

Core abstractions:

Nodes

Relationships between nodes

Properties on both

[Diagram: example property sets]

name = “Emil”, age = 29, sex = “yes”

type = KNOWS, time = 4 years

type = car, vendor = “SAAB”, model = “95 Aero”

Page 21: Eifrem neo4j

Example: The Matrix

[Graph diagram. Nodes: Thomas Anderson (age = 29); Morpheus (rank = “Captain”, occupation = “Total badass”); Trinity; Cypher (last name = “Reagan”); Agent Smith (version = 1.0b, language = C++); The Architect. Relationships: five KNOWS and one CODED_BY. Relationship properties shown: age = 3 days; disclosure = public; disclosure = secret, age = 6 months.]

Page 22: Eifrem neo4j

Code (1): Building a node space

NeoService neo = ... // Get factory

// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly

Page 23: Eifrem neo4j

Code (1): Building a node space

NeoService neo = ... // Get factory
Transaction tx = neo.beginTx();

// Create Thomas 'Neo' Anderson
Node mrAnderson = neo.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = neo.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly

tx.commit();

Page 24: Eifrem neo4j

Code (1b): Defining RelationshipTypes

// In package org.neo4j.api.core
public interface RelationshipType
{
    String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
    private final String name;
    MyDynamicRelType( String name ){ this.name = name; }
    public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
    KNOWS,
    WORKS_FOR,
}
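
A self-contained sketch of how the two flavours can be used side by side. The RelationshipType interface is redeclared locally so the example compiles without Neo4j on the classpath, and RelTypeDemo.sameType is a hypothetical helper that compares types by name; it is an assumption for illustration and not a statement about how Neo4j compares relationship types internally.

```java
import java.util.Objects;

// Local stand-in mirroring the interface shown on the slide,
// so that this sketch compiles on its own.
interface RelationshipType {
    String name();
}

// Static flavour: the enum's built-in name() method satisfies the interface.
enum MyStaticRelTypes implements RelationshipType {
    KNOWS,
    WORKS_FOR
}

// Dynamic flavour: the type name is only known at runtime.
final class MyDynamicRelType implements RelationshipType {
    private final String name;

    MyDynamicRelType(String name) {
        this.name = Objects.requireNonNull(name);
    }

    @Override
    public String name() {
        return name;
    }
}

class RelTypeDemo {
    // Hypothetical helper: treats two types as the same when their names match.
    static boolean sameType(RelationshipType a, RelationshipType b) {
        return a.name().equals(b.name());
    }

    public static void main(String[] args) {
        RelationshipType dynamic = new MyDynamicRelType("KNOWS");
        System.out.println(sameType(dynamic, MyStaticRelTypes.KNOWS));     // prints "true"
        System.out.println(sameType(dynamic, MyStaticRelTypes.WORKS_FOR)); // prints "false"
    }
}
```

The enum gives compile-time checking for a fixed vocabulary; the dynamic class suits types that are read from configuration or user input.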

Page 25: Eifrem neo4j

Whiteboard friendly

[Whiteboard sketch. Nodes: Björn, Big Car, DayCare. Relationship labels: owns, drives, build.]

Page 26: Eifrem neo4j

The Graph DB model: traversal

Traverser framework for high-performance traversing across the node space

[Diagram: example property sets]

name = “Emil”, age = 29, sex = “yes”

type = KNOWS, time = 4 years

type = car, vendor = “SAAB”, model = “95 Aero”

Page 27: Eifrem neo4j

Example: Mr Anderson’s friends

[Graph diagram: the Matrix graph again. Nodes: Thomas Anderson (age = 29); Morpheus (rank = “Captain”, occupation = “Total badass”); Trinity; Cypher (last name = “Reagan”); Agent Smith (version = 1.0b, language = C++); The Architect. Relationships: five KNOWS and one CODED_BY. Relationship properties shown: age = 3 days; disclosure = public; disclosure = secret, age = 6 months.]

Page 28: Eifrem neo4j

Code (2): Traversing a node space

// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        friendsTraverser.currentPosition().getDepth(),
        friend.getProperty( "name" ) );
}

Page 29: Eifrem neo4j

$ bin/start-neo-example
Mr Anderson's friends:

At depth 1 => Morpheus
At depth 1 => Trinity
At depth 2 => Cypher
At depth 3 => Agent Smith
$

friendsTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    ReturnableEvaluator.ALL_BUT_START_NODE,
    RelTypes.KNOWS,
    Direction.OUTGOING );

[Graph diagram: the Matrix graph again (Thomas Anderson, Morpheus, Trinity, Cypher, Agent Smith, The Architect; five KNOWS and one CODED_BY relationship).]
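
What the traverser does here can be sketched without Neo4j as a plain breadth-first walk that records the depth at which each node is first reached. The class FriendsTraversalSketch and its methods are hypothetical names for this sketch, and the edge layout in knowsGraph is an assumption inferred from the printed depths, since the diagram's arrows did not survive in this transcript.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FriendsTraversalSketch {

    // Outgoing KNOWS edges only. The layout is an assumption that is
    // consistent with the depths printed on the slide.
    static Map<String, List<String>> knowsGraph() {
        Map<String, List<String>> knows = new LinkedHashMap<>();
        knows.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        knows.put("Morpheus", Arrays.asList("Trinity", "Cypher"));
        knows.put("Cypher", Arrays.asList("Agent Smith"));
        return knows;
    }

    // Breadth-first walk to the end of the graph. Returns every reachable node
    // except the start, mapped to the depth at which it was first reached.
    static Map<String, Integer> friendsByDepth(Map<String, List<String>> knows, String start) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String friend : knows.getOrDefault(current, Collections.emptyList())) {
                if (!depth.containsKey(friend)) {
                    depth.put(friend, depth.get(current) + 1);
                    queue.add(friend);
                }
            }
        }
        depth.remove(start); // plays the role of ALL_BUT_START_NODE
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends:");
        friendsByDepth(knowsGraph(), "Thomas Anderson")
            .forEach((name, d) -> System.out.printf("At depth %d => %s%n", d, name));
    }
}
```

Only KNOWS edges are followed, which is why The Architect, reachable solely through CODED_BY, never appears in the result.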

Page 30: Eifrem neo4j

Example: Friends in love?

[Graph diagram: the Matrix graph again (Thomas Anderson, Morpheus, Trinity, Cypher, Agent Smith, The Architect; five KNOWS and one CODED_BY relationship), now with one added LOVES relationship.]

Page 31: Eifrem neo4j

Code (3a): Custom traverser

// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
    Traverser.Order.BREADTH_FIRST,
    StopEvaluator.END_OF_GRAPH,
    new ReturnableEvaluator()
    {
        public boolean isReturnableNode( TraversalPosition pos )
        {
            return pos.currentNode().hasRelationship(
                RelTypes.LOVES, Direction.OUTGOING );
        }
    },
    RelTypes.KNOWS,
    Direction.OUTGOING );

Page 32: Eifrem neo4j

Code (3a): Custom traverser

// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
    System.out.printf( "At depth %d => %s%n",
        loveTraverser.currentPosition().getDepth(),
        person.getProperty( "name" ) );
}

Page 33: Eifrem neo4j

new ReturnableEvaluator()
{
    public boolean isReturnableNode( TraversalPosition pos )
    {
        return pos.currentNode().hasRelationship(
            RelTypes.LOVES, Direction.OUTGOING );
    }
},

$ bin/start-neo-example
Who’s a lover?

At depth 1 => Trinity
$

[Graph diagram: the Matrix graph again (Thomas Anderson, Morpheus, Trinity, Cypher, Agent Smith, The Architect; five KNOWS and one CODED_BY relationship), with the added LOVES relationship.]
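
The custom evaluator boils down to a filter applied at each visited node while the walk itself still follows KNOWS edges. A minimal sketch of that idea, using java.util.function.Predicate as a stand-in for ReturnableEvaluator: the names LoveTraversalSketch, Person, and traverse are hypothetical, and the graph layout (including Trinity as the source of the LOVES edge and Thomas Anderson as its target) is an assumption chosen to match the output shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

class LoveTraversalSketch {

    // Minimal stand-in for a node: a name plus outgoing relationships grouped by type.
    static final class Person {
        final String name;
        final Map<String, List<Person>> outgoing = new HashMap<>();

        Person(String name) {
            this.name = name;
        }

        void relateTo(String type, Person other) {
            outgoing.computeIfAbsent(type, t -> new ArrayList<>()).add(other);
        }

        boolean hasOutgoing(String type) {
            return outgoing.containsKey(type);
        }
    }

    // Breadth-first walk along followType edges. The predicate decides which
    // visited nodes are returned, like a ReturnableEvaluator.
    static List<String> traverse(Person start, String followType, Predicate<Person> returnable) {
        List<String> result = new ArrayList<>();
        Set<Person> seen = new HashSet<>();
        Deque<Person> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            Person current = queue.remove();
            if (returnable.test(current)) {
                result.add(current.name);
            }
            for (Person next : current.outgoing.getOrDefault(followType, Collections.emptyList())) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return result;
    }

    // Builds the assumed graph and asks which of Mr Anderson's friends are in love.
    static List<String> loversAmongFriends() {
        Person anderson = new Person("Thomas Anderson");
        Person morpheus = new Person("Morpheus");
        Person trinity = new Person("Trinity");
        Person cypher = new Person("Cypher");
        Person smith = new Person("Agent Smith");
        anderson.relateTo("KNOWS", morpheus);
        anderson.relateTo("KNOWS", trinity);
        morpheus.relateTo("KNOWS", trinity);
        morpheus.relateTo("KNOWS", cypher);
        cypher.relateTo("KNOWS", smith);
        trinity.relateTo("LOVES", anderson); // assumed direction and target
        return traverse(anderson, "KNOWS", p -> p.hasOutgoing("LOVES"));
    }

    public static void main(String[] args) {
        System.out.println("Who's a lover? " + loversAmongFriends());
    }
}
```

Keeping the walk and the filter separate means a new question ("friends who are captains", say) only needs a different predicate, not a different traversal.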

Page 34: Eifrem neo4j

Bonus code: domain model

How do you implement your domain model?

Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive:

// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
    private final Node underlyingNode;
    PersonImpl( Node node ){ this.underlyingNode = node; }

    public String getName()
    {
        // getProperty() returns Object, so a cast is needed
        return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
        this.underlyingNode.setProperty( "name", name );
    }
}
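
A runnable sketch of the same delegator pattern, with a map-backed FakeNode standing in for the Neo4j node so that it compiles on its own. FakeNode, the Person interface as written here, and DelegatorDemo are assumptions for illustration and are not part of the Neo4j API.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the Neo4j node primitive: just a property bag (assumption for this sketch).
class FakeNode {
    private final Map<String, Object> properties = new HashMap<>();

    Object getProperty(String key) {
        Object value = properties.get(key);
        if (value == null) {
            throw new IllegalArgumentException("No such property: " + key);
        }
        return value;
    }

    void setProperty(String key, Object value) {
        properties.put(key, value);
    }
}

// Hypothetical domain interface.
interface Person {
    String getName();

    void setName(String name);
}

// The domain entity keeps no state of its own; every call is delegated to the node.
class PersonImpl implements Person {
    private final FakeNode underlyingNode;

    PersonImpl(FakeNode node) {
        this.underlyingNode = node;
    }

    @Override
    public String getName() {
        return (String) underlyingNode.getProperty("name");
    }

    @Override
    public void setName(String name) {
        underlyingNode.setProperty("name", name);
    }
}

class DelegatorDemo {
    public static void main(String[] args) {
        FakeNode node = new FakeNode();
        Person person = new PersonImpl(node);
        person.setName("Trinity");
        // The value lives in the node, not in the PersonImpl instance.
        System.out.println(node.getProperty("name"));
    }
}
```

Because the entity holds only a reference to the node, two PersonImpl instances wrapping the same node always see the same data.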

Page 35: Eifrem neo4j

Domain layer frameworks

Qi4j (www.qi4j.org)

Framework for doing DDD in pure Java 5

Defines Entities / Associations / Properties

Sound familiar? Nodes / Rel’s / Properties!

Neo4j is an “EntityStore” backend

NeoWeaver (http://components.neo4j.org/neo-weaver)

Weaves Neo4j-backed persistence into domain objects at runtime (dynamic proxy / cglib based)

Veeeery alpha

Page 36: Eifrem neo4j

Neo4j system characteristics

Disk-based

Native graph storage engine with custom binary on-disk format

Transactional

JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc

Scales up (what's the x and the y?)

Several billions of nodes/rels/props on single JVM

Robust

6+ years in 24/7 production

Page 37: Eifrem neo4j

Social network pathExists()

~1k persons

Avg 50 friends per person

pathExists(a, b) limit depth 4

Two backends

Eliminate disk IO so warm up caches

Page 38: Eifrem neo4j

Social network pathExists()

[Diagram: social graph of persons Mike, Marcus, Emil, John, Leigh, Kevin, Bruce]

Backend                   # persons    Query time
Relational database       1 000        2 000 ms
Graph database (Neo4j)    1 000        2 ms
Graph database (Neo4j)    1 000 000    2 ms
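
For reference, a depth-limited pathExists can be sketched as a breadth-first search that stops after a fixed number of hops. This is not the code behind the numbers above; PathExistsSketch, its method signatures, and the chain test graph are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PathExistsSketch {

    // True if b can be reached from a in at most maxDepth hops.
    static boolean pathExists(Map<Integer, List<Integer>> friends, int a, int b, int maxDepth) {
        if (a == b) {
            return true;
        }
        Set<Integer> seen = new HashSet<>();
        List<Integer> frontier = new ArrayList<>();
        seen.add(a);
        frontier.add(a);
        // Expand one level of friends per iteration, up to maxDepth levels.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<Integer> next = new ArrayList<>();
            for (int person : frontier) {
                for (int friend : friends.getOrDefault(person, Collections.emptyList())) {
                    if (friend == b) {
                        return true;
                    }
                    if (seen.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    // Hypothetical test graph: persons 0..length-1, each a friend of its neighbours.
    static Map<Integer, List<Integer>> chain(int length) {
        Map<Integer, List<Integer>> friends = new HashMap<>();
        for (int i = 0; i + 1 < length; i++) {
            friends.computeIfAbsent(i, k -> new ArrayList<>()).add(i + 1);
            friends.computeIfAbsent(i + 1, k -> new ArrayList<>()).add(i);
        }
        return friends;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> friends = chain(6);
        System.out.println(pathExists(friends, 0, 4, 4)); // prints "true"  (4 hops)
        System.out.println(pathExists(friends, 0, 5, 4)); // prints "false" (needs 5 hops)
    }
}
```

The work done depends on how many persons lie within four hops of the start, not on the total number of persons stored, which is the property the comparison above is pointing at.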

Page 39: Eifrem neo4j

Pros & Cons compared to RDBMS

+ No O/R impedance mismatch (whiteboard friendly)

+ Can easily evolve schemas

+ Can represent semi-structured info

+ Can represent graphs/networks (with performance)

- Lacks in tool and framework support

- Few other implementations => potential lock in

- No support for ad-hoc queries

Page 40: Eifrem neo4j

Language bindings

Neo4j.py – bindings for Jython and CPython

http://components.neo4j.org/neo4j.py

Neo4jrb – bindings for JRuby (incl RESTful API)

http://wiki.neo4j.org/content/Ruby

Clojure

http://wiki.neo4j.org/content/Clojure

Scala (incl RESTful API)

http://wiki.neo4j.org/content/Scala

… .NET? Erlang?

Page 41: Eifrem neo4j
Page 42: Eifrem neo4j
Page 43: Eifrem neo4j

Grails Neoclipse screendump

Page 44: Eifrem neo4j

Conclusion

Graphs && Neo4j => teh awesome!

Available NOW under AGPLv3 / commercial license

AGPLv3: “if you’re open source, we’re open source”

If you have proprietary software? Must buy a commercial license

But up to 1M primitives it’s free for all uses!

Download

http://neo4j.org

Feedback

http://lists.neo4j.org

Page 45: Eifrem neo4j

Poop 1

Key-value stores?

=> the awesome

… if you have 1000s of BILLIONS of records OR you don't care about programmer productivity

What if you had no variables at all in your programs except a single globally accessible hashtable?

Would your software be maintainable?

Page 46: Eifrem neo4j

Poop 2

In a not-suck architecture...

… the only thing that makes sense is to have an embedded database.

Page 47: Eifrem neo4j

Looking ahead: polyglot persistence

Page 48: Eifrem neo4j
Page 49: Eifrem neo4j

Questions?

Image credit: lost again! Sorry :(

Page 50: Eifrem neo4j

http://neotechnology.com