28
Graph Databases and Neo4j Tobias Ivarsson Hacker @ Neo Technology twitter: @thobe / #neo4j email: [email protected] web: http://www.neo4j.org / web: http://www.thobe.org /

Geek Nights Neo4j Code Jam

Embed Size (px)

DESCRIPTION

Neo4j is a graph database. It is an embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables. A graph (mathematical lingo for a network) is a flexible data structure that allows a more agile and rapid style of development.

Citation preview

Page 2: Geek Nights Neo4j Code Jam

2

username fullname registration speaker payment

mtiberg Michael Tiberg null no 0

thobe Tobias Ivarsson 2010-04-07 yes 0

joe John Doe 2010-02-05 no 700

... ... ... ... ...

AttendeesWe all know the relational model.

It has been predominant for a long time.

Page 3: Geek Nights Neo4j Code Jam

3

username fullname registration speaker payment

mtiberg Michael Tiberg null no 0

thobe Tobias Ivarsson 2010-04-07 yes 0

joe John Doe 2010-02-05 no 700

... ... ... ... ...

Attendees

username latitude longitude title publish

thobe 55°36'47.70"N 12°58'34.50"E Malmö yes

joe 37°49'36.00"N 122°25'22.00"W San Francisco

no

... ... ... ... ...

Location

The relational model has a few problems, such as:•poor support for sparse data•modifying the data model is almost exclusively done through adding tables

Page 4: Geek Nights Neo4j Code Jam

4

username fullname registration speaker payment

mtiberg Michael Tiberg null no 0

thobe Tobias Ivarsson 2010-04-07 yes 0

joe John Doe 2010-02-05 no 700

... ... ... ... ...

Attendees

username latitude longitude title publish

thobe 55°36'47.70"N 12°58'34.50"E Malmö yes

joe 37°49'36.00"N 122°25'22.00"W San Francisco

no

... ... ... ... ...

Location

id title time room ...

... ... ... ... ...

... ... ... ... ...

Sessions

session user

... ...

... ...

Session attendance

... ...

... ...

... ...

More complication...... ...

... ...

... ...

... ...

... ...

... ...

... ...

... ...

... ...

After a while, modeling complex relationships leads to complicated schemas

Page 5: Geek Nights Neo4j Code Jam

5

E D

CF

G B

A

Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load.They do so by making data opaque and distribute elements based on key.

Most focus on scaling to large numbers

Page 6: Geek Nights Neo4j Code Jam

5

E D

CF

G B

A

Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load.They do so by making data opaque and distribute elements based on key.

Most focus on scaling to large numbers

Page 7: Geek Nights Neo4j Code Jam

5

E D

CF

G B

A

Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load.They do so by making data opaque and distribute elements based on key.

Most focus on scaling to large numbers

Page 8: Geek Nights Neo4j Code Jam

5

E D

CF

G B

A

Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load.They do so by making data opaque and distribute elements based on key.

Most focus on scaling to large numbers

Page 9: Geek Nights Neo4j Code Jam

5

E D

CF

G B

A

Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load.They do so by making data opaque and distribute elements based on key.

Most focus on scaling to large numbers

Page 10: Geek Nights Neo4j Code Jam

Scaling to size vs. Scaling to complexity

6

Size

Complexity

Key/Value stores

Bigtable clones

Document databases

Graph databases

Page 11: Geek Nights Neo4j Code Jam

Scaling to size vs. Scaling to complexity

6

Size

Complexity

Key/Value stores

Bigtable clones

Document databases

Graph databases

> 90% of use cases

Billions of nodesand relationships

Page 12: Geek Nights Neo4j Code Jam

The Property Graph data model

7

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 13: Geek Nights Neo4j Code Jam

The Property Graph data model

7

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 14: Geek Nights Neo4j Code Jam

The Property Graph data model

7

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 15: Geek Nights Neo4j Code Jam

The Property Graph data model

7

LIVES WITHLOVES

OWNSDRIVES

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 16: Geek Nights Neo4j Code Jam

The Property Graph data model

7

LIVES WITHLOVES

OWNSDRIVES

LOVES

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 17: Geek Nights Neo4j Code Jam

The Property Graph data model

7

LIVES WITHLOVES

OWNSDRIVES

LOVESname: “James”age: 32twitter: “@spam”

name: “Mary”age: 35

brand: “Volvo”model: “V70”

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 18: Geek Nights Neo4j Code Jam

The Property Graph data model

7

LIVES WITHLOVES

OWNSDRIVES

LOVESname: “James”age: 32twitter: “@spam”

name: “Mary”age: 35

brand: “Volvo”model: “V70”

item type: “car”

•Nodes•Relationships between Nodes•Relationships have Labels•Relationships are directed, but traversed at equal speed in both directions•The semantics of the direction is up to the application (LIVES WITH is reflexive, LOVES is not)•Nodes have key-value properties•Relationships have key-value properties

Page 19: Geek Nights Neo4j Code Jam

Graphs are whiteboard friendly

8Image credits: Tobias Ivarsson

An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database.With a Graph Database the model from the whiteboard is implemented directly.

Page 20: Geek Nights Neo4j Code Jam

Graphs are whiteboard friendly

8

1

*

1

*

*

1*

1

*

*

Image credits: Tobias Ivarsson

An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database.With a Graph Database the model from the whiteboard is implemented directly.

Page 21: Geek Nights Neo4j Code Jam

Graphs are whiteboard friendly

8

thobe

Wardrobe Strength

Joe project blog

Image credits: Tobias Ivarsson

An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database.With a Graph Database the model from the whiteboard is implemented directly.

Hello Joe

Neo4j performance analysis

Modularizing Jython

Page 22: Geek Nights Neo4j Code Jam

What is Neo4j?๏Neo4j is a Graph Database

•Non-relational (“#nosql”), transactional (ACID), embedded

•Data is stored as a Graph / Network

‣Nodes and Relationships with properties

‣“Property Graph” or “edge-labeled multidigraph”

๏Neo4j is Open Source / Free (as in speech) Software

•AGPLv3

•Commercial (“dual license”) license available

‣Free (as in beer) for first server installation

‣Inexpensive (as in startup-friendly) when you grow 9

Prices are available at http://neotechnology.com/

Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license)

Page 23: Geek Nights Neo4j Code Jam

More about Neo4j๏Neo4j is stable

• In 24/7 operation since 2003

๏Neo4j is in active development

•Neo Technology received VC funding October 2009

๏Neo4j delivers high performance graph operations

• traverses 1’000’000+ relationships / secondon commodity hardware

10

Page 24: Geek Nights Neo4j Code Jam

http://neotechnology.com

Page 25: Geek Nights Neo4j Code Jam

Path exists in social network๏Each person has on average 50 friends

12

Database # persons query timeRelational databaseNeo4j Graph DatabaseNeo4j Graph Database

1 000 2 000 ms1 000 2 ms

1 000 000 2 ms

Tobias

Emil

JohanPeter

Page 26: Geek Nights Neo4j Code Jam

Path exists in social network๏Each person has on average 50 friends

12

Database # persons query timeRelational databaseNeo4j Graph DatabaseNeo4j Graph Database

1 000 2 000 ms1 000 2 ms

1 000 000 2 ms

Tobias

Emil

JohanPeter

Page 27: Geek Nights Neo4j Code Jam

Path exists in social network๏Each person has on average 50 friends

12

Database # persons query timeRelational databaseNeo4j Graph DatabaseNeo4j Graph Database

1 000 2 000 ms1 000 2 ms

1 000 000 2 ms

Tobias

Emil

JohanPeter

Page 28: Geek Nights Neo4j Code Jam

Path exists in social network๏Each person has on average 50 friends

12

Database # persons query timeRelational databaseNeo4j Graph DatabaseNeo4j Graph Database

1 000 2 000 ms1 000 2 ms

1 000 000 2 ms

Tobias

Emil

JohanPeter