FluxGraph @ GraphDevRoom


FluxGraph: A time-machine for your graphs

Davy Suvee, Michel Van Speybroeck

Janssen Pharmaceutica

about me

➡ working as an it lead / software architect @ janssen pharmaceutica
• dealing with big scientific data sets
• hands-on expertise in big data and NoSQL technologies

who am i ...

Davy Suvee (@DSUVEE)

➡ founder of datablend
• provide big data and NoSQL consultancy
• share practical knowledge and big data use cases via blog


relational databases ...

id  col1    col2   created_at    updated_at
1   “ba”    “ba”   “1-jan-2011”  “5-jun-2012”
2   “da”    “da”   “2-jan-2011”
3   “boem”  “bam”  “3-jan-2011”  “12-jun-2012”

audit trail (changes recorded over time)

entity_id  property  old_value  new_value  date
1          “col1”    “bi”       “ba”       “5-jun-2012”
2          “col1”    “bim”      “boem”     “1-mar-2012”
2          “col2”    “bim”      “bam”      “12-jun-2012”
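The audit trail records each change as (entity_id, property, old_value, new_value, date), so an earlier value can be recovered by undoing the changes made after the date of interest. A minimal sketch of that idea, assuming a hypothetical in-memory `AuditEntry` class and `valueAt` helper (neither is part of any real schema or library) and ISO dates instead of the slide's date strings:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of one audit-trail row (not a real schema).
class AuditEntry {
    final int entityId;
    final String property;
    final String oldValue;
    final String newValue;
    final LocalDate date;

    AuditEntry(int entityId, String property, String oldValue, String newValue, LocalDate date) {
        this.entityId = entityId;
        this.property = property;
        this.oldValue = oldValue;
        this.newValue = newValue;
        this.date = date;
    }
}

class AuditTrail {
    // Reconstructs the value a property had on `asOf`: start from the current
    // value and undo, newest first, every change made after `asOf`.
    static String valueAt(int entityId, String property, String currentValue,
                          List<AuditEntry> trail, LocalDate asOf) {
        List<AuditEntry> newestFirst = new ArrayList<>(trail);
        newestFirst.sort(Comparator.comparing((AuditEntry e) -> e.date).reversed());
        String value = currentValue;
        for (AuditEntry e : newestFirst) {
            if (e.entityId == entityId && e.property.equals(property) && e.date.isAfter(asOf)) {
                value = e.oldValue;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Row taken from the audit trail table: entity 1, col1 changed from "bi" to "ba".
        List<AuditEntry> trail = Arrays.asList(
            new AuditEntry(1, "col1", "bi", "ba", LocalDate.of(2012, 6, 5)));
        System.out.println(valueAt(1, "col1", "ba", trail, LocalDate.of(2012, 1, 1)));
        System.out.println(valueAt(1, "col1", "ba", trail, LocalDate.of(2012, 12, 31)));
    }
}
```

Every historical read has to replay the log, which is one reason this pattern gets awkward once the data is a graph rather than a few rows.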


graphs ...

➡ graphs are continuously changing ...

➡ graphs and time ...
★ neo-versioning by david montag [1]
★ representing time dependent graphs in neo4j by the isi foundation [2]
★ modeling a multilevel index in neo4j by peter neubauer [3]

copy and relink semantics

๏ graph size
๏ object identity
๏ mixing data-model and time-model

1. http://github.com/dmontag/neo4j-versioning
2. http://github.com/ccattuto/neo4j-dynagraph/wiki
3. http://blog.neo4j.org/2012/02/modeling-multilevel-index-in-neoj4.html


FluxGraph ...

➡ towards a time-aware graph ...

➡ implement a blueprints-compatible graph on top of Datomic

➡ make FluxGraph fully time-aware
★ travel your graph through time
★ time-scoped iteration of vertices and edges
★ temporal graph comparison


travel through time

FluxGraph fg = new FluxGraph();

Vertex davy = fg.addVertex();
davy.setProperty("name", "Davy");

Vertex peter = ...
Vertex michael = ...

Edge e1 = fg.addEdge(davy, peter, "knows");

[figure: vertices Davy, Peter and Michael; a "knows" edge from Davy to Peter]


travel through time

Date checkpoint = new Date();

davy.setProperty("name", "David");

Edge e2 = fg.addEdge(davy, michael, "knows");

[figure: Davy is renamed to David; David has "knows" edges to Peter and Michael]

travel through time

[figure: the graph at two points in time. At the checkpoint: Davy, Peter and Michael, with a "knows" edge from Davy to Peter. At the current time: David, Peter and Michael, with "knows" edges from David to Peter and to Michael.]

➡ by default, the graph is viewed at the current time

➡ to travel back to the checkpoint:

fg.setCheckpointTime(checkpoint);
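One way to picture what a checkpoint read means is a property that keeps every value it ever had and answers reads "as of" a given time. The following is a minimal self-contained sketch of that idea only; `TimeAwareProperty` and its methods are hypothetical names, and this is not how FluxGraph is implemented (FluxGraph builds on Datomic).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model: a property that remembers every value it ever had.
class TimeAwareProperty {
    // Maps the time of each write to the value written at that time.
    private final TreeMap<Long, String> history = new TreeMap<>();

    void set(long time, String value) {
        history.put(time, value);
    }

    // Value visible at `checkpoint`: the latest write at or before it,
    // or null if the property did not exist yet.
    String get(long checkpoint) {
        Map.Entry<Long, String> entry = history.floorEntry(checkpoint);
        return entry == null ? null : entry.getValue();
    }

    // Value at the current time, i.e. the most recent write.
    String get() {
        return history.isEmpty() ? null : history.lastEntry().getValue();
    }

    public static void main(String[] args) {
        TimeAwareProperty name = new TimeAwareProperty();
        name.set(1L, "Davy");
        long checkpoint = 2L;
        name.set(3L, "David");
        System.out.println(name.get());           // read at the current time
        System.out.println(name.get(checkpoint)); // read at the checkpoint
    }
}
```

Because writes never overwrite older entries, reading at an earlier checkpoint needs no copy of the data.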

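time-scoped iteration

[figure: timeline t1, t2, t3, tcurrent. Each change produces a new version of the vertex: Davy (t1), Davy’ (t2), Davy’’ (t3), Davy’’’ (tcurrent). Consecutive versions are linked by "next" and "previous".]

➡ how to find the version of the vertex you are interested in?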


time-scoped iteration

Vertex previousDavy = davy.getPreviousVersion();
Iterable<Vertex> allDavy = davy.getNextVersions();

Iterable<Vertex> selDavy = davy.getPreviousVersions(filter);
Interval valid = davy.getTimerInterval();
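The version chain behind these calls can be pictured as a doubly linked list in which every version carries the time interval during which it was valid. A minimal sketch under that assumption follows; the `Version` class, its fields and its methods are hypothetical and only mirror the shape of the calls on the slide, not FluxGraph's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of one version of a vertex, valid during [from, to).
class Version {
    final String name;
    final long from;
    long to = Long.MAX_VALUE; // open-ended until a newer version replaces it
    Version previous;
    Version next;

    Version(String name, long from) {
        this.name = name;
        this.from = from;
    }

    // Records a change at `time`: closes this version and links a new one after it.
    Version change(String newName, long time) {
        Version newer = new Version(newName, time);
        this.to = time;
        this.next = newer;
        newer.previous = this;
        return newer;
    }

    // Collects all older versions, nearest first.
    List<Version> previousVersions() {
        List<Version> result = new ArrayList<>();
        for (Version v = previous; v != null; v = v.previous) {
            result.add(v);
        }
        return result;
    }

    // Collects all newer versions, nearest first.
    List<Version> nextVersions() {
        List<Version> result = new ArrayList<>();
        for (Version v = next; v != null; v = v.next) {
            result.add(v);
        }
        return result;
    }

    public static void main(String[] args) {
        Version davy = new Version("Davy", 1L);
        Version current = davy.change("Davy'", 2L).change("Davy''", 3L);
        System.out.println(current.previous.name);             // direct predecessor
        System.out.println(current.previousVersions().size()); // number of older versions
        System.out.println(davy.from + " .. " + davy.to);      // validity interval of the first version
    }
}
```

A filter, as in `getPreviousVersions(filter)`, would then simply skip the versions in this walk that do not match.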


time-scoped iteration

➡ When does an element change?

➡ vertex:
★ setting or removing a property
★ being added to or removed from an edge
★ being removed

➡ edge:
★ setting or removing a property
★ being removed

➡ ... and each element is time-scoped!

temporal graph comparison

[figure: the graph at the current time (David, Peter, Michael; David "knows" Peter and Michael) next to the graph at the checkpoint (Davy, Peter, Michael; Davy "knows" Peter)]

➡ what changed?


temporal graph comparison

➡ difference (A , B) = union (A , B) - B

➡ ... as an (immutable) graph!

[figure: the difference between the current graph and the checkpoint graph contains the vertex David and a "knows" edge]
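The formula difference(A, B) = union(A, B) - B can be tried out on plain sets. The sketch below assumes, purely for illustration, that each graph state is flattened into a set of string facts (property values and edges keyed by hypothetical vertex ids v1, v2, v3); this is not FluxGraph's representation, and `GraphDiff` is a made-up name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class GraphDiff {
    // difference(A, B) = union(A, B) - B: everything that is in A but not in B.
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.addAll(b);    // union(A, B)
        result.removeAll(b); // ... minus B
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical flattening of the two graph states into facts.
        Set<String> checkpoint = new HashSet<>(Arrays.asList(
            "v1.name=Davy", "v2.name=Peter", "v3.name=Michael", "v1-knows->v2"));
        Set<String> current = new HashSet<>(Arrays.asList(
            "v1.name=David", "v2.name=Peter", "v3.name=Michael",
            "v1-knows->v2", "v1-knows->v3"));
        // What is in the current graph but was not there at the checkpoint?
        System.out.println(difference(current, checkpoint));
    }
}
```

With these facts the result holds exactly the renamed vertex and the new edge, which matches the David / knows picture on the slide.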

FluxGraph ...

http://github.com/datablend/fluxgraph

➡ available on github

use case: longitudinal patient data

[figure: timeline for one patient. t1: patient. t2: patient, smoking. t3: patient, smoking. t4: patient, cancer. t5: patient, cancer, death.]


use case: longitudinal patient data

➡ historical data for 15,000 patients over a period of 10 years (2001-2010)

➡ example analysis:
★ if a male patient is no longer smoking in 2005
★ what are the chances of getting lung cancer in 2010, comparing
• patients that smoked before 2005
• patients that never smoked
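Once the time-scoped queries have produced, per patient, whether he smoked before 2005 and whether he has cancer in 2010, the comparison itself is a simple ratio per cohort. A minimal sketch of that last step, assuming a hypothetical flattened `Patient` record; the class names are made up and the values in `main` are invented only to exercise the code, they are not results from the data set.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical flattened record; in the talk these flags come from graph queries.
class Patient {
    final boolean smokedBefore2005;
    final boolean cancerIn2010;

    Patient(boolean smokedBefore2005, boolean cancerIn2010) {
        this.smokedBefore2005 = smokedBefore2005;
        this.cancerIn2010 = cancerIn2010;
    }
}

class CohortComparison {
    // Fraction of the selected cohort (ex-smokers or never-smokers) with cancer in 2010.
    static double cancerRate(List<Patient> patients, boolean smokedBefore2005) {
        int cohort = 0;
        int withCancer = 0;
        for (Patient p : patients) {
            if (p.smokedBefore2005 == smokedBefore2005) {
                cohort++;
                if (p.cancerIn2010) {
                    withCancer++;
                }
            }
        }
        return cohort == 0 ? 0.0 : (double) withCancer / cohort;
    }

    public static void main(String[] args) {
        // Made-up sample values, not real patient data.
        List<Patient> patients = Arrays.asList(
            new Patient(true, true), new Patient(true, false),
            new Patient(false, false), new Patient(false, false));
        System.out.println("smoked before 2005: " + cancerRate(patients, true));
        System.out.println("never smoked:       " + cancerRate(patients, false));
    }
}
```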


use case: longitudinal patient data

➡ get all male non-smokers in 2005

fg.setCheckpointTime(new DateTime(2005,12,31).toDate());

Iterator<Vertex> males = fg.getVertices("gender", "male").iterator();

while (males.hasNext()) {
    Vertex p2005 = males.next();
    // a patient smokes in 2005 if he has an outgoing smokingStatus edge at the checkpoint
    boolean smoking2005 = p2005.getEdges(OUT, "smokingStatus").iterator().hasNext();
}


use case: longitudinal patient data

➡ which patients were smoking before 2005?

boolean smokingBefore2005 =
    ((FluxVertex) p2005).getPreviousVersions(new TimeAwareFilter() {
        // keep only the earlier versions that have a smokingStatus edge
        public TimeAwareElement filter(TimeAwareElement element) {
            return element.getEdges(OUT, "smokingStatus").iterator().hasNext() ? element : null;
        }
    }).iterator().hasNext();


use case: longitudinal patient data

➡ which patients have cancer in 2010?

Graph g = fg.difference(smokerws, time2010.toDate(), time2005.toDate());

• smokerws: working set of smokers

➡ extract the patients that have an edge to the cancer node

gephi plugin for fluxgraph

[screenshots: the graph rendered in gephi for 2010 and for 2001]

gephi plugin for blueprints!

http://github.com/datablend/gephi-blueprints-plugin

➡ available on github

➡ Support for neo4j, orientdb, dex, rexster, ...

Kudos to Timmy Storms (@timmystorms)

Questions?