Graph Databases: Connecting the Dots in Big Data
Darren Wood, Chief Architect, InfiniteGraph

NOSQL Now! Presentation, August 24, 2011: Graph Databases: Connecting the Dots in Big Data


DESCRIPTION

Darren Wood is the Architect and Lead Developer of InfiniteGraph, the distributed graph database, produced by Objectivity, Inc. Darren has spent the majority of his career architecting and building distributed systems with an emphasis on elastic scalability and data management. Prior to joining Objectivity, Inc. in 2007, Darren held positions as a Senior Consultant with IONA Technologies and a Development Team Lead for Citect Australia. Darren holds a First Class Honors Degree in Computer Systems Engineering from the University of Technology in Sydney, Australia.


Page 1

Graph Databases: Connecting the Dots in Big Data

Darren Wood, Chief Architect, InfiniteGraph

Page 2

Relationships are everywhere

Page 3

Graph Databases

• Not Really Graph Problems
  – Average age of my customers that purchased X
  – Which zip code buys the most of Y

• Graph Problems
  – How is person A connected to person B
  – Can suspect Y be associated with location Z
  – Who are influencers within a social network?

Copyright © InfiniteGraph

Page 4

Graph Databases

• Optimized around data relationships
  – Relationships as first class citizens
  – Super fast traversal between entities
  – Rich/flexible annotation of connections

• Small focused API (typically not SQL)
  – Natively work with concepts of Vertex/Edge
  – SQL has no concept of “navigation”
  – Most attempts based in SQL are convoluted

Page 5

Physical Storage Comparison

Rows/Columns/Tables

Meetings
P1      Place    Time      P2
Alice   Denver   5-27-10   Bob

Calls
From    Time     Duration  To
Bob     13:20    25        Carlos
Bob     17:10    15        Charlie

Payments
From    Date     Amount    To
Carlos  5-12-10  100000    Charlie

Relationship/Graph Optimized

Alice  --Met (5-27-10)-->  Bob
Bob    --Called (13:20)--> Carlos
Bob    --Called (17:10)--> Charlie
Carlos --Paid (100000)-->  Charlie

Page 6

Simple API

Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);

[Diagram: Alice --Meets--> Bob --Calls--> Carlos --Pays--> Charlie; Bob --Calls--> Charlie]
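The calls on this slide depend on InfiniteGraph's own classes. As a rough, self-contained illustration of the same vertex/edge idea, here is a minimal in-memory sketch in plain Java. `TinyGraph` and its nested `Vertex` and `Edge` types are hypothetical stand-ins, not the InfiniteGraph API, and each edge payload is simplified to a label plus one detail string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the slide's graph; NOT the InfiniteGraph API.
public class TinyGraph {
    // An edge is a first-class object that carries its own data.
    public static final class Edge {
        public final String label;
        public final String detail;
        public final Vertex from;
        public final Vertex to;

        Edge(String label, String detail, Vertex from, Vertex to) {
            this.label = label;
            this.detail = detail;
            this.from = from;
            this.to = to;
        }
    }

    public static final class Vertex {
        public final String name;
        public final List<Edge> outgoing = new ArrayList<>();

        Vertex(String name) {
            this.name = name;
        }

        // Same shape as alice.addEdge(edge, bob) on the slide, with a simplified payload.
        public Edge addEdge(String label, String detail, Vertex target) {
            Edge edge = new Edge(label, detail, this, target);
            outgoing.add(edge);
            return edge;
        }
    }

    private final List<Vertex> vertices = new ArrayList<>();

    public Vertex addVertex(String name) {
        Vertex vertex = new Vertex(name);
        vertices.add(vertex);
        return vertex;
    }

    public int vertexCount() {
        return vertices.size();
    }

    // Linear scan by name; returns null when no vertex matches.
    public Vertex find(String name) {
        for (Vertex vertex : vertices) {
            if (vertex.name.equals(name)) {
                return vertex;
            }
        }
        return null;
    }

    // Builds the four-person example from the slide.
    public static TinyGraph sample() {
        TinyGraph graph = new TinyGraph();
        Vertex alice = graph.addVertex("Alice");
        Vertex bob = graph.addVertex("Bob");
        Vertex carlos = graph.addVertex("Carlos");
        Vertex charlie = graph.addVertex("Charlie");
        alice.addEdge("Meeting", "Denver, 5-27-10", bob);
        bob.addEdge("Call", "13:20", carlos);
        carlos.addEdge("Payment", "100000.00", charlie);
        bob.addEdge("Call", "17:10", charlie);
        return graph;
    }

    public static void main(String[] args) {
        for (Edge edge : sample().find("Bob").outgoing) {
            System.out.println(edge.from.name + " --" + edge.label + "--> " + edge.to.name);
        }
    }
}
```

Because each vertex holds its outgoing edges directly, stepping from Bob to the people he called is a list walk rather than a table join, which is the point of the storage comparison on the previous slide.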

Page 7

Query and Navigation

• Queries – but not as you know them
• More like a rules-based search and discovery
• Asynchronous results

[Diagram: Alice --Meets--> Bob --Calls--> Carlos --Pays--> Charlie; Bob --Calls--> Charlie]

“Find all paths between Alice and Charlie”

“Find all paths between Alice and Charlie – within 2 degrees”

“Find all paths between Alice and Charlie – events in May 2010”
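The three example queries can be read as one path search run under different constraints. The sketch below is a plain-Java illustration, not InfiniteGraph code: `PathQueries` is a hypothetical class, "degrees" is assumed to mean a maximum number of hops, and the dates attached to the two calls are invented for the example because the slides only give times of day.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration of constrained path search; NOT the InfiniteGraph API.
public class PathQueries {
    public static final class Edge {
        public final String from;
        public final String to;
        public final String label;
        public final String date; // ISO yyyy-MM-dd

        Edge(String from, String to, String label, String date) {
            this.from = from;
            this.to = to;
            this.label = label;
            this.date = date;
        }
    }

    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    public void addEdge(String from, String to, String label, String date) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(from, to, label, date));
    }

    // Enumerates every simple path from start to target that uses at most
    // maxHops edges, following only edges accepted by edgeFilter.
    public List<List<Edge>> findPaths(String start, String target, int maxHops,
                                      Predicate<Edge> edgeFilter) {
        List<List<Edge>> results = new ArrayList<>();
        List<String> onPath = new ArrayList<>();
        onPath.add(start);
        walk(start, target, maxHops, edgeFilter, new ArrayList<>(), onPath, results);
        return results;
    }

    private void walk(String current, String target, int hopsLeft, Predicate<Edge> edgeFilter,
                      List<Edge> path, List<String> onPath, List<List<Edge>> results) {
        if (current.equals(target)) {
            results.add(new ArrayList<>(path)); // copy, because path is mutated on backtrack
            return;
        }
        if (hopsLeft == 0) {
            return;
        }
        for (Edge edge : outgoing.getOrDefault(current, Collections.emptyList())) {
            if (!edgeFilter.test(edge) || onPath.contains(edge.to)) {
                continue; // filtered out, or would revisit a vertex
            }
            path.add(edge);
            onPath.add(edge.to);
            walk(edge.to, target, hopsLeft - 1, edgeFilter, path, onPath, results);
            path.remove(path.size() - 1);
            onPath.remove(onPath.size() - 1);
        }
    }

    // The slide's example graph; the two call dates are made up for this sketch.
    public static PathQueries sample() {
        PathQueries graph = new PathQueries();
        graph.addEdge("Alice", "Bob", "Meeting", "2010-05-27");
        graph.addEdge("Bob", "Carlos", "Call", "2010-05-20");      // assumed date
        graph.addEdge("Carlos", "Charlie", "Payment", "2010-05-12");
        graph.addEdge("Bob", "Charlie", "Call", "2010-06-02");     // assumed date
        return graph;
    }

    public static void main(String[] args) {
        PathQueries graph = sample();
        int all = graph.findPaths("Alice", "Charlie", Integer.MAX_VALUE, e -> true).size();
        int twoHops = graph.findPaths("Alice", "Charlie", 2, e -> true).size();
        int may2010 = graph.findPaths("Alice", "Charlie", Integer.MAX_VALUE,
                e -> e.date.startsWith("2010-05")).size();
        System.out.println("all=" + all + " withinTwoHops=" + twoHops + " may2010=" + may2010);
    }
}
```

The hop limit and the edge filter prune the search while it runs, so constrained queries visit fewer vertices than an unconstrained search followed by filtering.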

Page 8

Navigation Example

// Create a qualifier that describes the target vertex
Qualifier findCharliePredicate =
    new VertexPredicate(personType, "name == 'Charlie'");

// Construct a navigator which starts with Alice and uses a result qualifier
// to find all paths in the graph to Charlie
Navigator charlieFinder = alice.navigate(
    Guide.SIMPLE_BREADTH_FIRST, // default guide
    Qualifier.ANY,              // no path constraints
    findCharliePredicate,       // find paths ending with Charlie
    myResultHandler);           // fire results to supplied handler

// Start the navigator
charlieFinder.start();
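To make the roles of the four `navigate` arguments concrete, the following plain-Java sketch shows one way a breadth-first guide, a path qualifier, a result qualifier and a result handler could cooperate. `MiniNavigator` is a hypothetical class written for this illustration and does not reflect InfiniteGraph's internals. It also runs synchronously on the calling thread, whereas the deck describes results as asynchronous.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration of qualifier-driven navigation; NOT the InfiniteGraph API.
public class MiniNavigator {
    private final Map<String, List<String>> neighbors = new HashMap<>();

    public void connect(String from, String to) {
        neighbors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first walk over paths starting at 'start'.
    // pathQualifier decides whether a partial path is worth extending;
    // resultQualifier decides whether a path's last vertex makes it a result;
    // resultHandler receives each result as soon as it is found.
    public void navigate(String start,
                         Predicate<List<String>> pathQualifier,
                         Predicate<String> resultQualifier,
                         Consumer<List<String>> resultHandler) {
        Deque<List<String>> frontier = new ArrayDeque<>();
        frontier.add(Collections.singletonList(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.poll();
            String last = path.get(path.size() - 1);
            if (resultQualifier.test(last)) {
                resultHandler.accept(path);
                continue; // design choice in this sketch: do not extend past a result
            }
            for (String next : neighbors.getOrDefault(last, Collections.emptyList())) {
                if (path.contains(next)) {
                    continue; // keep paths simple (no repeated vertices)
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                if (pathQualifier.test(extended)) {
                    frontier.add(extended);
                }
            }
        }
    }

    // The slide's example graph.
    public static MiniNavigator sample() {
        MiniNavigator nav = new MiniNavigator();
        nav.connect("Alice", "Bob");
        nav.connect("Bob", "Carlos");
        nav.connect("Carlos", "Charlie");
        nav.connect("Bob", "Charlie");
        return nav;
    }

    public static void main(String[] args) {
        sample().navigate(
                "Alice",
                path -> true,                        // no path constraints
                "Charlie"::equals,                   // result: paths ending at Charlie
                path -> System.out.println(path));   // handler: print each path
    }
}
```

Because the frontier is a FIFO queue, shorter paths reach the handler before longer ones. Swapping the queue for a stack would turn the same loop into a depth-first guide.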

Page 9

Navigational Query Performance

Page 10

Scaling Graphs – Getting Data In

[Diagram labels: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement); Standard Blocking Ingest/Placement (MDP Plugin); Objectivity/DB. Applications: App-1 (Ingest V1), App-2 (Ingest V2), App-3 (Ingest V3); App-1 (E12 {V1, V2}), App-2 (E23 {V2, V3}), App-3. Stored objects: V1, V2, V3, E12, E23.]

Page 11

Accelerated Ingest

[Diagram labels: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement (Standard), Placement (Accelerated)); vertices V1, V2, V3 and edges E12, E23; Staging Containers; Distributed Pipelines; Pipeline Containers holding directed edge entries E(1->2), E(2->1), E(2->3), E(3->2), E(3->1).]

Page 12

Choose Your Own Consistency…

// Describe your requested model using policies
PolicyChain myPolicies =
    new PolicyChain(new EdgePipeliningPolicy(true));

// Start a transaction with the policies you want
Transaction tx = myGraph.beginTransaction(
    AccessMode.READ_WRITE, myPolicies);

// This code doesn’t change, can be used with any policies
alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);

tx.commit();
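The slides do not spell out what `EdgePipeliningPolicy` does internally. One plausible reading, consistent with the staging and pipeline containers on the Accelerated Ingest slide, is that edge writes are deferred and applied as a batch instead of one at a time. The plain-Java sketch below illustrates only that general idea under this assumption. `PolicySketch`, `Store` and `Tx` are hypothetical names, and nothing here is InfiniteGraph code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a policy-selected write path; NOT the InfiniteGraph API.
public class PolicySketch {
    // Stand-in for a shared store; counts how many write batches it received.
    public static final class Store {
        public final List<String> edges = new ArrayList<>();
        public int writeBatches = 0;

        void write(List<String> batch) {
            edges.addAll(batch);
            writeBatches++;
        }
    }

    public static final class Tx {
        private final Store store;
        private final boolean pipelineEdges; // assumed meaning of the policy flag
        private final List<String> pending = new ArrayList<>();

        Tx(Store store, boolean pipelineEdges) {
            this.store = store;
            this.pipelineEdges = pipelineEdges;
        }

        public void addEdge(String from, String label, String to) {
            String edge = from + " --" + label + "--> " + to;
            if (pipelineEdges) {
                pending.add(edge);                            // defer: stage the edge
            } else {
                store.write(Collections.singletonList(edge)); // write through immediately
            }
        }

        public void commit() {
            if (!pending.isEmpty()) {
                store.write(new ArrayList<>(pending));        // flush staged edges as one batch
                pending.clear();
            }
        }
    }

    // Application code: identical whichever policy is chosen.
    public static Store run(boolean pipelineEdges) {
        Store store = new Store();
        Tx tx = new Tx(store, pipelineEdges);
        tx.addEdge("Alice", "Meeting", "Bob");
        tx.addEdge("Bob", "Call", "Carlos");
        tx.commit();
        return store;
    }

    public static void main(String[] args) {
        System.out.println("immediate batches: " + run(false).writeBatches);
        System.out.println("pipelined batches: " + run(true).writeBatches);
    }
}
```

In this sketch the trade-off is visibility versus write cost: staged edges reach the store in fewer, larger writes, but they are not visible there until the batch is flushed.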

Page 13

Indexing Framework

• Focused on providing choice!
• Manual indexes for grouping data
• Automatic indexes for cross population
• Query interface with qualification language
• Pluggable query operators
• External index support (Lucene)

Page 14

InfiniteGraph Visualizer

Page 15

Scaling Graphs – Distributed Navigation

• Graph algorithms naturally branch
• Requires orchestration of threads/agents

[Diagram: the Alice, Bob, Carlos, Charlie graph (Meets, Calls, Pays) extended with Dave, Eve and Chuck and further Calls, Lives With and Meets edges]
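Branching is what makes navigation a candidate for parallel work: every vertex on the current frontier can be expanded independently. The sketch below shows that orchestration with threads inside a single JVM, using a level-by-level breadth-first search on a fixed thread pool. It is a local illustration only. `ParallelReach` is a hypothetical class, the edges involving Dave, Eve and Chuck are invented because the slide residue does not show them, and a distributed deployment would additionally need to route each expansion to the partition holding the vertex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of branch-parallel traversal; NOT the InfiniteGraph API.
public class ParallelReach {
    // Returns every vertex reachable from start, expanding each BFS level in parallel.
    public static Set<String> reachable(Map<String, List<String>> graph, String start, int workers) {
        Set<String> visited = ConcurrentHashMap.newKeySet();
        visited.add(start);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<String> frontier = Collections.singletonList(start);
            while (!frontier.isEmpty()) {
                // One task per frontier vertex: each branch is expanded independently.
                List<Callable<List<String>>> tasks = new ArrayList<>();
                for (String vertex : frontier) {
                    tasks.add(() -> {
                        List<String> discovered = new ArrayList<>();
                        for (String next : graph.getOrDefault(vertex, Collections.emptyList())) {
                            if (visited.add(next)) { // atomic: only one task claims a vertex
                                discovered.add(next);
                            }
                        }
                        return discovered;
                    });
                }
                // invokeAll blocks until the whole level is done (the orchestration step).
                List<String> nextFrontier = new ArrayList<>();
                for (Future<List<String>> result : pool.invokeAll(tasks)) {
                    nextFrontier.addAll(result.get());
                }
                frontier = nextFrontier;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    // Slide graph plus made-up edges for Dave, Eve and Chuck.
    public static Map<String, List<String>> sample() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("Alice", Arrays.asList("Bob", "Dave"));   // Alice->Dave is assumed
        graph.put("Bob", Arrays.asList("Carlos", "Charlie"));
        graph.put("Carlos", Arrays.asList("Charlie"));
        graph.put("Dave", Arrays.asList("Eve"));            // assumed
        graph.put("Eve", Arrays.asList("Chuck"));           // assumed
        return graph;
    }

    public static void main(String[] args) {
        System.out.println(reachable(sample(), "Alice", 4).size() + " vertices reachable from Alice");
    }
}
```

The shared `visited` set is the coordination point: without it, two branches that converge on the same vertex would both expand it.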

Page 16

Big Distributed Data (Traditional - Huge Generalization)

[Diagram labels: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; one Processor per partition]

Page 17

Big Distributed Data (Graph)

[Diagram labels: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; one Processor per partition]

Page 18

Some customers and partners

Page 19

Thank you!

[email protected]