GraphConnect 2014 SF: From Zero to Graph in 120: Scale

Scaling Neo4j Applications

SAN FRANCISCO | 10.22.2014

@iansrobinson

The Burden of Success

• More users
• Larger datasets
• More concurrent requests
• More complex queries

Scaling is a Feature

• It doesn’t come for free
• Conditions of success:
  – Understand current needs
    • Design for an order of magnitude growth
  – Iterative and incremental development
  – Unit tests
    • Bedrock of asserted behaviour
  – Performance tests

Overview

• Scaling Reads
  – Latency
  – Throughput
• Scaling Writes
• Hardware

Scaling Reads - Latency

Query Latency

latency = f(search_area)

Search Area

search_area = f(domain_invariants)

Absolute: Every user has 50 friends
Relative: Every user is friends with 10% of the user base
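The difference between the two invariants can be made concrete with some arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes friend sets never overlap, which overstates real numbers. Under the absolute invariant the search area stays constant as the user base grows; under the relative invariant even a single hop grows with the dataset.

```java
// Illustrative only: estimates how many nodes a traversal touches under each
// kind of invariant. Assumes no overlap between friend sets.
final class SearchArea {
    // Absolute invariant: every user has a fixed number of friends,
    // so the area depends only on the number of hops.
    static long absolute(long friendsPerUser, int hops) {
        long area = 1;
        for (int i = 0; i < hops; i++) {
            area *= friendsPerUser;
        }
        return area;
    }

    // Relative invariant: every user is friends with a fraction of the user
    // base, so the first hop alone grows with the size of the dataset.
    static long relativeFirstHop(long userBase, double fraction) {
        return Math.round(userBase * fraction);
    }

    public static void main(String[] args) {
        for (long users : new long[] {10_000L, 1_000_000L}) {
            System.out.printf("users=%d absolute(2 hops)=%d relative(1 hop)=%d%n",
                    users, absolute(50, 2), relativeFirstHop(users, 0.10));
        }
    }
}
```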

Reducing Read Latency

• The Blackadder solution
• Improve the Cypher query
• Change the model
• Use an Unmanaged Extension

Improve Cypher Query

• Small queries, separated by WITH
• Start from low-cardinality nodes

http://thought-bytes.blogspot.co.uk/2013/01/optimizing-neo4j-cypher-queries.html
http://wes.skeweredrook.com/pragmatic-cypher-optimization-2-0-m06/

Change the Model

Goal: Do less work (in the query)
– By exploring less of the graph

How? Identify inferred relationships
– Replace with use-case specific shortcuts

Change the Model - From

MATCH (:Person{username:'ben'}) -[:WORKED_ON]->(:Project)<-[:WORKED_ON]- (colleague:Person)

Change the Model - To

MATCH (:Person{username:'ben'}) -[:WORKED_WITH]- (colleague:Person)

Tradeoff

More expensive writes
More data

Cheaper reads

When to add the new relationship?
• With tx
• Queue for subsequent tx
• Periodic/batch

Refactor Existing Data

MATCH (p1:Person)-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(p2:Person)
WHERE NOT ((p1)-[:WORKED_WITH]-(p2))
WITH DISTINCT p1, p2 LIMIT 10
MERGE (p1)-[r:WORKED_WITH]-(p2)
RETURN count(r)

• Select batch: MATCH ... WHERE NOT ...
• Batch size: LIMIT 10
• Add new relationship: MERGE
• Continue while count(r) > 0
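The "continue while count(r) > 0" step implies a driver loop around the statement. A minimal sketch of that loop follows, assuming each batch runs in its own transaction. The `LongSupplier` is a hypothetical stand-in for "execute the statement and return count(r)"; connecting it to a real driver or execution engine is left out, and `maxBatches` is an added safety limit that is not part of the talk.

```java
import java.util.function.LongSupplier;

// Sketch: repeat a batched refactoring statement until it reports that no
// new relationships were created. runBatch is a hypothetical stand-in for
// executing the Cypher statement and reading count(r).
final class BatchRefactor {
    static long runUntilDone(LongSupplier runBatch, int maxBatches) {
        long total = 0;
        for (int i = 0; i < maxBatches; i++) {
            long created = runBatch.getAsLong();
            if (created <= 0) {
                return total;              // nothing left to refactor
            }
            total += created;
        }
        // Safety limit reached before an empty batch was observed.
        throw new IllegalStateException("Gave up after " + maxBatches + " batches");
    }

    public static void main(String[] args) {
        // Fake database with 25 pairs still missing a WORKED_WITH relationship.
        long[] remaining = {25};
        LongSupplier fakeBatch = () -> {
            long n = Math.min(10, remaining[0]);   // mirrors LIMIT 10
            remaining[0] -= n;
            return n;
        };
        System.out.println(runUntilDone(fakeBatch, 100));
    }
}
```

Keeping the batch size small keeps each transaction short, at the cost of more round trips.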

Use Unmanaged Extensions

REST API: /db/data/cypher
Extensions: /my-extension/service

RESTful Resource

// Jackson ObjectMapper and Neo4j CypherExecutor imports omitted (version-specific)
import java.io.IOException;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/similar-skills")   // JAX-RS annotations
public class ColleagueFinderExtension {
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private final ColleagueFinder colleagueFinder;

    // Inject database/Cypher execution engine
    public ColleagueFinderExtension(@Context CypherExecutor cypherExecutor) {
        this.colleagueFinder = new ColleagueFinder(cypherExecutor.getExecutionEngine());
    }

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @Path("/{name}")
    public Response getColleagues(@PathParam("name") String name) throws IOException {
        String json = MAPPER.writeValueAsString(colleagueFinder.findColleaguesFor(name));
        return Response.ok().entity(json).build();
    }
}

1. Get Close to the Data

Application

MATCH MATCH CREATE DELETE MERGE MATCH

Single request, many operations
– Reduce network latencies

2. Multiple Implementation Options

REST API
Extensions
• Cypher
• Traversal Framework
• Graph Algo Package
• Core API

3. Control Request/Response Format

{ users: [ { id: 1234}, { id: 9876} ] }

JSON, CSV, protobuf, etc

1a 03 08 96 01

Domain-specific representations
– Compact
– Conserve bandwidth

4. Control HTTP Headers

GET /my-extension/service/top-10

Reverse Proxy

Application

HTTP/1.1 200 OK Cache-Control: max-age=60

5. Integrate with Backend Systems

REST API Extensions

Application

RDBMS LDAP

Migrating to Extensions

• Re-implement original query inside extension
• Modify request/response formats and headers
• Refactor implementation to use lower parts of the stack where necessary
• Measure, measure, measure

Scaling Reads - Throughput

Scale Horizontally For High Read Throughput

[Diagram: Application, Load Balancer, and a cluster of Master, Slave, Slave]

[Diagram: Application, Read Load Balancer, Write Load Balancer, and a cluster of Master, Slave, Slave]

Configure HAProxy as Read Load Balancer

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    option httpchk GET /db/manage/server/ha/slave
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check

listen admin
    bind *:8080
    stats enable

Response to GET /db/manage/server/ha/slave, by instance state:

Master: 404 Not Found, false
Slave: 200 OK, true
Unknown: 404 Not Found, UNKNOWN

This Isn’t The Throughput You Were Looking For

Application

Load Balancer

MATCH (c:Country{name:'Australia'})...
MATCH (c:Country{name:'Zambia'})...
MATCH (c:Country{name:'Norway'})...

Cache Sharding Using Consistent Routing

Application

Load Balancer

MATCH (c:Country{name:'Australia'})...
MATCH (c:Country{name:'Zambia'})...
MATCH (c:Country{name:'Norway'})...

Instance 1: A-I
Instance 2: J-R
Instance 3: S-Z

MATCH (c:Country{name:'Zimbabwe'})...
MATCH (c:Country{name:'Japan'})...
MATCH (c:Country{name:'Brazil'})...

Configure HAProxy for Cache Sharding

global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    balance url_param country_code
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable
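The idea behind consistent routing is that the same key always reaches the same instance, so each instance's cache only needs to hold its share of the graph. Below is a minimal sketch of the A-I / J-R / S-Z partitioning, assuming routing on the first letter of the country name. The class name and server labels are hypothetical. HAProxy's `balance url_param` hashes the parameter value instead of using letter ranges, so this shows the principle and not the proxy's algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of range-based consistent routing: requests for the same country
// always go to the same instance. Server names are hypothetical.
final class ConsistentRouter {
    private final List<String> servers;

    ConsistentRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Maps the first letter (A-Z) onto contiguous, roughly equal ranges.
    String route(String countryName) {
        char first = Character.toUpperCase(countryName.charAt(0));
        if (first < 'A' || first > 'Z') {
            return servers.get(0);         // fallback bucket for other characters
        }
        int index = (first - 'A') * servers.size() / 26;
        return servers.get(index);
    }
}
```

With three servers the ranges come out as A-I, J-R and S-Z, so 'Australia' and 'Brazil' share one instance while 'Zambia' and 'Zimbabwe' share another.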

Scaling Writes - Throughput

Factors Impacting Write Performance

• Managing transactional state
  – Creating and committing are expensive operations
• Contending for locks
  – Nodes and relationships

Improving Write Throughput

• Delay taking expensive locks
• Batch/queue writes

Delay Expensive Locks

• Identify contended nodes
• Involve them as late as possible in a transaction

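[Diagram: two transaction timelines. In the first, "Add Linked List Item + Update Pointers" is a single step and the lock is held for all of it. In the second, "Add Linked List Item" comes first and the lock is held only for the later pointer update.]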

Batch Writes

• Multiple CREATE/MERGE statements per request
  – Good for integration with backend systems
• Queue
  – Good for small, online transactions

Single-Threaded Queue

[Diagram: individual writes enter a queue; a single thread applies them as a batch]
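A single-threaded queue can be sketched as follows: any number of threads submit writes, and one thread drains them and applies each batch in one go. All names here are hypothetical. The `Consumer` stands in for "apply these writes in one transaction", and the in-memory queue means pending writes are lost on a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Sketch: many producers enqueue writes; one thread drains them in batches.
// applyBatch is a hypothetical hook for "run these writes in one transaction".
final class BatchingWriter<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final int maxBatchSize;
    private final Consumer<List<T>> applyBatch;

    BatchingWriter(int maxBatchSize, Consumer<List<T>> applyBatch) {
        this.maxBatchSize = maxBatchSize;
        this.applyBatch = applyBatch;
    }

    // Safe to call from any thread.
    void submit(T write) {
        queue.add(write);
    }

    // Call from the single writer thread only. Takes up to maxBatchSize
    // queued writes, applies them as one batch, and returns how many
    // were applied (0 if the queue was empty).
    int drainOnce() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatchSize);
        if (!batch.isEmpty()) {
            applyBatch.accept(batch);
        }
        return batch.size();
    }
}
```

In use, the writer thread would call `drainOnce()` in a loop, sleeping or blocking briefly when it returns 0. Because only that thread touches the database, writes in a batch never contend with each other for locks.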

Queue Location Options

Application  Application

Benefits of Batched Writes

• Less transactional state management
  – Create/commit per batch rather than per write
• No contention for locks
  – No deadlocks
• Query consolidation
  – Reduce the amount of work inside the database

Query Consolidation

MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sam
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
MATCH sam
CREATE sam-[:LIVES_AT]-address2

Eliminate Duplicate Lookups

MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
CREATE sam-[:LIVES_AT]-address2

Eliminate Unnecessary Writes

MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address2
CREATE sam-[:LIVES_AT]-address2
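The two consolidation steps can be sketched over a batch of simplified statements. This is an illustration under stated assumptions, not how any Neo4j component works. Statements are plain "VERB target" strings as on the slides, the class name is hypothetical, and the sketch assumes nothing else in the batch refers to an item that is created and then deleted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of query consolidation over "VERB target" pseudo-statements.
// Assumes no other statement refers to an item that is created and then
// deleted within the same batch.
final class QueryConsolidator {
    static List<String> consolidate(List<String> statements) {
        // Pass 1: find targets that are created and later deleted in this batch.
        Set<String> created = new HashSet<>();
        Set<String> cancelled = new HashSet<>();
        for (String s : statements) {
            if (s.startsWith("CREATE ")) {
                created.add(s.substring(7));
            } else if (s.startsWith("DELETE ") && created.contains(s.substring(7))) {
                cancelled.add(s.substring(7));
            }
        }
        // Pass 2: copy statements, skipping repeated lookups and cancelled writes.
        Set<String> matched = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String s : statements) {
            String[] parts = s.split(" ", 2);
            if (parts.length < 2) {
                result.add(s);             // not in "VERB target" form: keep as is
                continue;
            }
            String verb = parts[0];
            String target = parts[1];
            if (verb.equals("MATCH") && !matched.add(target)) {
                continue;                  // duplicate lookup
            }
            if ((verb.equals("CREATE") || verb.equals("DELETE")) && cancelled.contains(target)) {
                continue;                  // write that cancels itself out
            }
            result.add(s);
        }
        return result;
    }
}
```

Applied to the eleven statements on the first slide, this leaves the seven statements shown on the last one.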

Tradeoff

Latency

Higher throughput

In-memory or durable queues?
• Lost writes in event of crash
• Transactional dequeue?
