Graph Connect Europe: From Zero To Import


From Zero to Graph: Import
Mark Needham (@markhneedham)
7th May 2015

Neo Technology, Inc Confidential
#graphconnect

Chicago Crime dataset

The goal

Chicago Crime CSV file -> imported into -> Neo4j

Exploring the data

LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/Crimes_-_2001_to_present.csv" AS row
RETURN row
LIMIT 1

Sketch a rough initial model

Import a sample: Crimes

LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/Crimes_-_2001_to_present.csv" AS row
WITH row LIMIT 100
MERGE (crime:Crime {
  id: row.ID,
  description: row.Description,
  caseNumber: row.`Case Number`,
  arrest: row.Arrest,
  domestic: row.Domestic
});

Import a sample: Crime Types

LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/Crimes_-_2001_to_present.csv" AS row
WITH row LIMIT 100
MERGE (:CrimeType { name: row.`Primary Type`});

Import a sample: Crimes -> Crime Types

LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/Crimes_-_2001_to_present.csv" AS row
WITH row LIMIT 100
MATCH (crime:Crime { id: row.ID, description: row.Description})
MATCH (crimeType:CrimeType { name: row.`Primary Type`})
MERGE (crime)-[:TYPE]->(crimeType);

Add indexes

CREATE INDEX ON :Label(property)

CREATE INDEX ON :Crime(id);
CREATE INDEX ON :CrimeType(name);
CREATE INDEX ON :Location(name);
...

Periodic Commit

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/Crimes_-_2001_to_present.csv" AS row
MERGE (crime:Crime { id: row.ID, description: row.Description});

•  Neo4j keeps all transaction state in memory, which becomes problematic for large CSV files
•  USING PERIODIC COMMIT flushes the transaction after a certain number of rows
•  Default is 1000 rows but it’s configurable
•  Currently only works with LOAD CSV
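
The batching idea behind periodic commit can be sketched in a few lines of Python. This is only an illustration of the principle, not how Neo4j implements it: the function name load_with_periodic_commit, the commit callback and the 2500 sample rows are all made up for the example. Only the default batch size of 1000 comes from the slide.

```python
from typing import Callable, Iterable, List


def load_with_periodic_commit(
    rows: Iterable[dict],
    commit: Callable[[List[dict]], None],
    batch_size: int = 1000,  # the default mentioned on the slide
) -> int:
    """Hand rows to `commit` in batches and return the number of commits."""
    pending: List[dict] = []  # stands in for in-memory transaction state
    commits = 0
    for row in rows:
        pending.append(row)
        if len(pending) == batch_size:
            commit(pending)  # flush: memory use is capped at one batch
            commits += 1
            pending = []
    if pending:  # flush the final, partial batch
        commit(pending)
        commits += 1
    return commits


# Hypothetical input: 2500 fake rows, recorded only by batch size.
committed_sizes: List[int] = []
fake_rows = ({"ID": str(i)} for i in range(2500))
n_commits = load_with_periodic_commit(
    fake_rows, lambda batch: committed_sizes.append(len(batch))
)
```

Without the periodic flush, all 2500 rows would sit in pending until the end, which is the memory problem the first bullet describes.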

LOAD CSV in summary

•  ETL power tool
•  Built into Neo4j since version 2.1
•  Can load data from any URL
•  Good for medium size data (up to 10M rows)

Bulk loading an initial data set

•  Introducing the Neo4j Import Tool
•  Find it in the bin folder of your Neo4j download
•  Used for large initial data sets
•  Skips the transactional layer of Neo4j and writes store files directly

Expects files in a certain format

Nodes

:ID(Crime)  :LABEL  description
:ID(Beat)  :LABEL

Relationships

:START_ID(Crime)  :END_ID(Beat)
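
As a rough illustration of those headers, here is a small Python sketch that renders one node file per label and one relationship file. The header fields are the ones shown on the slide. The helper to_csv and the sample values (1001, SIMPLE, 0624) are invented for the example, and the real files may well carry more columns than the slide shows.

```python
import csv
import io
from typing import List


def to_csv(header: List[str], rows: List[List[str]]) -> str:
    """Render a header line plus data rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Node files: an ID column scoped to an ID space, a label, then properties.
crimes_csv = to_csv(
    [":ID(Crime)", ":LABEL", "description"],
    [["1001", "Crime", "SIMPLE"]],  # made-up sample row
)
beats_csv = to_csv([":ID(Beat)", ":LABEL"], [["0624", "Beat"]])

# Relationship file: start and end IDs refer back to the two ID spaces.
crimes_beats_csv = to_csv(
    [":START_ID(Crime)", ":END_ID(Beat)"],
    [["1001", "0624"]],
)
```

The ID space in parentheses is what lets the relationship file say which node file each ID belongs to.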

What we have…

Neo4j ready CSV files

Translation Phase required

Translation Phase

Spark all the things

Chicago Crime CSV file -> processed by -> Spark Job -> spits out -> Neo4j ready CSV files -> imported into -> Neo4j

The Spark Job
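
The source of the job is not reproduced in this transcript. The following is a plain-Python sketch (not Spark, and not the author's code) of the kind of translation such a job has to do: split each raw row into a crime node, a de-duplicated beat node, and a relationship between them. The function translate, the inline RAW sample and the Beat column name are assumptions made for illustration.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical three-row extract; the real file has many more columns.
RAW = """ID,Description,Beat
1001,SIMPLE,0624
1002,AGGRAVATED,0624
1003,SIMPLE,1111
"""

Rows = List[List[str]]


def translate(raw_csv: str) -> Tuple[Rows, Rows, Rows]:
    """Split raw rows into crime nodes, unique beat nodes and relationships."""
    crimes: Rows = []
    beats: Rows = []
    crimes_beats: Rows = []
    seen_beats = set()
    for row in csv.DictReader(io.StringIO(raw_csv)):
        crimes.append([row["ID"], "Crime", row["Description"]])
        if row["Beat"] not in seen_beats:  # each beat becomes one node only
            seen_beats.add(row["Beat"])
            beats.append([row["Beat"], "Beat"])
        crimes_beats.append([row["ID"], row["Beat"]])
    return crimes, beats, crimes_beats


crimes, beats, crimes_beats = translate(RAW)
```

The de-duplication step matters because many crimes share a beat, while the node file should list each beat once.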

Submitting the Spark Job

./spark-1.3.0-bin-hadoop1/bin/spark-submit \
--driver-memory 5g \
--class GenerateCSVFiles \
--master local[8] \
target/scala-2.10/playground_2.10-1.0.jar

real 1m25.506s
user 8m2.183s
sys 0m24.267s


The generated files

$ ls -1 /tmp/*.csv
/tmp/beats.csv
/tmp/crimeDates.csv
/tmp/crimes.csv
/tmp/crimesBeats.csv
/tmp/crimesDates.csv
/tmp/crimesLocations.csv
/tmp/crimesPrimaryTypes.csv
/tmp/dates.csv
/tmp/locations.csv
/tmp/primaryTypes.csv

Importing into Neo4j

DATA=/tmp
NEO=./neo4j-enterprise-2.2.1
$NEO/bin/neo4j-import \
--into $DATA/crimes.db \
--nodes $DATA/crimes.csv \
--nodes $DATA/beats.csv \
--nodes $DATA/primaryTypes.csv \
--nodes $DATA/locations.csv \
--relationships $DATA/crimesBeats.csv \
--relationships $DATA/crimesPrimaryTypes.csv \
--relationships $DATA/crimesLocations.csv \
--stacktrace

IMPORT DONE in 36s 208ms

This talk brought to you by…

And that’s it…
