Presentation given at the national PHP conference in Poland, in Kielce, October 2011, dealing with the introduction of graph databases in PHP, taking a practical look at OrientDB.
David Funaro, Alessandro Nadalin
GraphDB in PHP
Agenda
•Theory
•When to use a graph?
•Why graphDB?
•The graphDB community
•OrientDB
•OrientDB in PHP
•Demo
Essential (Theory)
Graph G = (V, E)
V: the set of vertices (for example, vertex A)
E: the set of edges
Binary Relation
Itchy -- Hates -- Scratchy
Vertex A -- Edge -- Vertex B
Graph
[Figure: a graph with vertices A, B, D, E, F, G]
Undirected Graph
Example: Friendship
[Figure: an undirected graph with vertices A, B, D, E, F]
Directed Edge
[Figure: Vertex A -> Edge -> Vertex B]
Directed Graph
Example: Followee
[Figure: a directed graph with vertices A, B, D, F]
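The difference between the friendship (undirected) and followee (directed) examples can be sketched in plain PHP. This is only an illustration: the function name buildAdjacency and the edge list are assumptions, since the slides show the vertex labels but not which vertices are connected.

```php
<?php
// Build adjacency lists from a vertex set V and an edge list E.
// The edges below are hypothetical; the slides only show the vertex labels.
function buildAdjacency(array $vertices, array $edges, bool $directed): array
{
    $adjacency = array_fill_keys($vertices, []);
    foreach ($edges as [$from, $to]) {
        $adjacency[$from][] = $to;
        if (!$directed) {
            // Friendship is mutual, so the edge is stored in both directions.
            $adjacency[$to][] = $from;
        }
    }
    return $adjacency;
}

$vertices = ['A', 'B', 'D', 'F'];
$edges = [['A', 'B'], ['A', 'D'], ['D', 'F']];

$friends = buildAdjacency($vertices, $edges, false);   // undirected
$followees = buildAdjacency($vertices, $edges, true);  // directed

echo implode(',', $friends['D']), PHP_EOL;   // A,F
echo implode(',', $followees['D']), PHP_EOL; // F
```

In the undirected case D sees both A and F; in the directed case D only sees the vertex it points to.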
Path
[Figure: a path through a graph with vertices A, B, D, E, F, G]
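A path is a sequence of vertices connected by edges. A minimal sketch of finding one with a breadth-first search follows; the function name findPath and the edges are assumptions made for the example, because the slide does not list them.

```php
<?php
// Breadth-first search: returns one path with the fewest edges, or null.
function findPath(array $adjacency, string $start, string $goal): ?array
{
    $previous = [$start => null]; // predecessor of each visited vertex
    $queue = new SplQueue();
    $queue->enqueue($start);

    while (!$queue->isEmpty()) {
        $current = $queue->dequeue();
        if ($current === $goal) {
            // Walk the predecessor links back to the start.
            $path = [];
            for ($v = $goal; $v !== null; $v = $previous[$v]) {
                array_unshift($path, $v);
            }
            return $path;
        }
        foreach ($adjacency[$current] ?? [] as $next) {
            if (!array_key_exists($next, $previous)) {
                $previous[$next] = $current;
                $queue->enqueue($next);
            }
        }
    }
    return null;
}

// Hypothetical directed edges between the vertices shown on the slide.
$adjacency = [
    'A' => ['B'],
    'B' => ['D', 'E'],
    'D' => ['G'],
    'E' => ['F'],
    'F' => [],
    'G' => ['F'],
];

$path = findPath($adjacency, 'A', 'F');
echo implode(' -> ', $path), PHP_EOL; // A -> B -> E -> F
```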
Graph -> GraphDB
A GraphDB is a database that uses a graph as its primary data structure
... when to use a graph?
Web in ’99
Web in 2005
The social web
Your data is a graph
a tree is a graph
parent_id is a graph
Recommendations
[Figure, built up over several slides: a graph used to recommend a cinema to John]
Vertices: John; Rome, Milan; Cinema A, Cinema B, Cinema C; Se7en, Mr Bean; Thriller, Fun
Edges: lives in, location, type, likes, shows
At each step of the traversal, branches that do not match are marked x and branches that match are marked ✓
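The recommendation above can be sketched as a traversal over labelled edges. The vertex names and edge labels come from the slide; which vertex connects to which is an assumption, and the helper names (outgoing, incoming, recommendCinemas) are made up for this example.

```php
<?php
// Edges as [from, label, to]. Labels and vertices come from the slide;
// the concrete connections are assumed for this sketch.
$edges = [
    ['John', 'lives in', 'Rome'],
    ['John', 'likes', 'Thriller'],
    ['Cinema A', 'location', 'Rome'],
    ['Cinema B', 'location', 'Rome'],
    ['Cinema C', 'location', 'Milan'],
    ['Cinema A', 'shows', 'Mr Bean'],
    ['Cinema B', 'shows', 'Se7en'],
    ['Cinema C', 'shows', 'Se7en'],
    ['Se7en', 'type', 'Thriller'],
    ['Mr Bean', 'type', 'Fun'],
];

// Targets of the edges leaving $from with the given label.
function outgoing(array $edges, string $from, string $label): array
{
    $targets = [];
    foreach ($edges as [$f, $l, $t]) {
        if ($f === $from && $l === $label) {
            $targets[] = $t;
        }
    }
    return $targets;
}

// Sources of the edges entering $to with the given label.
function incoming(array $edges, string $to, string $label): array
{
    $sources = [];
    foreach ($edges as [$f, $l, $t]) {
        if ($t === $to && $l === $label) {
            $sources[] = $f;
        }
    }
    return $sources;
}

// Cinemas in the person's city that show a movie of a type the person likes.
function recommendCinemas(array $edges, string $person): array
{
    $likedTypes = outgoing($edges, $person, 'likes');
    $result = [];
    foreach (outgoing($edges, $person, 'lives in') as $city) {
        foreach (incoming($edges, $city, 'location') as $cinema) {
            foreach (outgoing($edges, $cinema, 'shows') as $movie) {
                if (array_intersect(outgoing($edges, $movie, 'type'), $likedTypes)) {
                    $result[] = $cinema;
                }
            }
        }
    }
    return array_values(array_unique($result));
}

$recommended = recommendCinemas($edges, 'John');
echo implode(', ', $recommended), PHP_EOL; // Cinema B
```

With these assumed edges, Cinema A is discarded because it shows a "Fun" movie, and Cinema C because it is in Milan.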
Solve decision problems
Maximum flow
Given a dataset, calculate how to best organize it
Travelling salesman problem
The pizza guy needs to deliver to A, B and C.
The decision is based on distance, traffic, time and so on.
Shortest path
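A shortest-path query can be sketched with Dijkstra's algorithm. Note that this answers "what is the cheapest route to each address", not the full travelling salesman problem. The function name shortestDistances, the vertex "Pizzeria" and the weights (minutes) are hypothetical values chosen for the example.

```php
<?php
// Dijkstra's algorithm over a weighted adjacency list.
// Returns the cheapest cost from $source to every reachable vertex.
function shortestDistances(array $graph, string $source): array
{
    $distances = [$source => 0];
    $queue = new SplPriorityQueue();
    $queue->setExtractFlags(SplPriorityQueue::EXTR_BOTH);
    // SplPriorityQueue pops the highest priority first, so costs are negated.
    $queue->insert($source, 0);

    while (!$queue->isEmpty()) {
        $entry = $queue->extract();
        $vertex = $entry['data'];
        $cost = -$entry['priority'];
        if ($cost > $distances[$vertex]) {
            continue; // stale entry: a cheaper route was already found
        }
        foreach ($graph[$vertex] ?? [] as $neighbour => $weight) {
            $candidate = $cost + $weight;
            if (!isset($distances[$neighbour]) || $candidate < $distances[$neighbour]) {
                $distances[$neighbour] = $candidate;
                $queue->insert($neighbour, -$candidate);
            }
        }
    }
    return $distances;
}

// Hypothetical travel times in minutes between the pizzeria and A, B, C.
$graph = [
    'Pizzeria' => ['A' => 4, 'B' => 10],
    'A' => ['B' => 3, 'C' => 9],
    'B' => ['C' => 2],
    'C' => [],
];

$distances = shortestDistances($graph, 'Pizzeria');
print_r($distances); // Pizzeria 0, A 4, B 7, C 9
```

The edge weight is where distance, traffic and time would be combined into a single cost.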
Identify "special" nodes of the graph
Given your dataset, organize some clusters
Are there some nodes which cannot belong to a cluster?
They probably have some properties different from the average
ACHTUNG! TERRORISTEN!
but ... why graphDB?
http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
Representing a Graph in:
✓Relational Database (MySQL, Oracle)
✓Document Oriented DB (MongoDB, CouchDB)
✓XML Database (MarkLogic, eXist-db)
where is the difference?
GraphDB
A graph database is any storage system that provides index-free adjacency.
http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
Step by step example
Given a list of people, find their homepages
Tree-based DB WAY
1. Take a name: David Funaro
2. Put it in the search engine
3. Find http://davidfunaro.com
The cost of finding a single friend's homepage grows as the friends' homepage table grows
GraphDB WAY
It is as if the GraphDB held an additional piece of information (the anchor <a>)
1. Get the embedded information (index): www.odino.org
<a href="http://odino.org">Alessandro Nadalin</a>
The anchor works as a local index to reach the document = index-free adjacency
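The two approaches can be contrasted in a small sketch. The first function looks a name up in a separate sorted table (a stand-in for an index, whose cost grows with the table); the second simply follows a reference stored on the record itself. The names findHomepageViaIndex, Person and Homepage are hypothetical and not part of any library.

```php
<?php
// Index-based way: homepages live in a separate table sorted by name.
// Each lookup is a binary search, so its cost grows with the table size.
function findHomepageViaIndex(array $sortedTable, string $name): ?string
{
    $low = 0;
    $high = count($sortedTable) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $cmp = strcmp($sortedTable[$mid]['name'], $name);
        if ($cmp === 0) {
            return $sortedTable[$mid]['homepage'];
        }
        if ($cmp < 0) {
            $low = $mid + 1;
        } else {
            $high = $mid - 1;
        }
    }
    return null;
}

// Graph way: the person record holds a direct reference to the homepage,
// like the anchor in the slide. Following it does not depend on graph size.
class Homepage
{
    public $url;

    public function __construct(string $url)
    {
        $this->url = $url;
    }
}

class Person
{
    public $name;
    public $homepage;

    public function __construct(string $name, Homepage $homepage)
    {
        $this->name = $name;
        $this->homepage = $homepage;
    }
}

$table = [
    ['name' => 'Alessandro Nadalin', 'homepage' => 'http://odino.org'],
    ['name' => 'David Funaro', 'homepage' => 'http://davidfunaro.com'],
];
$viaIndex = findHomepageViaIndex($table, 'David Funaro');

$david = new Person('David Funaro', new Homepage('http://davidfunaro.com'));
$viaReference = $david->homepage->url;

echo $viaIndex, PHP_EOL;     // http://davidfunaro.com
echo $viaReference, PHP_EOL; // http://davidfunaro.com
```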
Local cost
The local cost is O(k) = constant
Thus, as the graph grows in size, the cost of a local step remains the same
Any database can implicitly represent a graph,
BUT only a graph database makes the graph structure explicit
Benchmark
• 1 million vertices
• 4 million edges
• Scale-free topology
• Postgres vs Neo4j
• Both Hash and BTree
Depth   RDBMS      Graph
1       100ms      30ms
2       1000ms     500ms
3       10000ms    3000ms
4       100000ms   50000ms
5       N/A        100000ms
http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/
Databases
The community that is building and feeding the GraphDB ecosystem
TinkerPop Stack
GraphDB community
Blueprints: the data model and its implementations
Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to JDBC, but for graph databases.
https://github.com/tinkerpop/blueprints/wiki/
Pipes: a data flow framework using process graphs
Provides a collection of "pipes" that are connected together to form processing pipelines
Gremlin: a graph-based programming language
A Turing-complete graph-based programming language that compiles Gremlin syntax down to Pipes
Rexster: a RESTful graph shell
Allows Blueprints graphs to be exposed through a RESTful API (HTTP)
What's hot
OrientDB
Glossary
RID: <10:05> (Cluster:Position)
CLASS
Main features
Inheritance
class Vehicle, with subclasses class Bike and class Car
SELECT FROM Vehicle WHERE owner = 1:1
can return records of class Bike or Car
Traversal
SELECT FROM fellas WHERE any() traverse(0,-1) ( @rid = [Michelle @rid] )
SELECT FROM fellas WHERE any() traverse(0,2) ( @rid = [Michelle @rid] )
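One way to read the two queries is that the arguments of traverse bound the depth of the walk, with -1 meaning "no upper limit"; that reading is an assumption here, not something the slides state. Under that assumption, a depth-bounded traversal can be sketched as follows. The function name verticesWithinDepth and the sample people (other than Michelle) are made up.

```php
<?php
// Returns every vertex whose distance from $start lies between $minDepth
// and $maxDepth (inclusive). A $maxDepth of -1 means "no upper limit".
function verticesWithinDepth(array $links, string $start, int $minDepth, int $maxDepth): array
{
    $depths = [$start => 0];
    $queue = [$start];
    while ($queue) {
        $current = array_shift($queue);
        $depth = $depths[$current];
        if ($maxDepth !== -1 && $depth >= $maxDepth) {
            continue; // do not expand past the depth limit
        }
        foreach ($links[$current] ?? [] as $next) {
            if (!isset($depths[$next])) {
                $depths[$next] = $depth + 1;
                $queue[] = $next;
            }
        }
    }
    $inRange = array_filter($depths, function ($d) use ($minDepth) {
        return $d >= $minDepth;
    });
    return array_keys($inRange);
}

// Hypothetical "fellas": Luca knows Anna, Anna knows Marco, Marco knows Michelle.
$fellas = [
    'Luca' => ['Anna'],
    'Anna' => ['Marco'],
    'Marco' => ['Michelle'],
    'Michelle' => [],
];

$unbounded = verticesWithinDepth($fellas, 'Luca', 0, -1);
$bounded = verticesWithinDepth($fellas, 'Luca', 0, 2);

var_dump(in_array('Michelle', $unbounded, true)); // bool(true)
var_dump(in_array('Michelle', $bounded, true));   // bool(false)
```

With no depth limit Luca reaches Michelle; with a limit of 2 the walk stops at Marco.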
SQL syntax
beyond SQL
SELECT FROM authors WHERE book.title = ...
ACID
speaks JSON
{ "schema": { "name": "Address" }, "result": [{ "@type": "d", "@rid": "#13:0", "@version": 6, "@class": "Address", "type": "Residence", "street": "Piazza Navona, 1", "city": "#14:0", "nick": "Luca2" }, { ... ...
Double Protocol
HTTP: universal, easy to interact with
binary: blazing fast
on-record SELECTs
SELECT FROM cats
SELECT FROM 11:0
SELECT FROM [11:0,11:1]
SELECT FROM [11:0,12:0]
stress-free setup
2 MB
./orient/bin/server.sh
in-memory DB or disk-persisted
Supports standards
OrientDB
•Inheritance
•Traversal
•SQL-like syntax
•ACID
•Speaks JSON
•Double protocol
•on-record SELECT
•TinkerPop compliant
Oh, it's Java.
PHP?
Somebody started writing the binary-protocol binding
https://github.com/AntonTerekhov/OrientDB-PHP (beta 0.4.1, 28 April 2010)
$db = new OrientDB($host, $port);
$record = $db->recordLoad('1:1', '*:-1');
// $record instance of OrientDBRecord
and others
Orient Library
... are writing a complete library
https://github.com/congow/Orient
Orient = PHP Library to work with OrientDB
Data Mapper
Query Builder
HTTP Binding
HTTP Binding
use Congow\Orient;
use Congow\Orient\Foundation\Binding;

// HTTP client used by the binding to talk to the server
$driver = new Orient\Http\Client\Curl();
// host, port, user, password, database
$orient = new Binding($driver, '127.0.0.1', '2480', 'admin', 'admin', 'demo');

// Run a query and decode the JSON body of the response
$response = $orient->query("SELECT FROM Address");
$output = json_decode($response->getBody());

// Each entry of "result" is one record
foreach ($output->result as $address) {
    var_dump($address->street);
}
{ "schema": { "name": "Address" }, "result": [{ "@type": "d", "@rid": "#13:0", "@version": 6, "@class": "Address", "type": "Residence", "street": "Piazza Navona, 1", "city": "#14:0", "nick": "Luca2" }, { ... ...
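A decoded response of this shape can be handled with plain PHP. The sketch below uses only the first record shown above (the rest of the list is elided in the slides) and needs no server; splitting the RID into cluster and position follows the glossary earlier in the talk.

```php
<?php
// One record copied from the response above; the remaining records are
// elided in the slides and therefore left out here.
$body = '{"schema": {"name": "Address"}, "result": [{"@type": "d", '
    . '"@rid": "#13:0", "@version": 6, "@class": "Address", '
    . '"type": "Residence", "street": "Piazza Navona, 1", '
    . '"city": "#14:0", "nick": "Luca2"}]}';

$output = json_decode($body);

foreach ($output->result as $record) {
    // Property names starting with "@" need the brace syntax on objects.
    $rid = $record->{'@rid'};
    // A RID has the form #cluster:position.
    [$cluster, $position] = explode(':', ltrim($rid, '#'));
    printf(
        "%s (cluster %s, position %s): %s\n",
        $rid,
        $cluster,
        $position,
        $record->street
    );
}
// prints: #13:0 (cluster 13, position 0): Piazza Navona, 1
```

The "city" field holds another RID (#14:0), that is, a link to a different record rather than an embedded value.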
apart from ->query($SQL)
->get|delete|postClass($class)
->post|delete|put|getDocument($rid)
...and much more!
(connect, disconnect, ...)
Query Builder
use Congow\Orient\Query;

$query = new Query();
$query->from(array('users'))->where('username = ?', "admin");

echo $query->getRaw(); // SELECT FROM users WHERE username = "admin"
$query->select(array('name', 'username', 'email'), false)
      ->from(array('12:0', '12:1'), false)
      ->where('any() traverse ( any() like "%danger%" )')
      ->orWhere("1 = ?", 1)
      ->andWhere("links = ?", 1)
      ->limit(20)
      ->orderBy('username')
      ->orderBy('name', true, true)
      ->range("12:0", "12:1");

SELECT name, username, email FROM [12:0, 12:1] WHERE any() traverse ( any() like "%danger%" ) OR 1 = "1" AND links = "1" ORDER BY name, username LIMIT 20 RANGE 12:0 12:1
Data Mapper
A Doctrine2 strange ODM
namespace Poland\PHPCon\Entity;

use Congow\Orient\ODM\Mapper\Annotations as ODM;

/**
 * Maps this PHP class to an OrientDB class.
 *
 * @ODM\Document(class="Person")
 */
class Speaker
{
    /**
     * Mapped as a string property of the document.
     *
     * @ODM\Property(type="string")
     */
    protected $name;

    public function setName($name)
    {
        $this->name = $name;
    }
}
Domain Driven Design
{ "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "David Coallier" }, { ... ...
$david = $mapper->hydrate(json_decode($speaker));
$david instanceof Poland\PHPCon\Entity\Speaker
Repository Pattern
$repo = $manager->getRepository('Speaker');
$speakers = $repo->findAll();
$speaker = $repo->find($rid);
$criteria = array('Name' => 'Lorna');
$lornas = $repo->findBy($criteria);
$criteria = array( 'Name' => 'Lorna', 'last_name' => 'Jane');
$lornaJ = $repo->findOneBy($criteria);
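To show what these four calls are expected to return, here is a hypothetical in-memory repository. It only mirrors the method names used above; it is not the Orient library's implementation, and the sample records (apart from the names that appear in the slides) are invented.

```php
<?php
// Hypothetical in-memory repository mirroring find, findAll, findBy, findOneBy.
class InMemoryRepository
{
    private $records; // records keyed by RID

    public function __construct(array $records)
    {
        $this->records = $records;
    }

    public function findAll(): array
    {
        return array_values($this->records);
    }

    public function find(string $rid): ?array
    {
        return $this->records[$rid] ?? null;
    }

    // Every record whose fields match all the given criteria.
    public function findBy(array $criteria): array
    {
        $matches = array_filter($this->records, function (array $record) use ($criteria) {
            foreach ($criteria as $field => $value) {
                if (($record[$field] ?? null) !== $value) {
                    return false;
                }
            }
            return true;
        });
        return array_values($matches);
    }

    // The first matching record, or null when nothing matches.
    public function findOneBy(array $criteria): ?array
    {
        $matches = $this->findBy($criteria);
        return $matches[0] ?? null;
    }
}

// Invented sample data for the sketch.
$repo = new InMemoryRepository([
    '#1:0' => ['name' => 'Lorna', 'last_name' => 'Jane'],
    '#1:1' => ['name' => 'Lorna', 'last_name' => 'Smith'],
    '#1:2' => ['name' => 'David', 'last_name' => 'Coallier'],
]);

$speakers = $repo->findAll();
$speaker = $repo->find('#1:2');
$lornas = $repo->findBy(['name' => 'Lorna']);
$lornaJ = $repo->findOneBy(['name' => 'Lorna', 'last_name' => 'Jane']);

echo count($speakers), ' ', count($lornas), ' ', $lornaJ['last_name'], PHP_EOL; // 3 2 Jane
```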
Know your boundaries
https://github.com/doctrine/common/tree/master/lib/Doctrine/Common/Persistence
Theory sucks.
Demo
Menu items in RDBMS
id   type      page   url
1    external  NULL   http://www.google.com
2    page      1      NULL
Menu items in OrientDB
ExternalLink
rid   title   url
8:2   google  google.com
PageLink
rid   title   page
9:1   home    1
Class hierarchy: Link, with subclasses PageLink and ExternalLink
That’s all, folks!
David Funaro, @ingdavidino, http://davidfunaro.com
Alessandro Nadalin, @_odino_, http://odino.org