The social trend in the industry has shifted users’ expectations toward highly personalized experiences that present only personally relevant information. This presentation should arm you with the skills to make your system more intelligent, and with an arsenal of designs and tools to do it with. To introduce the basics, we will discuss how to implement familiar recommendations like LinkedIn’s degrees of separation, Facebook’s friend suggestions and smart news feeds, and Amazon’s product recommendations. Personal experiences from real-world Rails projects will also be shared to better understand issues with performance, scale, and the limitations of certain tools. At the end of 45 minutes you will walk away with the ability to implement personalized recommendations in your app by understanding:
* How to discover relationships in your data
* How to effectively model these relationships to infer personalized recommendations
* Successful patterns for incorporating these recommendations in your Rails application
You May Also Be Interested In: Implementing User Recommendations in Rails
Matthew Deiters
theagiledeveloper.com
Thursday, June 10, 2010
theagiledeveloper.com @mdeiters
How do they do that?
think about our data in new ways
Understand the tools to do something interesting
Use those tools in your Rails application
"Everything should be made as simple as possible, but not simpler."
- Albert Einstein
Creative Revolution
the Golden Age of Advertising
Their vision of consuming life had little to do with the actual experience of American consumers
An Academic approach
“Advertisers followed the wishes of conservative corporate clients, who wanted the safe, “scientific” advertising
Technology Changed &
Approach Changed
In the late 40’s, only .05% of Americans had a TV
End of 50’s, still only about 50%
5 years later in 1962, 90% of Americans had a TV
Advertising was comfortable “Male dominated” “academic” approach
Audience Segmentation
Similarities in Web Development Today
NoSQL
Dynamic Languages
We’ve been doing CRUD for 30 years
boring for us and the user
Our Creative Revolution
How can we change our approach
Amazon is reportedly making ~25% of sales on personalized suggestions
Net sales were $7.13 billion in the first quarter (Jan-Mar 2010); 25% of that is roughly $2 billion in 3 months
[Pie chart: 75% / 25%]
Recommendations = Money
#1 & #2 – StumbleUpon's rank among social media traffic sources in the US.
118% - the growth in active users since 2009.
Almost 10 Million registered users
400 – average number of times a user stumbles per month.
Recommendations = Traffic
Discovering the relationships in your data
Modeling the relationships
Using graphs in your Rails App
Precious
Dick Cheney
President Obama “The greatest thing about Facebook, is that you can quote something and totally make up the source.” - George Washington
John Adams lol...so true
http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/
Social Networking
Content
Website Analytics
Predictive Analysis
Discovering the relationships in your data
Modeling the relationships
Using graphs in your Rails App
Relational Databases != works
SQL is set based
Lady Gaga
Kevin Bacon
Example: finding people 2 degrees away from a user in SQL
- 100 people: executes in 0.01 sec
- 1000 people: executes in 0.1 sec
- 4000 people: executes in 17.78 sec
- 60K people: over an hour
SQL Smell
Courtesy of http://techportal.ibuildings.com/2009/09/07/graphs-in-the-database-sql-meets-social-networks/
WITH RECURSIVE transitive_closure(a, b, distance, path_string) AS
( SELECT a, b, 1 AS distance,
         a || '.' || b || '.' AS path_string,
         b AS direct_connection
  FROM edges2
  WHERE a = 1 -- set the starting node

  UNION ALL

  SELECT tc.a, e.b, tc.distance + 1,
         tc.path_string || e.b || '.' AS path_string,
         tc.direct_connection
  FROM edges2 AS e
  JOIN transitive_closure AS tc ON e.a = tc.b
  WHERE tc.path_string NOT LIKE '%' || e.b || '.%'
    AND tc.distance < 3
)
SELECT * FROM transitive_closure
--WHERE b=3 -- set the target node
ORDER BY a,b,distance
SELECT b FROM (
  WITH RECURSIVE transitive_closure(a, b, distance, path_string) AS
  ( SELECT a, b, 1 AS distance,
           a || '.' || b || '.' AS path_string
    FROM edges2
    WHERE a = 1 -- set the starting node

    UNION ALL

    SELECT tc.a, e.b, tc.distance + 1,
           tc.path_string || e.b || '.' AS path_string
    FROM edges2 AS e
    JOIN transitive_closure AS tc ON e.a = tc.b
    WHERE tc.path_string NOT LIKE '%' || e.b || '.%'
      AND tc.distance = 0
  )
  SELECT b FROM transitive_closure
  UNION ALL
  (WITH RECURSIVE transitive_closure(a, b, distance, path_string) AS
   ( SELECT a, b, 1 AS distance,
            a || '.' || b || '.' AS path_string
     FROM edges2
     WHERE a = 4 -- set the target node

     UNION ALL

     SELECT tc.a, e.b, tc.distance + 1,
            tc.path_string || e.b || '.' AS path_string
     FROM edges2 AS e
     JOIN transitive_closure AS tc ON e.a = tc.b
     WHERE tc.path_string NOT LIKE '%' || e.b || '.%'
       AND tc.distance = 0
   )
   SELECT b FROM transitive_closure)
) AS immediate_connections
GROUP BY b
HAVING COUNT(b) > 1;
2 degrees / more than 2 degrees
ID  First     Last
1   Luke      Skywalker
2   Darth     Vader
3   Princess  Leia

How is Luke connected to Princess Leia?
Graphs
Relationships are a First Class Citizen just like the data

ID  First     Last
1   Luke      Skywalker
2   Darth     Vader
3   Princess  Leia

Father
Father
Rows Nodes

ID  First     Last
1   Luke      Skywalker
2   Darth     Vader
3   Princess  Leia

Father
Father

Node, Point, Actor, Vertex
Edges

ID  First     Last
1   Luke      Skywalker
2   Darth     Vader
3   Princess  Leia

Father
Father

Edge, Relationship, Arc, Link
Less complex, 100% natural
Luke
Princess
Darth
Discovering the relationships in your data
Modeling the relationships
Using graphs in your Rails App
In Memory Ruby Graph
Great for small static datasets or ad hoc querying
Rails App
RGL
RGL
Rails App
BackgrounDRb
RGL
or
# http://github.com/fmeyer/rgl
require 'rubygems'
require 'rgl/adjacency'
require 'rgl/dot'

graph = RGL::DirectedAdjacencyGraph.new
graph.add_edge 'mary', 'john'
graph.add_edge 'mary', 'henery'
graph.add_edge 'john', 'frank'
graph.add_edge 'frank', 'henery'

# Use DOT to visualize this graph:
# http://graphviz.org/Download_macos.php
graph.write_to_graphic_file('jpg')
`open graph.jpg`
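To see what an in-memory graph buys you without installing anything, here is a rough plain-Ruby sketch using the same edges as the RGL example. The Hash-of-Arrays layout and the `build_graph` and `within_degrees` helpers are assumptions made for illustration; they are not part of RGL's API.

```ruby
# Same directed edges as the RGL example.
EDGES = [
  %w[mary john],
  %w[mary henery],
  %w[john frank],
  %w[frank henery]
].freeze

# Build a Hash of node => array of outgoing neighbours.
def build_graph(edges)
  graph = Hash.new { |hash, node| hash[node] = [] }
  edges.each do |from, to|
    graph[from] << to
    graph[to] # touch the target so it exists as a node too
  end
  graph
end

# Hypothetical helper: every node reachable from `start` in at most
# `max_degrees` hops, mapped to the hop count at which it was first seen.
def within_degrees(graph, start, max_degrees)
  distances = { start => 0 }
  frontier = [start]
  max_degrees.times do
    next_frontier = []
    frontier.each do |node|
      graph.fetch(node, []).each do |neighbour|
        next if distances.key?(neighbour)
        distances[neighbour] = distances[node] + 1
        next_frontier << neighbour
      end
    end
    frontier = next_frontier
  end
  distances.reject { |node, _| node == start }
end

graph = build_graph(EDGES)
p within_degrees(graph, 'mary', 2)
```

Each extra degree only touches the new frontier, which is why this kind of traversal stays cheap where the recursive SQL version does not.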
20,000 nodes and 1 million edges = 300 MB
Persistence & Dynamic Data
Neo4j
Neo4j.rb
Neo4j
Rails App
Neo4j.rb
Lucene
class Person
  include Neo4j::NodeMixin
  property :name
  has_n :friends
end

Neo4j.start
Neo4j::Transaction.run do
  andreas = Person.new :name => 'andreas'
  john = Person.new :name => 'John', :age => 30
  andreas.friends << john
end
Neo4jr-social
More REST, Less Java
Rails App
Neo4jr-social
Neo4j
Memcache
RDBMS
Solr
Lucene
Neo4jr-social
The SOLR of Graphs (uses Neo4jr-simple)
#> sudo gem install neo4jr-social
#> start-neo4jr-social
No JRuby Required
Packaged as self-contained Jetty webserver
Deployable WAR
Focused on SNA
Basic Built in Querying
Extensible ~/.neo4jr-social
Facebook’s Friend Suggestions
Demo
Johnathan
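The demo itself is not captured in these slides. As a stand-in, here is a minimal sketch of one common way to produce friend suggestions: look at friends of friends and rank them by the number of mutual friends. The friendship data and the `friend_index` and `suggest_friends` helpers are invented for illustration and are not the neo4jr-social API.

```ruby
# Undirected friendships; all names besides Johnathan are made up.
FRIENDSHIPS = [
  %w[johnathan alice],
  %w[johnathan bob],
  %w[alice carol],
  %w[bob carol],
  %w[bob dave],
  %w[carol erin]
].freeze

# Index each person to the list of their direct friends.
def friend_index(friendships)
  index = Hash.new { |hash, person| hash[person] = [] }
  friendships.each do |a, b|
    index[a] << b
    index[b] << a
  end
  index
end

# Hypothetical helper: people two hops away who are not already friends,
# ranked by how many mutual friends they share with `person`.
def suggest_friends(index, person)
  friends = index.fetch(person, [])
  mutual_counts = Hash.new(0)
  friends.each do |friend|
    index.fetch(friend, []).each do |candidate|
      next if candidate == person || friends.include?(candidate)
      mutual_counts[candidate] += 1
    end
  end
  # Most mutual friends first; ties broken alphabetically.
  mutual_counts.sort_by { |name, count| [-count, name] }
end

index = friend_index(FRIENDSHIPS)
suggest_friends(index, 'johnathan').each do |name, mutual|
  puts "#{name} (#{mutual} mutual friends)"
end
```

Here carol is suggested ahead of dave because she is reachable through both alice and bob.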
LinkedIn’s Degrees of Separation
Demo
Johnathan
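Degrees of separation can be read as the length of the shortest chain of connections between two people. A minimal sketch using breadth-first search in plain Ruby follows; the connection data and both helpers are hypothetical and only illustrate the idea behind the demo, not the demo's actual code.

```ruby
# Undirected connections; all names besides Johnathan are invented.
CONNECTIONS = {
  'johnathan' => %w[alice bob],
  'alice'     => %w[johnathan carol],
  'bob'       => %w[johnathan],
  'carol'     => %w[alice dave],
  'dave'      => %w[carol],
  'zed'       => []
}.freeze

# Hypothetical helper: breadth-first search returning the shortest chain
# of people from `from` to `to`, or nil when they are not connected.
def connection_path(connections, from, to)
  previous = { from => nil }
  queue = [from]
  until queue.empty?
    person = queue.shift
    if person == to
      # Walk the `previous` links back to the start to rebuild the chain.
      path = []
      while person
        path.unshift(person)
        person = previous[person]
      end
      return path
    end
    connections.fetch(person, []).each do |neighbour|
      next if previous.key?(neighbour)
      previous[neighbour] = person
      queue << neighbour
    end
  end
  nil
end

# Number of hops between two people, or nil if there is no path.
def degrees_of_separation(connections, from, to)
  path = connection_path(connections, from, to)
  path && path.length - 1
end

p connection_path(CONNECTIONS, 'johnathan', 'dave')
p degrees_of_separation(CONNECTIONS, 'johnathan', 'dave')
```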
http://wiki.github.com/tinkerpop/gremlin/
SQL
Assembly Required
Graphs (Neo4j)
Works out of the Box
AllSimplePaths & ShortestPath
(Who do we follow on Twitter in common)
Dijkstra
[Diagram: graph annotated with scores 20, 5, 5, 5, 5, 5, 10, 10]
(Who do we follow on Twitter in common - on steroids)
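A minimal sketch of Dijkstra's algorithm in plain Ruby, assuming each score is a non-negative cost and the lowest total wins. The node names, the graph layout, and the `cheapest_path` helper are invented for illustration; only the score values echo the slide.

```ruby
# Weighted, directed edges: node => { neighbour => score }.
WEIGHTED = {
  'a' => { 'b' => 20, 'c' => 5 },
  'b' => { 'd' => 5 },
  'c' => { 'b' => 5, 'd' => 10 },
  'd' => {}
}.freeze

# Dijkstra's algorithm with a simple linear scan for the next node.
# Returns [total_cost, path], or nil if `target` is unreachable.
def cheapest_path(graph, source, target)
  cost = Hash.new(Float::INFINITY)
  cost[source] = 0
  previous = {}
  unvisited = graph.keys

  until unvisited.empty?
    current = unvisited.min_by { |node| cost[node] }
    break if cost[current] == Float::INFINITY # the rest is unreachable
    unvisited.delete(current)
    break if current == target

    graph[current].each do |neighbour, score|
      candidate = cost[current] + score
      if candidate < cost[neighbour]
        cost[neighbour] = candidate
        previous[neighbour] = current
      end
    end
  end

  return nil if cost[target] == Float::INFINITY

  # Follow the `previous` links back from the target to the source.
  path = [target]
  path.unshift(previous[path.first]) while previous.key?(path.first)
  [cost[target], path]
end

p cheapest_path(WEIGHTED, 'a', 'd')
```

The direct edge from a to b costs 20, so the search prefers the detour through c. The linear `min_by` scan keeps the sketch short; a priority queue would be the usual choice for a large graph.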
Closeness Centrality
(Who has the most followers on Twitter)
Betweenness Centrality
http://www.stoweboyd.com/message/its-betweenness-that-matters-not-your-eigenvalue-the-dark-ma.html
(Who has more influential people following them on Twitter)
http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/
Eigenvector Centrality
(PageRank)
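PageRank is a variant of eigenvector centrality: an account scores highly when high-scoring accounts point at it. Below is a simplified power-iteration sketch in plain Ruby. The follower data is invented, and the `damping` and `iterations` defaults are conventional values chosen here, not numbers from the talk.

```ruby
# Directed "follows" edges: follower => accounts they follow.
FOLLOWS = {
  'ann'  => %w[bob cara],
  'bob'  => %w[cara],
  'cara' => %w[ann],
  'dan'  => %w[cara]
}.freeze

# Simplified PageRank by power iteration. Ranks always sum to 1.
def pagerank(graph, damping: 0.85, iterations: 50)
  nodes = (graph.keys + graph.values.flatten).uniq
  rank = nodes.to_h { |node| [node, 1.0 / nodes.size] }

  iterations.times do
    # Every node starts each round with the "random jump" share.
    next_rank = nodes.to_h { |node| [node, (1.0 - damping) / nodes.size] }
    nodes.each do |node|
      targets = graph.fetch(node, [])
      # A node that follows nobody spreads its rank evenly over everyone.
      targets = nodes if targets.empty?
      share = damping * rank[node] / targets.size
      targets.each { |target| next_rank[target] += share }
    end
    rank = next_rank
  end
  rank
end

pagerank(FOLLOWS).sort_by { |_, score| -score }.each do |name, score|
  puts format('%-5s %.3f', name, score)
end
```

In this toy data ann ends up close behind cara despite having a single follower, because that follower is the highest-ranked account.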
Performance
Sparse vs Dense
Person
started_on => 'Jan 1 2010'
finished_on => 'Jan 7 2010'
hours => 40
type => 'worked_on'
Project
Metadata
Person
started_on => 1265639038
finished_on => 1266243897
hours => 40
type => 'worked_on_reorg'
Project
Metadata
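One reading of the two metadata slides is that dates stored as integer epoch seconds are cheaper to compare than date strings. A small sketch of that idea follows, assuming UTC dates; the property Hash and the `active_at?` helper are hypothetical and not part of Neo4j.rb.

```ruby
# Relationship properties with dates stored as integer epoch seconds (UTC),
# so range checks become plain integer comparisons.
worked_on = {
  started_on:  Time.utc(2010, 1, 1).to_i,
  finished_on: Time.utc(2010, 1, 7).to_i,
  hours:       40,
  type:        'worked_on'
}

# Hypothetical helper: was the relationship active at the given instant?
def active_at?(relationship, time)
  (relationship[:started_on]..relationship[:finished_on]).cover?(time.to_i)
end

puts active_at?(worked_on, Time.utc(2010, 1, 3))
puts active_at?(worked_on, Time.utc(2010, 2, 1))
```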
Breadth or Depth
What Next...
Inspired to think about your data in new ways
Graphs
Understand the tools to do something interesting with it
Use these tools in your Rails application
NoSQL Augmentation
Application
Memcache
RDBMS
Neo4j
Solr
Lucene
Creative Revolution
Questions?
Links
• http://gist.github.com/431000
• http://github.com/andreasronge/neo4j
• http://github.com/mdeiters/neo4jr-simple
• http://github.com/mdeiters/neo4jr-social
• http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/
• http://techportal.ibuildings.com/2009/09/07/graphs-in-the-database-sql-meets-social-networks/