DESCRIPTION
From Amazon, to Spotify, to thermostats, recommendation systems are everywhere. The ability to provide recommendations for your users is becoming a crucial feature for modern applications. In this talk I'll show you how you can use Ruby to build recommendation systems for your users. You don't need a PhD to build a simple recommendation engine -- all you need is Ruby. Together we'll dive into the dark arts of machine learning and you'll discover that writing a basic recommendation engine is not as hard as you might have imagined. Using Ruby I'll teach you some of the common algorithms used in recommender systems, such as: Collaborative Filtering, K-Nearest Neighbor, and Pearson Correlation Coefficient. At the end of the talk you should be on your way to writing your own basic recommendation system in Ruby.
People who liked this talk also liked … Building Recommendation Systems
Using Ruby
Ryan Weald, @rweald
LA RubyConf 2013
1
Who is this guy?
What does he know about recommendation
systems?
2
Data Scientist @Sharethrough
Native advertising platform
3
4
Outline
1) What is a recommendation system?
2) Collaborative filtering based recommendations
3) Content based recommendations
4) Hybrid systems - the best of both worlds
5) Evaluating your recommendation system
6) Resources & existing libraries
5
What this Talk is Not
• Everything there is to know about recommendation systems.
• Bleeding edge machine learning
• How to use a specific library
6
What is a recommendation system?
7
A program that predicts a user’s preferences using information about the user, other users, and the
items in your system.
8
9
Netflix
10
Spotify
11
Amazon
12
How do I build recommendations?
13
Two Main Categories of Algorithm
1. Collaborative Filtering (CF)
2. Content Based - Classification
14
Collaborative Filtering
Fill in missing user preferences using similar users or items
15
Two Types of CF
1. Memory Based - Uses similarity between users or items. Dataset usually kept in memory
2. Model Based - Model generated to “explain” observed ratings
16
User Based CF
(User x Item) Matrix + Similarity Function = Top-K most similar users
17
Collaborative Filtering
          Video 1   Video 2   Video 3   Video 4   Video 5
User 1       0         1         0         5         0
User 2       1         2         1         0         5
User 3       2         5         0         0         2
User 4       5         4         4         1         1
User 5       2         4         ?         ?         2
* 0 denotes not rated
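One way to hold this matrix in Ruby is a hash of hashes. This is only a sketch: the method names `sample_ratings`, `co_rated`, and `unrated` are hypothetical, and it assumes that unrated cells (the 0 and ? entries) are simply left out rather than stored.

```ruby
# Ratings from the slide, keyed by user and then by video.
# Assumption: cells marked 0 or "?" (not rated) are omitted entirely.
def sample_ratings
  {
    "User 1" => { "Video 2" => 1, "Video 4" => 5 },
    "User 2" => { "Video 1" => 1, "Video 2" => 2, "Video 3" => 1, "Video 5" => 5 },
    "User 3" => { "Video 1" => 2, "Video 2" => 5, "Video 5" => 2 },
    "User 4" => { "Video 1" => 5, "Video 2" => 4, "Video 3" => 4, "Video 4" => 1, "Video 5" => 1 },
    "User 5" => { "Video 1" => 2, "Video 2" => 4, "Video 5" => 2 }
  }
end

# Videos that both users have rated (hypothetical helper).
def co_rated(ratings, user_a, user_b)
  ratings.fetch(user_a).keys & ratings.fetch(user_b).keys
end

# Videos a user has not rated yet: the candidates for recommendation.
def unrated(ratings, user)
  all_videos = ratings.values.flat_map(&:keys).uniq
  all_videos - ratings.fetch(user).keys
end
```

Leaving unrated cells out keeps a "not rated" entry from being mistaken for a very low rating when similarities are computed.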
18
Similarity Functions
• Pearson Correlation Coefficient
• Cosine Similarity
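Cosine similarity treats each user's ratings as a vector and measures the angle between two of them. The sketch below assumes ratings are stored as `item => rating` hashes with unrated items omitted; the method name is hypothetical, and restricting the dot product to co-rated items is one common choice rather than the only one.

```ruby
# Cosine similarity between two rating hashes (item => rating).
# Assumption: only co-rated items contribute to the dot product, while each
# norm is taken over everything that user has rated.
def cosine_similarity(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.empty?

  dot    = shared.sum { |item| a[item] * b[item] }
  norm_a = Math.sqrt(a.values.sum { |r| r**2 })
  norm_b = Math.sqrt(b.values.sum { |r| r**2 })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end
```

Users with identical ratings score close to 1.0, and users with no rated items in common score 0.0.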
19
Pearson Correlation Coefficient
20
Calculating PCC
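The slides for this step were images and did not survive extraction, so what follows is a sketch of the standard Pearson formula rather than the speaker's own code. It assumes ratings are `item => rating` hashes, computes means over co-rated items only, and returns 0.0 when there are fewer than two co-rated items or no variance; the method name is hypothetical.

```ruby
# Pearson correlation between two rating hashes (item => rating).
# r = sum((x - mean_x) * (y - mean_y)) /
#     sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2))
# Assumption: means and sums run over co-rated items only.
def pearson(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.size < 2

  xs = shared.map { |item| a[item].to_f }
  ys = shared.map { |item| b[item].to_f }
  mean_x = xs.sum / xs.size
  mean_y = ys.sum / ys.size

  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x   = xs.sum { |x| (x - mean_x)**2 }
  spread_y   = ys.sum { |y| (y - mean_y)**2 }
  denominator = Math.sqrt(spread_x * spread_y)
  return 0.0 if denominator.zero?

  covariance / denominator
end
```

The result ranges from -1 (opposite tastes) to 1 (ratings rise and fall together), which makes it less sensitive than raw distance to users who rate everything high or everything low.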
21
27
Using similarity to recommend items
28
Collaborative Filtering
          Video 1   Video 2   Video 3   Video 4   Video 5
User 1       0         1         0         5         0
User 2       1         2         1         0         5
User 3       2         5         0         0         2
User 4       5         4         4         1         1
User 5       2         4         ?         ?         2
* 0 denotes not rated
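To fill in the two ? cells for User 5, one possible approach is to take the K most similar users and average their ratings, weighted by similarity. The sketch below is an illustration under assumptions, not the talk's implementation: all method names are hypothetical, similarity is Pearson over co-rated items, unrated cells are omitted from the data, and only neighbours with positive similarity are used.

```ruby
# Ratings from the slide; cells marked 0 or "?" (not rated) are omitted.
def sample_ratings
  {
    "User 1" => { "Video 2" => 1, "Video 4" => 5 },
    "User 2" => { "Video 1" => 1, "Video 2" => 2, "Video 3" => 1, "Video 5" => 5 },
    "User 3" => { "Video 1" => 2, "Video 2" => 5, "Video 5" => 2 },
    "User 4" => { "Video 1" => 5, "Video 2" => 4, "Video 3" => 4, "Video 4" => 1, "Video 5" => 1 },
    "User 5" => { "Video 1" => 2, "Video 2" => 4, "Video 5" => 2 }
  }
end

# Pearson correlation over co-rated items (0.0 when it cannot be computed).
def pearson(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.size < 2

  xs = shared.map { |item| a[item].to_f }
  ys = shared.map { |item| b[item].to_f }
  mean_x = xs.sum / xs.size
  mean_y = ys.sum / ys.size
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  denominator = Math.sqrt(xs.sum { |x| (x - mean_x)**2 } * ys.sum { |y| (y - mean_y)**2 })
  denominator.zero? ? 0.0 : covariance / denominator
end

# Top-K users most similar to `user`, as [name, similarity] pairs.
# Assumption: users with zero or negative similarity are ignored.
def nearest_neighbours(ratings, user, k)
  ratings
    .reject { |other, _| other == user }
    .map { |other, theirs| [other, pearson(ratings.fetch(user), theirs)] }
    .select { |_, sim| sim.positive? }
    .max_by(k) { |_, sim| sim }
end

# Similarity-weighted average of the neighbours' ratings for one item.
# Returns nil when no neighbour has rated the item.
def predict(ratings, user, item, k: 2)
  raters = nearest_neighbours(ratings, user, k).select { |other, _| ratings[other].key?(item) }
  weight = raters.sum { |_, sim| sim }
  return nil if weight.zero?

  raters.sum { |other, sim| sim * ratings[other][item] } / weight
end

# Unrated items for `user`, highest predicted rating first.
def recommend(ratings, user, k: 2)
  candidates = ratings.values.flat_map(&:keys).uniq - ratings.fetch(user).keys
  candidates
    .map { |item| [item, predict(ratings, user, item, k: k)] }
    .reject { |_, score| score.nil? }
    .sort_by { |_, score| -score }
end
```

With the slide's data and `k: 2`, calling `recommend(sample_ratings, "User 5")` ranks Video 3 ahead of Video 4 in this sketch. A different similarity function or neighbourhood size can change that ordering.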
29
30
Problems With CF
• Cold Start
• Data Sparsity
• Resource expensive
31
Doesn’t the video content matter for recommendations?
32
Content Based Recommendations
Classify items based on features of the item. Pick other items from
the same class to recommend.
33
Content Based Algorithms
• K-means clustering
• Random Forest
• Support Vector Machines
• ...
• Insert your favorite ML algorithm
34
Content Based Algorithms
          Type of content   Duration   Maturity Rating
Video 1   comedy            60         G
Video 2   action            120        G
Video 3   comedy            34         PG-13
Video 4   romantic          15         R
Video 5   sports            120        G
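Most of the algorithms listed expect numeric input, so features like these usually need encoding first. The sketch below shows one possible encoding and is an assumption throughout: the method names are hypothetical, the content type is one-hot encoded, duration is scaled by the longest video, and the maturity rating is mapped onto an assumed ordinal scale G < PG < PG-13 < R.

```ruby
# Feature table from the slide.
def sample_videos
  {
    "Video 1" => { type: "comedy",   duration: 60,  rating: "G" },
    "Video 2" => { type: "action",   duration: 120, rating: "G" },
    "Video 3" => { type: "comedy",   duration: 34,  rating: "PG-13" },
    "Video 4" => { type: "romantic", duration: 15,  rating: "R" },
    "Video 5" => { type: "sports",   duration: 120, rating: "G" }
  }
end

# Turn each video into a numeric vector:
#   [one-hot content type..., scaled duration, scaled maturity rating]
# Assumption: ratings outside the scale below are not handled.
def feature_vectors(videos)
  types = videos.values.map { |v| v[:type] }.uniq
  maturity_scale = ["G", "PG", "PG-13", "R"]
  max_duration = videos.values.map { |v| v[:duration] }.max.to_f

  videos.transform_values do |v|
    one_hot  = types.map { |t| t == v[:type] ? 1.0 : 0.0 }
    duration = v[:duration] / max_duration
    maturity = maturity_scale.index(v[:rating]) / (maturity_scale.size - 1).to_f
    one_hot + [duration, maturity]
  end
end
```

Scaling every feature into the 0..1 range keeps a large-valued feature such as duration from dominating distance calculations.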
35
K-means Clustering
Group items into K clusters. Assign a new item to a cluster and
pick items from that cluster
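A minimal sketch of this idea, using the standard Lloyd's iteration for k-means. All method names are hypothetical, items are assumed to be already encoded as numeric arrays of equal length, and the starting centroids default to a random sample of the points unless passed in explicitly.

```ruby
# Squared Euclidean distance between two numeric vectors.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a non-empty list of vectors.
def centroid_of(points)
  points.transpose.map { |column| column.sum / column.size.to_f }
end

# Index of the centroid closest to `point`.
def nearest_index(point, centroids)
  centroids.each_index.min_by { |i| squared_distance(point, centroids[i]) }
end

# Lloyd's algorithm: alternate between assigning points to the nearest
# centroid and moving each centroid to the mean of its members.
# Returns [assignments, centroids]; stops once assignments stop changing.
def kmeans(points, k, max_iterations: 100, centroids: points.sample(k))
  assignments = nil
  max_iterations.times do
    new_assignments = points.map { |point| nearest_index(point, centroids) }
    break if new_assignments == assignments

    assignments = new_assignments
    centroids = centroids.each_with_index.map do |old, i|
      members = points.each_index.select { |j| assignments[j] == i }.map { |j| points[j] }
      members.empty? ? old : centroid_of(members) # keep an empty cluster's centroid
    end
  end
  [assignments, centroids]
end

# Assign a new item to its nearest cluster and return the labels of the
# existing items in that cluster: the candidates to recommend.
def same_cluster(labels, assignments, centroids, new_point)
  cluster = nearest_index(new_point, centroids)
  labels.each_index.select { |i| assignments[i] == cluster }.map { |i| labels[i] }
end
```

Because the result depends on the starting centroids, running the sketch several times with different random starts and keeping the tightest clustering is a common precaution.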
36
Problems With Content Based Recommendations
• Unsupervised Learning is hard
• Training data limited or expensive
• Doesn’t take user into account
• Limited by features of content
38
Hybrid Recommendations
Combine collaborative filtering with a content based algorithm to achieve
better results than either gives alone
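One simple way to combine the two is a weighted blend of their scores. The sketch below assumes both recommenders return `item => score` hashes on the same 0..1 scale and that an item missing from one side scores 0.0 there; the method names and the `cf_weight` parameter are hypothetical.

```ruby
# Blend scores from a CF recommender and a content based recommender.
# cf_weight is the share given to the CF score; the rest goes to content.
def combine_scores(cf_scores, content_scores, cf_weight: 0.5)
  items = cf_scores.keys | content_scores.keys
  items.map do |item|
    blended = cf_weight * cf_scores.fetch(item, 0.0) +
              (1.0 - cf_weight) * content_scores.fetch(item, 0.0)
    [item, blended]
  end.to_h
end

# The n highest-scoring items, best first.
def top_recommendations(scores, n)
  scores.sort_by { |_, score| -score }.first(n).map(&:first)
end
```

Lowering `cf_weight` for users with few ratings is one way to soften the cold start problem, since the content score does not depend on rating history.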
39
Hybrid Recommendations
[Diagram: Input → Content Based Recommender; Input → CF Based Recommender; both → Combiner → Reco]
40
Hybrid Recommendations
[Diagram: Input → CF Recommender → Content Recommender → Reco]
42
Hybrid Recommendations
[Diagram labels: CF Recommender, Content Recommender, Input, Reco]
43
Evaluating Recommendation Quality
• Precision vs. Recall
• Clicks
• Click through rate
• Direct user feedback
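Click through rate is clicks divided by the number of times a recommendation was shown. A minimal sketch, with a hypothetical method name and the assumption that zero impressions yields 0.0:

```ruby
# Fraction of shown recommendations that were clicked.
def click_through_rate(clicks, impressions)
  return 0.0 if impressions.zero?

  clicks.to_f / impressions
end
```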
44
Precision vs. Recall
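Precision is the fraction of recommended items that were relevant; recall is the fraction of relevant items that were recommended. The sketch below uses a hypothetical method name and assumes "relevant" means items the user is known to have liked, for example from held-out ratings.

```ruby
# Precision and recall for one user's recommendation list.
# recommended: items the system suggested
# relevant:    items the user actually liked (assumed known from test data)
def precision_and_recall(recommended, relevant)
  hits = (recommended & relevant).size.to_f
  precision = recommended.empty? ? 0.0 : hits / recommended.size
  recall    = relevant.empty?    ? 0.0 : hits / relevant.size
  { precision: precision, recall: recall }
end
```

Recommending more items tends to raise recall and lower precision, which is why the two are usually reported together.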
45
Summary of What We’ve Learned
• Collaborative Filtering using similar users
• Content clustering using k-means
• Combining 2 algorithms to boost quality
• How to evaluate your recommender
47
Don’t Reinvent the Wheel
• Apache Mahout
• JRuby mahout gem
• SciRuby
• Recommenderlab for R
48
Resources & Further Reading
• Recommender Systems: An Introduction
• Linden, Greg, Brent Smith, and Jeremy York.
"Amazon.com recommendations: Item-to-item
collaborative filtering."
• Resnick, Paul, et al. "GroupLens: an open architecture
for collaborative filtering of netnews."
• ACM RecSys Conference Proceedings
49
We’re Hiring
http://bit.ly/str-engineering
50