Combining Predictions for Accurate Recommender Systems. M. Jahrer¹, A. Töscher¹, R. Legenstein². ¹Commendo Research & Consulting, ²Institute for Theoretical Computer Science, Graz University of Technology. KDD '10. Summarized and presented by Sang-il Song, IDS Lab., Seoul National University (2010. 11. 26.)


Page 1: Combining Predictions for Accurate  Recommender Systems

Combining Predictions for Accurate Recommender Systems

M. Jahrer¹, A. Töscher¹, R. Legenstein²

¹Commendo Research & Consulting, ²Institute for Theoretical Computer Science, Graz University of Technology

KDD '10

2010. 11. 26.

Summarized and Presented by Sang-il Song, IDS Lab., Seoul National University

Page 2: Combining Predictions for Accurate  Recommender Systems

Copyright 2010 by CEBT

Contents

The Netflix Prize

Netflix Dataset

Challenges of Recommender Systems

Review: Collaborative Filtering Techniques

Motivation

Blending Techniques

Linear Regression

Binned Linear Regression

Neural Network

Bagged Gradient Boosted Decision Tree

Kernel Ridge Regression

K-Nearest Neighbor Blending

Results

Conclusion

2

Page 3: Combining Predictions for Accurate  Recommender Systems

The Netflix Prize

Open competition for the best collaborative filtering algorithm

The objective is to improve the performance of Netflix’s own recommendation algorithm by 10%

3

Page 4: Combining Predictions for Accurate  Recommender Systems

Netflix Dataset

480,189 users

17,770 movies

100,480,507 ratings (training data)

Each rating is formed as <user, movie, date of grade, grade>

4

Page 5: Combining Predictions for Accurate  Recommender Systems

Recommendation Problem

      m1  m2  m3  m4  m5  ...  mN
u1     3   ?   2   1
u2     ?   4   ?
u3     ?   2   3   2
u4     1   ?
u5     5   5
...
uM     ?   1   2

5

Page 6: Combining Predictions for Accurate  Recommender Systems

Measure of a CF Algorithm

Root Mean Square Error (RMSE):

RMSE = sqrt( (1/N) * Σ_(u,i) (r̂_ui - r_ui)² )

where r̂_ui is the rating estimated by the algorithm, r_ui the true rating, and N is the size of the test dataset

The original Netflix Algorithm, called “Cinematch”, achieved an RMSE of about 0.95

6

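As a sketch of the metric (the toy numbers below are illustrative, not from the slides):

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error over N test ratings."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Toy check: every prediction off by exactly 1 gives RMSE = 1.0
print(rmse([3.0, 4.0, 2.0], [4.0, 3.0, 1.0]))  # prints 1.0
```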

Page 7: Combining Predictions for Accurate  Recommender Systems

Challenges of Recommender Systems

Size of Data

Places a premium on efficient algorithms

Stretches the memory limits of standard PCs

99% of data are missing

Eliminates many standard prediction methods

Certainly not missing at random

Countless factors may affect ratings

Large imbalance in training data

Number of ratings per user or movie varies by several orders of magnitude

Information to estimate individual parameters varies widely

7

Reference: R. Bell, "Lessons from the Netflix Prize"

Page 8: Combining Predictions for Accurate  Recommender Systems

Collaborative Filtering Techniques

Memory-based Approach

KNN user-user

KNN item-item

Model-based Approach

Singular Value Decomposition (SVD)

Asymmetric Factor Model (AFM)

Restricted Boltzmann Machine (RBM)

Global Effect (GE)

Combination: Residual Training

8

Page 9: Combining Predictions for Accurate  Recommender Systems

KNN user-user

Traditional approach to collaborative filtering

Methods

Find the k users most similar to user u

Aggregate their ratings for item i

9
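The two steps above can be sketched as follows; cosine similarity over co-rated items is one common choice, and the deck does not fix the exact similarity measure, so treat this as a generic sketch rather than the paper's variant:

```python
import math

def predict_user_knn(ratings, u, i, k=2):
    """Predict user u's rating of item i from the k most similar users.

    `ratings` maps user -> {item: rating}; similarity is cosine over
    co-rated items (an illustrative choice)."""
    def cos_sim(a, b):
        common = set(ratings[a]) & set(ratings[b])
        num = sum(ratings[a][j] * ratings[b][j] for j in common)
        den = (math.sqrt(sum(ratings[a][j] ** 2 for j in common)) *
               math.sqrt(sum(ratings[b][j] ** 2 for j in common)))
        return num / den if den else 0.0

    # Candidate neighbours: other users who actually rated item i
    sims = sorted(((cos_sim(u, v), ratings[v][i])
                   for v in ratings if v != u and i in ratings[v]),
                  reverse=True)[:k]
    total = sum(s for s, _ in sims)
    # Similarity-weighted average of the neighbours' ratings
    return sum(s * r for s, r in sims) / total if total else None

ratings_db = {"u1": {"m1": 3, "m2": 2},
              "u2": {"m1": 3, "m2": 2, "m3": 4},
              "u3": {"m1": 1, "m3": 2}}
print(predict_user_knn(ratings_db, "u1", "m3"))  # ≈ 3.0 (neighbours rated m3 as 4 and 2)
```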

Page 10: Combining Predictions for Accurate  Recommender Systems

KNN user-user

10

Page 11: Combining Predictions for Accurate  Recommender Systems

KNN item-item

Symmetric Approach to KNN user-user

Just flip user and item sides

Methods

Find the k items most similar to item i

Aggregate their ratings for user u

11

Page 12: Combining Predictions for Accurate  Recommender Systems

KNN item-item

12

Page 13: Combining Predictions for Accurate  Recommender Systems

SVD (Matrix Factorization)

Singular Value Decomposition

Dimension Reduction Technique by Matrix Factorization

Capturing Latent Semantics

13

Page 14: Combining Predictions for Accurate  Recommender Systems

SVD Example

14

R =

      i1   i2   i3   i4   i5
u1     2    0    3    5    0
u2     1    2    0    1    4
u3     3    0    4    4    0
u4     2    0    1    5    0
u5     0    5    0    0    5

R is factorized into the product U x Σ x Vᵀ:

U (user-to-factor):

      f1    f2    f3
u1   .59  -.11  -.01
u2   .18   .51  -.18
u3   .60  -.11   .65
u4   .50  -.08  -.73
u5   .09   .85   .12

Σ (singular values):

      f1   f2   f3
f1    10    0    0
f2     0  8.2    0
f3     0    0  2.2

Vᵀ (factor-to-item):

       i1    i2    i3    i4    i5
f1    .40   .08   .45   .78   .11
f2   -.02   .64  -.10  -.10   .76
f3    .13   .11   .81  -.55  -.05
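A minimal numpy sketch of the same factorization; note that `np.linalg.svd` may flip the signs of factor columns relative to the numbers on this slide:

```python
import numpy as np

R = np.array([[2, 0, 3, 5, 0],
              [1, 2, 0, 1, 4],
              [3, 0, 4, 4, 0],
              [2, 0, 1, 5, 0],
              [0, 5, 0, 0, 5]], dtype=float)

# Full SVD, then keep only the top k = 3 factors (dimension reduction)
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 3
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # best rank-3 approximation of R
```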

Page 15: Combining Predictions for Accurate  Recommender Systems

Asymmetric Factor Model (AFM)

An Extension of SVD

Each item is represented by a feature vector (as in SVD)

Each user is represented by the items he or she rated (unlike SVD)

15

Item feature vectors:

Item 1: (Feature 1, Feature 2, Feature 3) = (0.8, 0.1, 1.2)
Item 2: (Feature 1, Feature 2, Feature 3) = (0.2, 1.5, 0)
Item 3: (Feature 1, Feature 2, Feature 3) = (0.2, 0.2, 0)

User feature vector (derived from the rated items):

User: (Feature 1, Feature 2, Feature 3) = (0.4, 0.3, 1.2)

Page 16: Combining Predictions for Accurate  Recommender Systems

Restricted Boltzmann Machine (RBM)

Neural Network with one input layer and one hidden layer

Handles the sparsity of the data very well

16

Page 17: Combining Predictions for Accurate  Recommender Systems

Global Effects

Motivated by data normalization

Based on user and item features

support (number of votes)

mean rating

mean standard deviation

Effective when applied to residuals of other algorithms

17

Page 18: Combining Predictions for Accurate  Recommender Systems

Residual Training

A popular method to combine CF algorithms

Several models are trained sequentially, each fitted to the residual error of the previous ones

18

[Figure: Model 1, Model 2, Model 3 chained, each model trained on the residual of the previous one]
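A sketch of sequential residual training, assuming each model exposes `fit`/`predict`; the `MeanModel` here is a toy stand-in, not one of the paper's CF algorithms:

```python
class MeanModel:
    """Toy model: always predicts the mean of its training targets."""
    def fit(self, X, y):
        self.mean = sum(y) / len(y)
    def predict(self, X):
        return [self.mean] * len(X)

def residual_train(models, X, y):
    """Train each model on the residual left by all previous models."""
    residual = list(y)
    for m in models:
        m.fit(X, residual)
        residual = [r - p for r, p in zip(residual, m.predict(X))]
    # The combined prediction is the sum of all stage predictions
    def predict(Xq):
        parts = [m.predict(Xq) for m in models]
        return [sum(col) for col in zip(*parts)]
    return predict

combined = residual_train([MeanModel(), MeanModel()], [[0], [1]], [2, 4])
print(combined([[0]]))  # [3.0]: stage 1 learns the mean, stage 2 learns 0
```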

Page 19: Combining Predictions for Accurate  Recommender Systems

Motivation

Combining different kinds of collaborative filtering algorithms leads to significant performance improvements over the individual algorithms

19

Page 20: Combining Predictions for Accurate  Recommender Systems


“Thanks to Paul Harrison's collaboration, a simple mix of our solutions improved our result from 6.31 to 6.75”

Rookies

Page 21: Combining Predictions for Accurate  Recommender Systems


“My approach is to combine the results of many methods (also two-way interactions between them) using linear regression on the test set. The best method in my ensemble is regularized SVD with biases, post processed with kernel ridge regression”

Arek Paterek

http://rainbow.mimuw.edu.pl/~ap/ap_kdd.pdf

Page 22: Combining Predictions for Accurate  Recommender Systems


“When the predictions of multiple RBM models and multiple SVD models are linearly combined, we achieve an error rate that is well over 6% better than the score of Netflix’s own system.”

U of Toronto

http://www.cs.toronto.edu/~rsalakhu/papers/rbmcf.pdf

Page 23: Combining Predictions for Accurate  Recommender Systems


Gravity

home.mit.bme.hu/~gtakacs/download/gravity.pdf

Page 24: Combining Predictions for Accurate  Recommender Systems


“Our common team blends the result of team Gravity and team Dinosaur Planet.”

Might have guessed from the name…

When Gravity and Dinosaurs Unite

Page 25: Combining Predictions for Accurate  Recommender Systems


And, yes, the top team, which is from AT&T…

“Our final solution (RMSE=0.8712) consists of blending 107 individual results. “

BellKor / KorBell

Page 26: Combining Predictions for Accurate  Recommender Systems

Blending Problem

26

         Alg 1  Alg 2  Alg 3  Alg 4  Rating
Data 1     3     3.3    3.2    2.5      3
Data 2    2.2    2.4     3     1.9      2
Data 3    2.8    3.2     3     2.9      ?
Data 4     1     1.1    1.3     2       3
Data 5    0.9    1.1    1.2     1       ?

Page 27: Combining Predictions for Accurate  Recommender Systems

Blending Methods

Linear Regression (baseline)

Binned Linear Regression

Neural Network

Bagged Gradient Boosted Decision Tree

Kernel Ridge Regression

K-Nearest Neighbor Blending

27

Page 28: Combining Predictions for Accurate  Recommender Systems

Linear Regression

Baseline

Assume a quadratic error function

Find the optimal linear combination weights w

By solving the least squares problem

The weights w can be calculated in closed form with ridge regression

28
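The closed-form ridge solution can be sketched as follows; λ and the toy data are illustrative, not the paper's values:

```python
import numpy as np

def blend_weights(P, y, lam=1e-3):
    """Solve the regularized least squares problem
    w = (PᵀP + λI)⁻¹ Pᵀ y, where column j of P holds the
    predictions of algorithm j on the blend set."""
    d = P.shape[1]
    return np.linalg.solve(P.T @ P + lam * np.eye(d), P.T @ y)

# Toy blend: algorithm 0 is already perfect, so w should be close to [1, 0]
P = np.array([[3.0, 3.3], [2.0, 2.4], [3.0, 1.1], [1.0, 2.0]])
y = P[:, 0].copy()
w = blend_weights(P, y, lam=1e-8)
```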

Page 29: Combining Predictions for Accurate  Recommender Systems

Binned Linear Regression

A Simple Extension of Linear Regression

The training dataset can be divided into B disjoint subsets

The training dataset may be very large

Each subset b is used to learn its own weight vector w_b

Training set can be split by using following criteria:

Support (number of votes)

Time

Frequency (number of ratings from a user at day t).

29
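A sketch of per-bin weights using support as the split criterion; the bin edges and data are made up for illustration:

```python
import numpy as np

def binned_blend_weights(P, y, support, edges, lam=1e-3):
    """Learn one ridge weight vector per bin of the split criterion
    (here: support, i.e. the number of votes per sample)."""
    bins = np.digitize(support, edges)   # assign each sample to a bin
    weights = {}
    for b in np.unique(bins):
        mask = bins == b
        Pb, yb = P[mask], y[mask]
        d = Pb.shape[1]
        weights[b] = np.linalg.solve(Pb.T @ Pb + lam * np.eye(d), Pb.T @ yb)
    return weights

P = np.array([[3.0, 3.2], [2.0, 2.1], [1.0, 0.9],
              [4.0, 4.2], [5.0, 4.8], [3.0, 3.1]])
y = np.array([3.1, 2.0, 1.0, 4.1, 5.0, 3.0])
support = np.array([2, 3, 5, 200, 300, 500])       # votes per sample
w_per_bin = binned_blend_weights(P, y, support, edges=[100])
```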

Page 30: Combining Predictions for Accurate  Recommender Systems

Neural Network (NN)

Efficient for huge data sets

30

[Figure: feed-forward network mapping the inputs Alg 1 to Alg 4 to a single Rating output]
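A minimal from-scratch sketch of such a blending network; the paper's layer sizes, learning rate, and training schedule are not reproduced here:

```python
import math, random

def mlp_blend_train(P, y, hidden=4, lr=0.05, epochs=1000, seed=0):
    """One-hidden-layer tanh network trained by plain SGD on squared error.
    A sketch only; hyperparameters are illustrative."""
    rng = random.Random(seed)
    d = len(P[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * hi for w, hi in zip(W2, h)) + b2

    for _ in range(epochs):
        for x, t in zip(P, y):
            h, out = forward(x)
            err = out - t                             # d(loss)/d(out)
            for j in range(hidden):
                grad = err * W2[j] * (1 - h[j] ** 2)  # backprop through tanh
                W2[j] -= lr * err * h[j]
                b1[j] -= lr * grad
                for i in range(d):
                    W1[j][i] -= lr * grad * x[i]
            b2 -= lr * err
    return lambda x: forward(x)[1]

# Toy blend: two predictors, learn to reproduce the rating
blend = mlp_blend_train([[0.0, 0.2], [1.0, 0.9], [2.0, 2.1]], [0.0, 1.0, 2.0])
```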

Page 31: Combining Predictions for Accurate  Recommender Systems

Copyright 2010 by CEBT

Bagged Gradient Boosted Decision Tree Bagged Gradient Boosted Decision Tree (BGBDT)(BGBDT)

Single Decision Tree

Its discretized output limits its ability to model smooth functions

The number of possible outputs corresponds to the number of leaves

A single tree is trained recursively by always splitting the leaf that provides the output value for the largest number of training samples

Bagging

Train N_bag copies of the model on slightly different training sets

(Stochastic Gradient) Boosting

Each model learns only a fraction of the desired function Ω

31
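The three ingredients (single tree, boosting, bagging) can be sketched with decision stumps standing in for full trees; the tree-growing rule, tree depth, and data are simplified relative to the paper:

```python
import random

def fit_stump(X, y):
    """Fit the best single-split tree (two leaves) by squared error."""
    best = None
    n = len(y)
    for f in range(len(X[0])):
        for t in sorted(set(x[f] for x in X)):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left) +
                   sum((v - rm) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:                       # degenerate sample: constant leaf
        m = sum(y) / n
        return lambda x: m
    _, f, t, lm, rm = best
    return lambda x: lm if x[f] <= t else rm

def boosted_trees(X, y, rounds=20, shrinkage=0.1):
    """Gradient boosting on squared loss: each stump fits the residual
    left by the ensemble so far."""
    preds = [0.0] * len(y)
    stumps = []
    for _ in range(rounds):
        residual = [t - p for t, p in zip(y, preds)]
        s = fit_stump(X, residual)
        stumps.append(s)
        preds = [p + shrinkage * s(x) for p, x in zip(preds, X)]
    return lambda x: sum(shrinkage * s(x) for s in stumps)

def bagged_boosted_trees(X, y, n_bag=4, seed=0, **kw):
    """Bagging: average n_bag boosted models, each trained on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_bag):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        models.append(boosted_trees([X[i] for i in idx],
                                    [y[i] for i in idx], **kw))
    return lambda x: sum(m(x) for m in models) / n_bag

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 0.0, 1.0, 1.0]
boosted = boosted_trees(X, y, rounds=50, shrinkage=0.3)
bagged = bagged_boosted_trees(X, y, n_bag=3, rounds=20, shrinkage=0.3)
```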

Page 32: Combining Predictions for Accurate  Recommender Systems

BGBDT

32

Page 33: Combining Predictions for Accurate  Recommender Systems

Kernel Ridge Regression Blending (KRR)

Kernel Ridge Regression

Regularized least square method for classification and regression

Similar to an SVM

– But it also puts emphasis on points that are not close to the decision boundary

Suitable for a small number of features and a limited number of training samples, since cost grows quickly with the number of samples n

Training complexity: O(n³)

Space requirements: O(n²)

33
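A sketch with an RBF kernel; the kernel and its parameters are illustrative, not the paper's choices:

```python
import numpy as np

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Kernel ridge regression: solve (K + λI) α = y, then
    predict f(x) = Σ_i α_i k(x, x_i) with an RBF kernel k."""
    X = np.asarray(X, float)
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), np.asarray(y, float))
    return lambda Xq: rbf(np.asarray(Xq, float), X) @ alpha

# With a tiny λ the fit nearly interpolates the training points
krr_predict = krr_fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0], lam=1e-6)
```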

Page 34: Combining Predictions for Accurate  Recommender Systems

K-Nearest Neighbor Blending (KNN)

Find the k training samples <user, item> most similar to the query sample

Aggregate the target value

34

           Alg 1  Alg 2  Alg 3  Alg 4  Rating
Sample 1     4     3.2    3.2    3.6      ?
Sample 2     3     2.7     2     2.9      3
Sample 3     1     1.2    0.8    0.9     1.5
Sample 4     4      3     3.3    3.3     3.3
Sample 5     2      2      2      2       2
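Using samples 2 to 5 of the table as training data and sample 1 as the query, the scheme can be sketched as follows (plain Euclidean distance and an unweighted mean are illustrative choices):

```python
import math

def knn_blend(train_P, train_y, query_p, k=2):
    """Predict the target of a query sample as the mean rating of its
    k nearest training samples in the space of algorithm outputs."""
    nearest = sorted((math.dist(p, query_p), t)
                     for p, t in zip(train_P, train_y))[:k]
    return sum(t for _, t in nearest) / len(nearest)

train_P = [[3, 2.7, 2, 2.9], [1, 1.2, 0.8, 0.9],
           [4, 3, 3.3, 3.3], [2, 2, 2, 2]]
train_y = [3, 1.5, 3.3, 2]
print(knn_blend(train_P, train_y, [4, 3.2, 3.2, 3.6]))  # ≈ 3.15
```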

Page 35: Combining Predictions for Accurate  Recommender Systems

Experimental Setup

18 CF Algorithms

4 versions of AFM

4 versions of GE

4 versions of KNN-item

2 versions of RBM

4 versions of SVD

1,400,000 samples

Run on a 3.8 GHz CPU with 12 GB main memory

35

Page 36: Combining Predictions for Accurate  Recommender Systems

Results

36

Page 37: Combining Predictions for Accurate  Recommender Systems

Conclusions

Combinations of collaborative filtering algorithms outperform single collaborative filtering algorithms

37

Page 38: Combining Predictions for Accurate  Recommender Systems

Thank you

38