Matrix Factorization Technique for Recommender Systems


Oluwashina Aladejubelo

Université Joseph Fourier, Grenoble, France

June 6, 2015

About Me

Bachelor of Science, Ambrose Alli University, Nigeria (2004-2008)

IT Business Analyst, Virgin Nigeria Airlines (2009-2011)

Team Lead/Software Architect, Speckless Innovations Limited (2011-2014)

Master of Informatics (M2 MOSIG), Université Joseph Fourier, Grenoble (2014-2015)

Master Thesis on "Distributed Large-Scale Learning" with Pr. Massih-Reza Amini.

Overview

1 Introduction
   Recommender Systems
   Content Filtering Approach
   Collaborative Filtering Approach
   Content vs Collaborative Filtering

2 Matrix Factorization Methods
   Matrix Factorization Model (MFM)
   Stochastic Gradient Descent
   Alternating Least Squares
   Adding Biases
   Additional Input Sources
   Temporal Dynamics
   Varying Confidence Level

3 Netflix Prize Competition

4 Conclusion

Recommender Systems

Recommender systems analyze patterns of user interest in products to provide personalized recommendations.

They seek to predict the rating or preference that a user would give to an item.

Such systems are especially useful for entertainment products such as movies, music, and TV shows.

Many customers will view the same movie, and each customer is likely to view numerous different movies.

Huge volumes of data arise from customer feedback, which can be analyzed to provide recommendations.

Content Filtering Approach

creates a profile for each user or product to characterize its nature

programs then associate users with matching products

it requires gathering external information that may not be available

Collaborative Filtering Approach

depends on past user behaviour, e.g. previous transactions or product ratings

does not rely on the creation of explicit profiles

the two primary areas of collaborative filtering are neighborhood methods and latent factor models

neighborhood methods are based on computing the relationships between items or between users

latent factor models try to explain the ratings by characterizing both items and users on, say, 20 to 100 factors inferred from the rating patterns

Content vs Collaborative Filtering

Collaborative filtering addresses data aspects that are difficult to profile.

it is generally more accurate than content filtering

it suffers from the cold start problem (new product / new user), in which case content filtering is better

Matrix Factorization Model (MFM)

some of the most successful realizations of latent factor models are based on matrix factorization

it characterizes both items and users by vectors of factors inferred from item rating patterns

high correspondence between item and user factors leads to a recommendation

MFM maps both users and items to a joint latent factor space of dimensionality $f$

the user-item interactions are modeled as inner products in that space

each item $i$ is associated with a vector $q_i \in \mathbb{R}^f$

each user $u$ is associated with a vector $p_u \in \mathbb{R}^f$

the approximate rating of item $i$ by user $u$ is given by

$$\hat{r}_{ui} = q_i^T p_u \tag{1}$$

carelessly fitting only the relatively few known entries is highly prone to overfitting

instead, the observed ratings can be modeled directly, with regularization, as follows

$$\min_{q^*, p^*} \sum_{(u,i) \in \kappa} (r_{ui} - q_i^T p_u)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 \right) \tag{2}$$

$\kappa$ is the set of $(u, i)$ pairs for which $r_{ui}$ is known, and the constant $\lambda$ controls the extent of regularization
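As a rough illustration of equations (1) and (2), the sketch below evaluates the predicted rating and the regularized objective on a toy example. The data layout (a dictionary keyed by `(u, i)`), the function names, and all numeric values are assumptions made only for this example.

```python
import numpy as np

def predict(q_i, p_u):
    """Equation (1): predicted rating as the inner product of item and user factors."""
    return float(q_i @ p_u)

def regularized_loss(ratings, Q, P, lam):
    """Equation (2): squared error over the known pairs plus an L2 penalty.

    ratings: dict mapping (u, i) -> r_ui for the known set kappa (assumed layout)
    Q: item factors, shape (n_items, f); P: user factors, shape (n_users, f)
    """
    total = 0.0
    for (u, i), r_ui in ratings.items():
        err = r_ui - predict(Q[i], P[u])
        total += err ** 2 + lam * (Q[i] @ Q[i] + P[u] @ P[u])
    return float(total)

# Toy data: 2 users, 2 items, f = 2 (illustrative values only).
P = np.array([[1.0, 0.0], [0.0, 2.0]])
Q = np.array([[3.0, 1.0], [1.0, 1.0]])
ratings = {(0, 0): 4.0, (1, 1): 2.0}

r_hat_00 = predict(Q[0], P[0])
loss = regularized_loss(ratings, Q, P, lam=0.1)
print(r_hat_00, loss)
```

Only the known entries in `ratings` contribute to the loss; unknown entries of the user-item matrix are never touched.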

Stochastic Gradient Descent (SGD) - Simon Funk; 2006

SGD can be used to solve the minimization in equation (2)

for each training case, the system predicts $r_{ui}$ and computes the associated prediction error

$$e_{ui} = r_{ui} - q_i^T p_u$$

it then modifies the parameters by a magnitude proportional to the learning rate $\gamma$ in the opposite direction of the gradient, yielding

$$q_i \leftarrow q_i + \gamma \, (e_{ui} \, p_u - \lambda \, q_i)$$

$$p_u \leftarrow p_u + \gamma \, (e_{ui} \, q_i - \lambda \, p_u)$$

combines ease of implementation with a relatively fast running time
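The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not Funk's original implementation: the triple-based data layout, the default hyperparameters, the small random initialization, and the helper names are all assumptions. Here `gamma` is the learning rate and `lam` is the regularization constant $\lambda$ of equation (2).

```python
import numpy as np

def sgd_factorize(ratings, n_users, n_items, f=2, gamma=0.02, lam=0.02,
                  epochs=300, seed=0):
    """Fit user factors P and item factors Q by SGD on the regularized squared error.

    ratings: list of (u, i, r_ui) triples for the known entries (assumed layout).
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, f))    # small random start (assumption)
    Q = 0.1 * rng.standard_normal((n_items, f))
    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):  # visit training cases in random order
            u, i, r_ui = ratings[idx]
            e_ui = r_ui - Q[i] @ P[u]              # prediction error for this case
            q_old = Q[i].copy()                    # keep pre-update q_i for the p_u step
            Q[i] += gamma * (e_ui * P[u] - lam * Q[i])
            P[u] += gamma * (e_ui * q_old - lam * P[u])
    return P, Q

def train_rmse(ratings, P, Q):
    """Root-mean-square error on the training triples."""
    errors = [r_ui - Q[i] @ P[u] for u, i, r_ui in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))

# Toy data: 3 users, 3 items, 6 known ratings (illustrative values only).
data = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P_init, Q_init = sgd_factorize(data, 3, 3, epochs=0)   # factors before any update
P_fit, Q_fit = sgd_factorize(data, 3, 3)
print(train_rmse(data, P_init, Q_init), train_rmse(data, P_fit, Q_fit))
```

The training error after fitting should be lower than at initialization; in practice the learning rate and number of epochs would be tuned on held-out data.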

Alternating Least Squares

because both $q_i$ and $p_u$ are unknown, the objective in equation (2) is not convex

if we fix one of the unknowns, the problem becomes quadratic and can be solved optimally

when all $p_u$ are fixed, the system recomputes the $q_i$ by solving a least-squares problem, and vice versa

each step decreases the objective in equation (2) until convergence

massively parallelizable, since each $q_i$ (or each $p_u$) is computed independently of the others
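The alternation can be sketched as follows, assuming a dense ratings matrix `R` with a boolean `mask` marking the known entries (both hypothetical names and layout). Because equation (2) applies the penalty once per known pair, the ridge term for a given row is scaled by that row's number of known ratings; other formulations use a plain $\lambda$ instead.

```python
import numpy as np

def als_step(R, mask, fixed, lam):
    """Solve for the row-side factors while the column-side factors stay fixed.

    R: (n_rows, n_cols) ratings; mask: True where a rating is known;
    fixed: (n_cols, f) factors of the fixed side. Returns (n_rows, f).
    """
    n_rows, f = R.shape[0], fixed.shape[1]
    out = np.zeros((n_rows, f))
    for r in range(n_rows):                  # rows are independent, hence parallelizable
        known = mask[r]
        n = int(known.sum())
        if n == 0:
            continue                         # no ratings: leave the factors at zero
        F = fixed[known]                     # factors of the rated columns, shape (n, f)
        A = F.T @ F + lam * n * np.eye(f)    # regularized normal equations
        b = F.T @ R[r, known]
        out[r] = np.linalg.solve(A, b)
    return out

def objective(R, mask, P, Q, lam):
    """Value of equation (2) on the known entries."""
    us, its = np.nonzero(mask)
    err = R[us, its] - np.sum(P[us] * Q[its], axis=1)
    pen = np.sum(P[us] ** 2, axis=1) + np.sum(Q[its] ** 2, axis=1)
    return float(np.sum(err ** 2) + lam * np.sum(pen))

def als(R, mask, f=2, lam=0.1, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((R.shape[0], f))
    Q = rng.standard_normal((R.shape[1], f))
    history = []
    for _ in range(iters):
        P = als_step(R, mask, Q, lam)        # fix item factors, solve for users
        Q = als_step(R.T, mask.T, P, lam)    # fix user factors, solve for items
        history.append(objective(R, mask, P, Q, lam))
    return P, Q, history

# Toy 3x3 matrix; zeros mark unknown ratings (illustrative values only).
R = np.array([[5.0, 3.0, 0.0], [4.0, 0.0, 1.0], [0.0, 2.0, 5.0]])
mask = R > 0
P, Q, history = als(R, mask)
print(history)
```

Each half-step minimizes the objective exactly over one side, so the recorded objective values should never increase from one sweep to the next.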

Adding Biases

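rating values are also affected by biases associated with users or items, independent of any interaction

a first-order approximation of the bias involved in rating $r_{ui}$ is

$$b_{ui} = \mu + b_i + b_u \tag{3}$$

$\mu$ denotes the overall average rating; $b_u$ and $b_i$ are the observed deviations of user $u$ and item $i$, respectively, from the average

therefore,

$$\hat{r}_{ui} = \mu + b_i + b_u + q_i^T p_u \tag{4}$$

and equation (2) becomes

$$\min_{q^*, p^*, b^*} \sum_{(u,i) \in \kappa} (r_{ui} - \mu - b_u - b_i - q_i^T p_u)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 + b_u^2 + b_i^2 \right) \tag{5}$$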
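A small worked example may help to read equations (3) and (4). All numbers below are hypothetical: an overall average of 3.7 stars, an item rated 0.5 stars above average, and a critical user who rates 0.3 stars below average.

```python
import numpy as np

def baseline(mu, b_i, b_u):
    """Equation (3): bias-only estimate of the rating."""
    return mu + b_i + b_u

def predict_biased(mu, b_i, b_u, q_i, p_u):
    """Equation (4): baseline plus the user-item interaction term."""
    return baseline(mu, b_i, b_u) + float(np.dot(q_i, p_u))

# Hypothetical values chosen only for illustration.
mu, b_i, b_u = 3.7, 0.5, -0.3
q_i = np.array([0.4, -0.1])
p_u = np.array([0.5, 1.0])

b_ui = baseline(mu, b_i, b_u)                   # 3.7 + 0.5 - 0.3, about 3.9
r_hat = predict_biased(mu, b_i, b_u, q_i, p_u)  # about 3.9 + 0.1 = 4.0
print(b_ui, r_hat)
```

The interaction term only has to explain what remains once the average and the two biases are accounted for.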

Additional Input Sources

the cold start problem can arise when a user supplies very few ratings, making it difficult to draw conclusions about their taste

behavioural information such as purchase and browsing history can be used as implicit feedback

let $N(u)$ denote the set of items for which user $u$ expressed an implicit preference

a new set of item factors is introduced, where item $i$ is associated with $x_i \in \mathbb{R}^f$

a user who showed a preference for items in $N(u)$ is characterized by the vector

$$\sum_{i \in N(u)} x_i$$

normalizing the sum, we have

$$|N(u)|^{-0.5} \sum_{i \in N(u)} x_i$$

another information source is known user attributes, e.g. demographics such as gender, age, income level and so on

let $A(u)$ denote the set of attributes of a user $u$

a distinct factor vector $y_a \in \mathbb{R}^f$ corresponds to each attribute, so that a user is described through the set of user-associated attributes:

$$\sum_{a \in A(u)} y_a$$

the matrix factorization model should integrate all signal sources, with an enhanced user representation:

$$\hat{r}_{ui} = \mu + b_i + b_u + q_i^T \left[ p_u + |N(u)|^{-0.5} \sum_{i \in N(u)} x_i + \sum_{a \in A(u)} y_a \right] \tag{6}$$

items can get a similar treatment
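To make the enhanced user representation concrete, here is a minimal sketch of equation (6). The array layout (one row of `X` per item, one row of `Y` per attribute), the function names, and the toy values are assumptions for illustration only.

```python
import numpy as np

def user_representation(p_u, X, N_u, Y, A_u):
    """Bracketed term of equation (6): explicit factors plus implicit and attribute terms.

    X: implicit item factors x_i, one row per item; N_u: list of item indices in N(u)
    Y: attribute factors y_a, one row per attribute; A_u: list of attribute indices in A(u)
    """
    rep = p_u.copy()
    if N_u:
        rep += len(N_u) ** -0.5 * X[list(N_u)].sum(axis=0)   # normalized implicit feedback
    if A_u:
        rep += Y[list(A_u)].sum(axis=0)                      # user attributes
    return rep

def predict_enhanced(mu, b_i, b_u, q_i, p_u, X, N_u, Y, A_u):
    """Equation (6): biases plus inner product with the enhanced user vector."""
    return mu + b_i + b_u + float(q_i @ user_representation(p_u, X, N_u, Y, A_u))

# Toy values with f = 2 (illustrative only).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
Y = np.array([[0.5, 0.5], [1.0, 0.0]])
p_u = np.array([0.5, -0.5])
q_i = np.array([0.1, 0.2])

rep = user_representation(p_u, X, [0, 1, 2, 3], Y, [0])
r_hat = predict_enhanced(3.5, 0.2, -0.2, q_i, p_u, X, [0, 1, 2, 3], Y, [0])
print(rep, r_hat)
```

With four implicit items the normalizer is $4^{-0.5} = 0.5$, so the implicit term contributes half of the summed item factors; a user with no explicit ratings still gets a non-trivial representation from the last two terms.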

Temporal Dynamics

in reality, customers' inclinations evolve, leading them to redefine their taste

it is therefore important to account for these temporal effects, which reflect the dynamic, time-drifting nature of user-item interactions

the following terms vary over time: item biases, $b_i(t)$; user biases, $b_u(t)$; and user preferences, $p_u(t)$; the item factors $q_i$ are kept static

equation (4) therefore becomes

$$\hat{r}_{ui}(t) = \mu + b_i(t) + b_u(t) + q_i^T p_u(t) \tag{7}$$

Varying Confidence Level

other factors, such as massive advertising, can influence the observed ratings without reflecting long-term characteristics

hence the need for a weighting scheme, or confidence level, for each observation

confidence can stem from available numerical values that describe the frequency of actions, e.g. how much time the user watched a show

in matrix factorization, less weight is given to less meaningful observations

if the confidence in observing $r_{ui}$ is denoted as $c_{ui}$, then the model enhances equation (5) to account for confidence as follows

$$\min_{q^*, p^*, b^*} \sum_{(u,i) \in \kappa} c_{ui} (r_{ui} - \mu - b_u - b_i - q_i^T p_u)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 + b_u^2 + b_i^2 \right) \tag{8}$$
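The effect of the confidence weights can be seen by evaluating the objective of equation (8) on a toy example. The tuple layout `(u, i, r_ui, c_ui)`, the function name, and all values are assumptions; how $c_{ui}$ is derived from raw signals such as watch time is left open here.

```python
import numpy as np

def weighted_loss(obs, mu, b_user, b_item, P, Q, lam):
    """Equation (8): confidence-weighted squared error plus L2 penalty.

    obs: list of (u, i, r_ui, c_ui) tuples for the known entries (assumed layout).
    """
    total = 0.0
    for u, i, r_ui, c_ui in obs:
        err = r_ui - mu - b_user[u] - b_item[i] - Q[i] @ P[u]
        reg = Q[i] @ Q[i] + P[u] @ P[u] + b_user[u] ** 2 + b_item[i] ** 2
        total += c_ui * err ** 2 + lam * reg
    return float(total)

# One user, two items, zero biases (illustrative values only).
mu = 3.0
b_user, b_item = np.zeros(1), np.zeros(2)
P = np.array([[1.0, 0.0]])
Q = np.array([[1.0, 0.0], [0.0, 1.0]])

full = [(0, 0, 5.0, 1.0), (0, 1, 1.0, 1.0)]    # both observations fully trusted
down = [(0, 0, 5.0, 1.0), (0, 1, 1.0, 0.25)]   # second observation down-weighted

loss_full = weighted_loss(full, mu, b_user, b_item, P, Q, lam=0.1)
loss_down = weighted_loss(down, mu, b_user, b_item, P, Q, lam=0.1)
print(loss_full, loss_down)
```

Lowering $c_{ui}$ on the second observation shrinks its share of the error term, so a poor fit on that entry pulls the parameters less during training.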

Netflix Prize Competition

in 2006, Netflix announced a contest to improve the state of its recommender system

the training data comprised more than 100 million ratings from about 500,000 anonymous customers on more than 17,000 movies

each movie was rated on a scale of 1 to 5 stars

the test data comprised about 3 million ratings

the target was a root-mean-square error (RMSE) at least 10 percent better than that of Netflix's own algorithm
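For reference, the evaluation metric and the 10 percent target can be sketched as below. The ratings and predictions are made-up toy values, not Netflix data, and the function names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences of ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate's RMSE is lower than the baseline's."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Toy test set on the 1-5 star scale (illustrative values only).
actual = [5.0, 3.0, 4.0, 1.0]
baseline_pred = [4.0, 4.0, 4.0, 2.0]
candidate_pred = [4.5, 3.5, 4.0, 1.5]

improvement = relative_improvement(rmse(actual, baseline_pred),
                                   rmse(actual, candidate_pred))
meets_target = improvement >= 0.10   # the contest threshold of 10 percent
print(improvement, meets_target)
```

Because RMSE squares the errors before averaging, large misses are penalized more heavily than small ones.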


Conclusion

matrix factorization techniques have become a dominant methodology within collaborative filtering recommenders

experience with the Netflix competition has shown that they deliver accuracy superior to classical nearest-neighbor techniques

they integrate many crucial aspects of the data, such as multiple forms of feedback, temporal dynamics, and confidence levels

Reference

Y. Koren, R. Bell and C. Volinsky: Matrix Factorization Techniques for Recommender Systems, AT&T Labs-Research, 2009

THANK YOU!
