Collaborative Filtering: Latent Variable Model
LIU Tengfei
Computer Science and Engineering Department
April 13, 2011
Outline
  Overview of CF approaches
  Model-based approach: latent variable model
    Probabilistic latent semantic analysis (PLSA)
    Other latent variable models
  Summary
Overview of CF Approaches
CF categories
  Memory-based CF
    Conducts some form of nearest-neighbor search in order to predict the rating for a particular user-item pair.
  Model-based CF
    Trains a compact model that explains the given data, so that ratings can be predicted from the model.
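To make the memory-based category concrete, here is a minimal sketch of a user-based nearest-neighbor predictor. The function name `predict_user_knn`, the use of cosine similarity over co-rated items, and the item-mean fallback are all assumptions chosen for illustration; the slides do not prescribe a particular similarity measure.

```python
import numpy as np

def predict_user_knn(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users who rated the item.

    ratings: 2-D float array (users x items) with np.nan for missing ratings.
    Similarity is cosine similarity over co-rated items (an illustrative choice).
    """
    rated = ~np.isnan(ratings)
    sims = []
    for v in range(ratings.shape[0]):
        if v == user or not rated[v, item]:
            continue
        common = rated[user] & rated[v]
        common[item] = False  # never use the target item when comparing users
        if not common.any():
            continue
        a, b = ratings[user, common], ratings[v, common]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        if denom > 0 and a @ b > 0:
            sims.append((float(a @ b / denom), v))
    if not sims:
        # No usable neighbor: fall back to the mean rating of the item.
        return float(np.nanmean(ratings[:, item]))
    top = sorted(sims, reverse=True)[:k]
    weights = np.array([s for s, _ in top])
    values = np.array([ratings[v, item] for _, v in top])
    return float(weights @ values / weights.sum())

# Toy data: user 0 has not rated item 2.
toy = np.array([[5.0, 4.0, np.nan],
                [5.0, 4.0, 5.0],
                [1.0, 2.0, 1.0]])
print(predict_user_knn(toy, user=0, item=2, k=1))
```

Note that every prediction scans the other users, which is the cost that motivates the model-based alternative.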
Model-based approach
Question: What are the shortcomings of memory-based methods?
Reasons:
  Suboptimal solutions
  Little knowledge is learned from the data
  Local-neighbor search is computationally expensive
  ……
Probabilistic Latent Semantic Analysis
The problem
  We want to predict the rating r that user u may assign to item i.
Why a latent variable model?
  Consider a simple case:
    User x likes or dislikes item y “because of” some reason.
    The reason cannot be observed, but it may exist.
    We introduce a latent variable to model it.
Probabilistic Latent Semantic Analysis
Question: What rating is user u likely to give to item i?
Can we describe it with a probability?
  The probability of the rating that a user gives to an item is decomposed into a sum of products.
  z is the latent variable.
  One factor is the probability that class z (which can be seen as a community in CF) would assign score r to item i.
  The other factor is the mixing proportion of class z for the user.
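Written out, the sum of products described on this slide plausibly takes the following form. This is a reconstruction from the slide's description and the Gaussian pLSA formulation in Hofmann (SIGIR 2003), so the exact notation is an assumption: p(r | i, z) is the probability that class z assigns score r to item i, and p(z | u) is the mixing proportion of class z for user u.

```latex
p(r \mid u, i) \;=\; \sum_{z} p(z \mid u)\, p(r \mid i, z)
```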
Probabilistic Latent Semantic Analysis
  The probability that class z assigns score r to item i is modeled as a Gaussian distribution.
  The mixing proportion can be modeled as a categorical distribution.
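As a small illustration of these two modeling choices, the sketch below evaluates the mixture density and the expected rating for one user-item pair. It assumes one Gaussian per (item, class) pair with mean `mu_i[z]` and standard deviation `sigma_i[z]`, and a categorical vector `p_z_given_u` over classes for the user; the function and parameter names are hypothetical.

```python
import numpy as np

def rating_density(r, p_z_given_u, mu_i, sigma_i):
    """Mixture density: sum_z p(z|u) * N(r; mu_i[z], sigma_i[z]^2)."""
    gauss = np.exp(-0.5 * ((r - mu_i) / sigma_i) ** 2) / (np.sqrt(2 * np.pi) * sigma_i)
    return float(np.sum(p_z_given_u * gauss))

def expected_rating(p_z_given_u, mu_i):
    """Mean of the mixture: sum_z p(z|u) * mu_i[z]."""
    return float(np.dot(p_z_given_u, mu_i))

# Hypothetical parameters for one user and one item with two classes.
p_z_given_u = np.array([0.75, 0.25])   # categorical mixing proportions, sum to 1
mu_i = np.array([4.0, 2.0])            # class-specific mean rating for the item
sigma_i = np.array([1.0, 0.5])         # class-specific standard deviation

print(expected_rating(p_z_given_u, mu_i))
print(rating_density(4.0, p_z_given_u, mu_i, sigma_i))
```

The expected rating is a natural point prediction because the mean of a mixture is the proportion-weighted average of the component means.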
Probabilistic Latent Semantic Analysis
  Model parameters can be learned by maximizing the log-likelihood of the observed data.
  This can be readily solved using the EM algorithm.
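For reference, a log-likelihood of this kind can be written as below. This is a sketch that assumes the Gaussian mixture form used in Hofmann (SIGIR 2003): the outer sum runs over the observed (user, item, rating) triples, p(z | u) is the mixing proportion of class z for user u, and the mean and standard deviation are specific to each (item, class) pair. The symbols are assumptions of this sketch.

```latex
\mathcal{L}(\theta) \;=\; \sum_{\langle u, i, r \rangle} \log \sum_{z} p(z \mid u)\, p(r;\, \mu_{i,z}, \sigma_{i,z}),
\qquad
p(r;\, \mu, \sigma) \;=\; \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(r - \mu)^{2}}{2\sigma^{2}} \right)
```

The logarithm of a sum has no closed-form maximizer, which is why an iterative scheme such as EM is used.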
Probabilistic Latent Semantic Analysis
Question 1: How are the model parameters learned with the EM algorithm?
Question 2: How should the EM algorithm be understood?
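One possible answer to Question 1 is sketched below: an EM loop for a Gaussian mixture over latent classes. The E-step computes the posterior over z for every observed rating, and the M-step re-estimates the mixing proportions and the per-(item, class) means and standard deviations from those posteriors. The function name, the random initialization, the variance floor `min_sigma`, and the small numerical guards are assumptions of this sketch and not part of the slides.

```python
import numpy as np

def fit_gaussian_plsa(users, items, ratings, n_users, n_items, n_classes,
                      n_iter=50, min_sigma=0.1, seed=0):
    """EM for p(r|u,i) = sum_z p(z|u) * N(r; mu[i,z], sigma[i,z]^2).

    users, items, ratings: parallel 1-D arrays of observed (u, i, r) triples.
    Returns p(z|u), mu, sigma, and the log-likelihood measured at each E-step.
    """
    users = np.asarray(users)
    items = np.asarray(items)
    r = np.asarray(ratings, dtype=float)
    rng = np.random.default_rng(seed)

    # Random initialization (an assumption; other schemes are possible).
    p_z_u = rng.dirichlet(np.ones(n_classes), size=n_users)
    mu = r.mean() + rng.normal(scale=r.std() + 1e-3, size=(n_items, n_classes))
    sigma = np.full((n_items, n_classes), r.std() + min_sigma)
    log_liks = []

    for _ in range(n_iter):
        # E-step: posterior q(z | u, i, r) proportional to p(z|u) * N(r; mu, sigma^2).
        m, s = mu[items], sigma[items]
        gauss = np.exp(-0.5 * ((r[:, None] - m) / s) ** 2) / (np.sqrt(2 * np.pi) * s)
        joint = p_z_u[users] * gauss + 1e-300  # tiny constant guards against 0/0
        total = joint.sum(axis=1, keepdims=True)
        q = joint / total
        log_liks.append(float(np.log(total).sum()))

        # M-step: p(z|u) is the average posterior over the ratings of user u.
        p_z_u = np.zeros((n_users, n_classes))
        np.add.at(p_z_u, users, q)
        p_z_u /= np.maximum(p_z_u.sum(axis=1, keepdims=True), 1e-12)

        # M-step: posterior-weighted mean and standard deviation per (item, class).
        q_sum = np.zeros((n_items, n_classes))
        np.add.at(q_sum, items, q)
        q_sum = np.maximum(q_sum, 1e-12)
        r_sum = np.zeros((n_items, n_classes))
        np.add.at(r_sum, items, q * r[:, None])
        mu = r_sum / q_sum
        sq_sum = np.zeros((n_items, n_classes))
        np.add.at(sq_sum, items, q * (r[:, None] - mu[items]) ** 2)
        sigma = np.maximum(np.sqrt(sq_sum / q_sum), min_sigma)  # floor avoids collapse

    return p_z_u, mu, sigma, log_liks
```

On Question 2, one common reading is that each iteration softly assigns every rating to the latent classes and then refits each class to the ratings it is responsible for. The returned log-likelihood trace can be inspected to check convergence. The sketch assumes every user has at least one observed rating; a user with none would end up with an all-zero row in p(z|u).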
Other latent variable models
  Probabilistic latent preference analysis (PLPA)
  Reference: N. N. Liu et al., Probabilistic Latent Preference Analysis for Collaborative Filtering, CIKM’09
Summary
  CF is popular
  Memory-based methods
    Advantages and shortcomings
  Model-based methods
    Latent variable models
      Probabilistic latent semantic analysis
      Other latent variable models
References
  Thomas Hofmann, Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis, SIGIR 2003
  Thomas Hofmann, Latent Semantic Models for Collaborative Filtering, ACM Transactions on Information Systems, 2004
  Abhinandan Das et al., Google News Personalization: Scalable Online Collaborative Filtering, WWW 2007
  N. N. Liu et al., Probabilistic Latent Preference Analysis for Collaborative Filtering, CIKM’09
  Xiaoyuan Su et al., A Survey of Collaborative Filtering Techniques, 2009