Recommender Systems need to deal with different types of users who represent their preferences in various ways. This difference in user behaviour has a deep impact on the final performance of the recommender system, where some users may receive either better or worse recommendations depending, mostly, on the quantity and the quality of the information the system knows about the user. Specifically, the inconsistencies of the user impose a lower bound on the error the system may achieve when predicting ratings for that particular user. In this work, we analyse how the consistency of user ratings (coherence) may predict the performance of recommendation methods. More specifically, our results show that our definition of coherence is correlated with the so-called magic barrier of recommender systems, and thus, it could be used to discriminate between easy users (those with a low magic barrier) and difficult ones (those with a high magic barrier). We report experiments where the rating prediction error for the more coherent users is lower than that of the less coherent ones. We further validate these results by using a public dataset, where the magic barrier is not available, in which we obtain similar performance improvements.
The Magic Barrier of Recommender Systems:
No Magic, Just Ratings
Alejandro Bellogín, Alan Said, Arjen de Vries
@abellogin, @alansaid, @arjenpdevries
UMAP 2014, 2014-07-08
What is the magic barrier?
2
The Magic Barrier
• The upper bound on recommendation accuracy
• Recommendation accuracy is only as good as the quality of the underlying data
• Users are noisy / incoherent in their rating behavior
3
How to find the magic barrier?
1. Collect infinite re-ratings for each item from your users.
2. Find the standard deviation of rating/re-rating inconsistencies (noise) [Said et al. UMAP '12]
Realistically, collecting an infinite number of re-ratings is not feasible
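With only a finite number of re-ratings, the barrier can at best be estimated. The following is a minimal sketch of that idea, not necessarily the estimator used in [Said et al. UMAP '12]: it assumes each user has at least one item rated more than once and measures noise as the root mean squared deviation of each rating from the user's own mean rating for that item. The function name `estimate_magic_barrier`, the data layout, and the toy ratings are all hypothetical.

```python
from math import sqrt
from statistics import mean


def estimate_magic_barrier(reratings):
    """Estimate one user's rating noise from repeated ratings.

    reratings: dict mapping item id -> list of ratings the same user gave
    to that item on different occasions (hypothetical layout).
    Returns the root mean squared deviation of each rating from the
    user's mean rating for that item.
    """
    squared_devs = []
    for ratings in reratings.values():
        if len(ratings) < 2:
            continue  # a single rating says nothing about noise
        item_mean = mean(ratings)
        squared_devs.extend((r - item_mean) ** 2 for r in ratings)
    if not squared_devs:
        raise ValueError("need at least one item with a re-rating")
    return sqrt(mean(squared_devs))


# Toy data: one user who repeats ratings exactly, one who does not.
consistent = {"m1": [4, 4], "m2": [2, 2]}
noisy = {"m1": [5, 3], "m2": [1, 3]}
print(estimate_magic_barrier(consistent), estimate_magic_barrier(noisy))
```

A user with a larger estimate is noisier, so no recommender can be expected to predict that user's ratings with an error below roughly that level.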
4
Coherence
5
What is coherence?
Coherent user
we have two users that have rated the exact same set of items, although their ratings are slightly different. Specifically, the user in Figure 1a gives more consistent ratings to items sharing the same features, which in this case corresponds to the movie's genres. In the long term we argue that such a user will have a more consistent (less noisy) behaviour, since her taste for each item feature seems to be well defined.
[Figure panels, each over the genres Action, Comedy, Horror: (a) A coherent user (c(u) = -1.28); (b) A not coherent user (c(u) = -5.95)]
Fig. 1: Example of a coherent vs. not coherent user. Our definition of coherence takes into account the rating's deviation within each item feature, which in this example consists of three genres: action, comedy, and horror.
3.2 A Simple Definition for User Coherence
Following the rationale presented before, we define the user coherence based on a set of item features F as:

c(u) = -\sum_{f \in F} \sigma_f(u)    (2)

where \sigma_f(u) = \sqrt{\sum_{i \in I(u,f)} (r(u,i) - \bar{r}_f(u))^2} corresponds to the user's rating deviation within a specific feature f, having an associated mean rating for that feature \bar{r}_f(u), which simply corresponds to the average rating within the set of items rated by user u that belong to feature f, denoted here as I(u, f). We refer to this formulation as basic coherence.
With this formulation, the coherence c(u) captures the variance of a user's ratings relative to the feature space in which the items are defined. Moreover, it also incorporates a negative sign to indicate that the larger the variance, the less coherent (or more incoherent) a user should be.
The key aspect of this function, hence, is that we are accounting for the rating deviation of the user with respect to a particular feature space. Besides, such a definition allows for a more general case, where the space F could be, instead of (textual) item features, any embedding of the items into a space F, such as an item clustering or the latent factors of the items, and also other functions apart from standard deviation to statistically summarise the user's ratings for a given feature, as will be described in the next section.
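The basic coherence of Eq. (2) can be sketched in a few lines. This is an illustrative reading of the formula, assuming genres as the feature space and that an item may belong to several features; the function name `basic_coherence`, the dictionaries, and the toy ratings are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def basic_coherence(user_ratings, item_features):
    """Compute c(u) = -sum over features f of sigma_f(u).

    user_ratings: dict item id -> rating r(u, i) for one user.
    item_features: dict item id -> iterable of features (e.g. genres).
    sigma_f(u) is the square root of the summed squared deviations of the
    user's ratings from their mean rating within feature f.
    """
    ratings_by_feature = defaultdict(list)
    for item, rating in user_ratings.items():
        for feature in item_features.get(item, ()):
            ratings_by_feature[feature].append(rating)

    total_deviation = 0.0
    for ratings in ratings_by_feature.values():
        feature_mean = sum(ratings) / len(ratings)
        total_deviation += sqrt(sum((r - feature_mean) ** 2 for r in ratings))
    return -total_deviation


# Toy data (hypothetical): two users who rated the same four items.
genres = {"a1": ["action"], "a2": ["action"], "c1": ["comedy"], "c2": ["comedy"]}
coherent_user = {"a1": 5, "a2": 5, "c1": 2, "c2": 2}
incoherent_user = {"a1": 5, "a2": 1, "c1": 4, "c2": 2}
print(basic_coherence(coherent_user, genres))
print(basic_coherence(incoherent_user, genres))
```

Both toy users have the same overall mean rating, yet the second spreads ratings widely within each genre and therefore gets a more negative coherence.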
Incoherent user
6
Definition of Coherence
Rating consistency within a feature space (genres, themes, etc.)

c(u) = -\sum_{f \in F} \sigma_f(u)
\sigma_f(u) = \sqrt{\sum_{i \in I(u,f)} (r(u,i) - \bar{r}_f(u))^2}

F: item features
\sigma_f(u): user's rating deviation within a feature
7
Rating deviation within a feature space instead of infinite re-ratings
Is this a good estimate of the magic barrier?
8
Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?
9
Rating Coherence vs. Magic Barrier
Rank users according to magic barrier/rating coherence:
• High/low magic barrier
• Low/high rating coherence in the “genre” feature space
Find correlation (Spearman) between magic barrier/coherence ranking
*Dataset from moviepilot.de, contains re-ratings from 306 users [Said et al. UMAP '12]
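The rank comparison above can be sketched without external libraries, since Spearman's rho is the Pearson correlation of the two rank vectors. The helper names and the four per-user values below are hypothetical toy data, not figures from the moviepilot.de dataset.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy per-user values (hypothetical): magic barrier and coherence c(u).
magic_barrier = [0.2, 0.5, 0.9, 1.4]
coherence = [-1.0, -2.5, -3.0, -6.0]
print(spearman(magic_barrier, coherence))
```

In this toy case the two rankings are exact opposites, so rho is -1: the user with the highest barrier has the lowest coherence.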
10
Magic Barrier & Rating Coherence correlation
11
Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?
12
Use rating coherence for recommendation
• Divide users into coherent (lower magic barrier) and incoherent (higher magic barrier)
• Movielens 1M
• Train each group separately
• Evaluate each group separately
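One way to form the two groups is a split at the median coherence. The slides do not state the threshold; a median split is assumed here only because each group is later reported with 50% user coverage. The function name and the coherence values are hypothetical.

```python
from statistics import median


def split_by_coherence(coherence):
    """Split users into (easy, difficult) sets at the median coherence.

    coherence: dict user id -> c(u). Users at or above the median are
    treated as coherent ("easy"), the rest as incoherent ("difficult").
    With ties at the median the split is not exactly 50/50.
    """
    threshold = median(coherence.values())
    easy = {user for user, c in coherence.items() if c >= threshold}
    difficult = set(coherence) - easy
    return easy, difficult


# Toy coherence values (hypothetical).
scores = {"u1": -1.2, "u2": -5.9, "u3": -2.0, "u4": -4.1}
easy_users, difficult_users = split_by_coherence(scores)
print(sorted(easy_users), sorted(difficult_users))
```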
13
Recommendation and Evaluation
Algorithm: K-NN, Pearson
Training sets
• “Easy” users split
• “Difficult” users split
• All users split
Evaluation sets
• “Easy” users split
• “Difficult” users split
• All users split
[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]
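The slides name the algorithm only as "K-NN, Pearson". The sketch below is one common user-based variant and should be read as an assumption, not as the exact configuration used in the experiments: mean-centred predictions, positively correlated neighbours only, a fall-back to the user's mean rating, and k=2 are all illustrative choices, as are the function names and the toy ratings.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items (0.0 when undefined)."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)
               * sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def predict(train, user, item, k=2):
    """Mean-centred weighted average over the k most similar raters of item."""
    user_mean = sum(train[user].values()) / len(train[user])
    candidates = sorted(
        ((pearson(train[user], ratings), other)
         for other, ratings in train.items()
         if other != user and item in ratings),
        key=lambda pair: pair[0], reverse=True)[:k]
    neighbours = [(sim, other) for sim, other in candidates if sim > 0]
    norm = sum(sim for sim, _ in neighbours)
    if norm == 0:
        return user_mean  # no usable neighbour: fall back to the user's mean
    offset = sum(
        sim * (train[other][item] - sum(train[other].values()) / len(train[other]))
        for sim, other in neighbours)
    return user_mean + offset / norm


def rmse(train, test):
    """test: list of (user, item, true_rating) triples."""
    squared_errors = [(predict(train, u, i) - r) ** 2 for u, i, r in test]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Toy ratings (hypothetical).
train = {
    "ann": {"m1": 5, "m2": 3, "m3": 1},
    "bob": {"m1": 4, "m2": 3, "m3": 2, "m4": 4},
    "cat": {"m1": 1, "m2": 3, "m3": 5, "m4": 2},
}
print(predict(train, "ann", "m4"), rmse(train, [("ann", "m4", 4)]))
```

Under this sketch, a comparison such as Easy - Easy would restrict both `train` and `test` to the easy users, while All - Easy would keep every user in `train` and only the easy users in `test`.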
14
Evaluation Comparisons
• All - All
• All - Easy
• All - Difficult
• Easy - Easy
• Difficult - Difficult
All - All: RMSE 1.090, user coverage 100%
15
Evaluation Comparisons
• All - All
• All - Easy
• All - Difficult
• Easy - Easy
• Difficult - Difficult
All - Easy: RMSE 0.974, user coverage 50%
16
Evaluation Comparisons
• All - All
• All - Easy
• All - Difficult
• Easy - Easy
• Difficult - Difficult
All - Difficult: RMSE 1.195, user coverage 50%
17
Evaluation Comparisons
• All - All
• All - Easy
• All - Difficult
• Easy - Easy
• Difficult - Difficult
Easy - Easy: RMSE 0.933, user coverage 50%
18
Evaluation Comparisons
• All - All
• All - Easy
• All - Difficult
• Easy - Easy
• Difficult - Difficult
Difficult - Difficult: RMSE 1.226, user coverage 50%
19
Results
Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50%
All-Diff    1.195   50%
Easy-Easy   0.933   50%
Diff-Diff   1.226   50%
20
Results
Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50% (easy)
All-Diff    1.195   50% (diff)
Easy-Easy   0.933   50% (easy)
Diff-Diff   1.226   50% (diff)
RMSE average: 1.084
21
Results
Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50% (easy)
All-Diff    1.195   50% (diff)
Easy-Easy   0.933   50% (easy)
Diff-Diff   1.226   50% (diff)
RMSE average: 1.064
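The transcript does not say which rows enter the two reported averages. As a hedged consistency check, the arithmetic below shows that plain means of two particular row pairs agree with the reported figures: All-Easy with All-Diff (about 1.0845, against the reported 1.084) and Easy-Easy with All-Diff (about 1.064). This pairing is an assumption based on the numbers alone; the dictionary and function name are illustrative.

```python
# RMSE values copied from the results table.
rmse_by_subset = {
    "All": 1.090,
    "All-Easy": 0.974,
    "All-Diff": 1.195,
    "Easy-Easy": 0.933,
    "Diff-Diff": 1.226,
}


def mean_rmse(*subsets):
    """Unweighted mean of the RMSE values of the named subsets."""
    return sum(rmse_by_subset[s] for s in subsets) / len(subsets)


# One model trained on all users, evaluated per group.
single_model = mean_rmse("All-Easy", "All-Diff")
# Best reported configuration for each group (assumed pairing).
best_per_group = mean_rmse("Easy-Easy", "All-Diff")
print(single_model, best_per_group)
```

If this reading is right, the lower second average comes from serving easy users with a model trained only on easy users, while difficult users are still better served by the model trained on everyone (1.195 against 1.226).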
22
Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?
23
Conclusions
• Rating coherence within an item's feature space is related to the magic barrier
• Rating coherence can be used to predict users' rating inconsistencies
• A user's rating inconsistencies tell us how well our system performs for that user, i.e. whether we need to optimize further
24
Thanks!
Questions?
More recommender systems: www.recsyswiki.com
25
Image credits
• Slide 1 - http://www.burdens.net.au/product/esf-barriers
• Slide 2 - http://www.freepicturesweb.com/pictures/2010-07/5881.html
• Slide 4 - http://redgannet.blogspot.nl/2010/08/moreletakloof-johannesburg-jnb.html
• Slide 5 - http://en.wikipedia.org/wiki/Standard_deviation#Interpretation_and_application
• Slide 9, 12, 23 - http://www.withautodesk.com/html/science-fair-ideas-for-physical-science-inventions.html
• Slide 13 - http://fineartamerica.com/featured/16-cogs-les-cunliffe.html
• Slide 24 - https://benchprep.com/blog/the-ap-english-exam-writing-compelling-conclusions/
26