UMAP 2014 Presentation
Combining Distributional Semantics and Entity Linking for Context-aware Content-based Recommendation
Cataldo Musto, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis
(Università degli Studi di Bari ‘Aldo Moro’, Italy - SWAP Research Group)
UMAP 2014, 22nd Conference on User Modeling, Adaptation and Personalization, Aalborg (Denmark)
July 8, 2014
Content-based Recommender Systems
Suggest items similar to those the user liked in the past
(I bought Converse shoes, so I’ll continue buying similar sport shoes)
Cataldo Musto, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis. Combining Distributional Semantics and Entity Linking for Context-aware Content-based Recommendations. UMAP 2014, Aalborg (Denmark), July 8, 2014 2
Content-based Recommender Systems
[Diagram: user profile × items]
Recommendations are generated by matching the features stored in the user profile with those describing the items to be recommended.
Content-based Recommender Systems :: (Some) Limitations
• Poor Semantic Representation
• Poor Contextual Modeling
Lack of Semantics
“I love turkey. It’s my choice for these holidays!”
Lack of Contextual Modeling
In Aalborg: brewery recommendations. Ashtead?
Many content-based recommendation engines do not handle contextual information (e.g. user location).
1370 km: far away! :-)
Our contribution
Contextual eVSM: a context-aware content-based recommendation framework based on distributional semantics and entity linking
Contextual eVSM :: Workflow
[Diagram: workflow connecting three components (Semantic Content Analyzer, Context-aware Profiler, Recommender) with the data they exchange (Item Description, Items, User Ratings, Contextual Data, User Profiles, Context-aware Recommendations)]
Contextual eVSM :: 3 main components
• Semantic Content Analyzer
• Context-aware Profiler
• Recommender
Contextual eVSM :: Semantic Content Analyzer
• Input: items to be recommended (along with their textual description)
• Output: semantic representation
• Novelty: we exploited
  • Entity Linking algorithms
  • Distributional Semantics Models
Contextual eVSM :: Semantic Content Analyzer :: Entity Linking
• Entity Linking Algorithms
  • Input: free text (item descriptions, in our setting)
  • Output: identification of the most relevant entities mentioned in the text
  • We adopted:
    • tag.me (1)
    • DBpedia Spotlight (2)
    • Wikipedia Miner (3)
(1) http://tagme.di.unipi.it
(2) http://spotlight.dbpedia.org
(3) http://wikipedia-miner.cms.waikato.ac.nz
Contextual eVSM :: Semantic Content Analyzer :: Entity Linking :: Example
[Figure: a textual description (e.g. a Wikipedia abstract) and the corresponding processed text (Tag.me output)]
• Very transparent and human-readable content representation
• Non-trivial NLP tasks (stopword removal, n-gram identification, named entity recognition and disambiguation) are automatically performed
• Each entity is a reference to a Wikipedia page (e.g. http://en.wikipedia.org/wiki/The_Wachowskis), not a simple textual feature!
• We enriched this entity-based representation by exploiting the Wikipedia category tree
Contextual eVSM :: Semantic Content Analyzer :: Entity Linking :: Representation
The final representation of each item is obtained by merging the entities identified in the text with all the Wikipedia categories each entity is linked to.
Features = Entities + Wikipedia Categories
Problem: even such a rich, transparent and human-readable representation does not handle semantics.
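The merge of entities and Wikipedia categories described above can be sketched in a few lines of Python. This is only an illustration: the entity-to-category mapping is a hypothetical stand-in for the output of an entity linker, and the function name build_item_features is an assumption, not part of the presented framework.

```python
# Hypothetical entity linker output: entity -> Wikipedia categories.
# Entities and categories are illustrative placeholders only.
ENTITY_CATEGORIES = {
    "The_Wachowskis": ["American_film_directors", "American_screenwriters"],
    "Keanu_Reeves": ["Canadian_male_film_actors"],
}


def build_item_features(entities, entity_categories):
    """Merge the linked entities with every category each entity belongs to."""
    features = []
    seen = set()
    for entity in entities:
        # The entity itself comes first, followed by its categories.
        for feature in [entity, *entity_categories.get(entity, [])]:
            if feature not in seen:  # keep each feature once, in order
                seen.add(feature)
                features.append(feature)
    return features


matrix_features = build_item_features(
    ["The_Wachowskis", "Keanu_Reeves"], ENTITY_CATEGORIES
)
```

The resulting list is still a bag of symbolic features, which is why the slides move on to distributional semantics.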
Contextual eVSM :: Semantic Content Analyzer :: Distributional Semantics
“Meaning is its use”
L. Wittgenstein (Austrian philosopher)
Contextual eVSM :: Semantic Content Analyzer :: Distributional Semantics (*)
Insight: by analyzing large corpora of textual data it is possible to infer information about the usage (and thus about the meaning) of the terms.
[Diagram: terms sharing the same co-occurrences have a similar meaning]
(*) Firth, J.R. A synopsis of linguistic theory 1930-1955. In Studies in Linguistic Analysis, pp. 1-32, 1957.
Contextual eVSM :: Semantic Content Analyzer :: Distributional Semantics :: WordSpace
[Figure: a word space containing the terms beer, wine, mojito and dog]
Vector-space representation is based on term co-occurrences
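As a rough illustration of a word space built from co-occurrences, the sketch below counts which terms appear together and compares terms by cosine similarity. The corpus is invented for the example, and raw counts are a simplification; the slides do not specify the weighting or dimensionality reduction actually used.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration: each inner list is one context.
CORPUS = [
    ["beer", "glass", "pub"],
    ["wine", "glass", "dinner"],
    ["mojito", "glass", "pub"],
    ["dog", "leash", "park"],
]


def cooccurrence_vectors(corpus):
    """Map each term to a sparse vector counting the terms it co-occurs with."""
    vectors = defaultdict(Counter)
    for context in corpus:
        for term in context:
            for other in context:
                if other != term:
                    vectors[term][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = cooccurrence_vectors(CORPUS)
beer_mojito = cosine(vectors["beer"], vectors["mojito"])  # same contexts
beer_wine = cosine(vectors["beer"], vectors["wine"])      # one shared context
beer_dog = cosine(vectors["beer"], vectors["dog"])        # no shared context
```

In this toy corpus the drinks end up closer to each other than to "dog", purely because of the terms they share.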
Contextual eVSM :: Semantic Content Analyzer :: Distributional Semantics
[Table: rows Keanu Reeves, Al Pacino, American Writers, Laurence Fishburne; columns e1 to e9; check marks in the non-empty cells]
• Our Semantic Content Analyzer learns a vector-space item representation based on distributional semantics models
• The vector-space semantic representation is learnt according to entity co-occurrences in textual descriptions
• Unexpected connections between entities can be learnt in a totally unsupervised way thanks to Distributional Semantics
• e.g. Keanu Reeves and Al Pacino both starred in Drama movies
Contextual eVSM :: Semantic Content Analyzer :: Distributional Semantics
Question: how to exploit Distributional Semantics to represent items to be recommended?
The semantic representation of an item is obtained by combining the vector-space representations of the features which describe it.
[Table: feature vectors over columns e1 to e9 (e.g. Keanu Reeves, Drama, American Writers, Laurence Fishburne) and the resulting vector for the item Matrix]
[Figure: the items Matrix, Matrix Revolutions, Donnie Darko and Up! placed in the vector space]
It is possible to perform similarity calculations between items according to their semantic representation.
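A minimal sketch of the idea above, under two assumptions that the slides do not state explicitly: feature vectors are combined by summing them, and items are compared with cosine similarity. The 4-dimensional vectors and the features Science_fiction and Animation are made up for the example.

```python
import math

# Hypothetical feature vectors; in the framework they would be learnt
# from entity co-occurrences rather than written by hand.
FEATURE_VECTORS = {
    "Keanu_Reeves":       [1.0, 0.0, 1.0, 0.0],
    "Laurence_Fishburne": [1.0, 1.0, 0.0, 0.0],
    "Science_fiction":    [1.0, 0.0, 0.0, 0.0],
    "Animation":          [0.0, 0.0, 0.0, 1.0],
}


def item_vector(features):
    """Combine (here: sum) the vectors of the features describing an item."""
    dims = len(next(iter(FEATURE_VECTORS.values())))
    total = [0.0] * dims
    for feature in features:
        for k, weight in enumerate(FEATURE_VECTORS[feature]):
            total[k] += weight
    return total


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


matrix = item_vector(["Keanu_Reeves", "Laurence_Fishburne", "Science_fiction"])
matrix_revolutions = item_vector(["Keanu_Reeves", "Laurence_Fishburne"])
up = item_vector(["Animation"])

sim_sequel = cosine(matrix, matrix_revolutions)
sim_up = cosine(matrix, up)
```

With these toy vectors the two Matrix movies are close to each other, while Up! shares no dimension with them.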
Contextual eVSM :: Context-aware Profiler
• Input:
  • user preferences (ratings)
  • contextual information
    • Fixed set of contextual dimensions (company, mood, task, etc.)
    • Fixed set of values (e.g. company = alone, friends, girlfriend, etc.)
• Output: contextual user profiles
• Novelty: we introduced a Context-aware Profiling Strategy based on Distributional Models
Contextual eVSM :: Context-aware User Profiler :: Strategy
Let’s go straight to the formula:
$$\text{C-WRI}(u, c_k, v_j) = \alpha \cdot \text{WRI}(u) + (1 - \alpha) \cdot \text{context}(u, c_k, v_j)$$
where
• u is the target user
• c_k is a contextual variable (e.g. task, mood, etc.)
• v_j is its value (e.g. task = running, mood = sad, etc.)
A context-aware profile can be learnt by combining two components in a linear fashion:
• WRI(u): a non-contextual representation of user preferences
• context(u, c_k, v_j): a vector-space representation of the context itself
Contextual eVSM :: Context-aware User Profiler :: Strategy :: WRI(u)
Non-contextual user preferences:
$$\text{WRI}(u) = \sum_{i=1}^{|L|} d_i \cdot \frac{r(u, i)}{MAX}$$
where
• L is the set of items the user liked
• d_i is the vector-space representation of item i built by the Semantic Content Analyzer
• r(u, i) / MAX is the normalized rating
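The WRI(u) sum can be sketched as a weighted sum of item vectors. The sketch assumes that MAX is the highest value of the rating scale (set to 5 here) and that the caller passes only the liked items; the item vectors and the name wri_profile are illustrative.

```python
MAX_RATING = 5  # assumption: ratings are given on a 1-5 scale


def wri_profile(liked_items, item_vectors, max_rating=MAX_RATING):
    """Return sum_i d_i * r(u, i) / MAX over the liked items.

    liked_items maps item id -> rating r(u, i) for the items in L;
    item_vectors maps item id -> vector d_i.
    """
    dims = len(next(iter(item_vectors.values())))
    profile = [0.0] * dims
    for item_id, rating in liked_items.items():
        weight = rating / max_rating  # normalized rating
        for k, value in enumerate(item_vectors[item_id]):
            profile[k] += value * weight
    return profile


# Hypothetical 3-dimensional item vectors.
ITEM_VECTORS = {"matrix": [1.0, 0.0, 1.0], "up": [0.0, 1.0, 0.0]}
profile = wri_profile({"matrix": 5, "up": 4}, ITEM_VECTORS)
```

Items with a higher rating pull the profile more strongly towards their own vector.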
Contextual eVSM :: Context-aware User Profiler :: Strategy :: context(u, c_k, v_j)
Vector-space representation of the context:
$$\text{context}(u, c_k, v_j) = \sum_{i=1}^{|L(c_k, v_j)|} d_i \cdot \frac{r(u, i, c_k, v_j)}{MAX}$$
where
• L(c_k, v_j) is the set of items the user liked in that specific context
• d_i is the vector-space representation of the item
• r(u, i, c_k, v_j) / MAX is the normalized rating in that specific context
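The context vector follows the same pattern as WRI(u), restricted to the ratings given in one contextual setting. In the sketch below the rating tuples, the item vectors and the 1-5 rating scale are all assumptions made for illustration.

```python
def context_profile(contextual_ratings, item_vectors, variable, value, max_rating=5):
    """Return sum_i d_i * r(u, i, c_k, v_j) / MAX over L(c_k, v_j).

    contextual_ratings is a list of (item id, context dict, rating) tuples
    for the items the user liked; only ratings whose context has
    context[variable] == value contribute.
    """
    dims = len(next(iter(item_vectors.values())))
    profile = [0.0] * dims
    for item_id, context, rating in contextual_ratings:
        if context.get(variable) != value:
            continue  # rating was given in a different context
        weight = rating / max_rating
        for k, component in enumerate(item_vectors[item_id]):
            profile[k] += component * weight
    return profile


# Hypothetical data: two liked movies rated in different companies.
ITEM_VECTORS = {"skyfall": [1.0, 0.0], "up": [0.0, 1.0]}
RATINGS = [
    ("skyfall", {"company": "friends"}, 5),
    ("up", {"company": "family"}, 4),
]

with_friends = context_profile(RATINGS, ITEM_VECTORS, "company", "friends")
alone = context_profile(RATINGS, ITEM_VECTORS, "company", "alone")
```

When no rating matches the requested context, the function returns the zero vector.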
Contextual eVSM :: Context-aware User Profiler :: Strategy
$$\text{C-WRI}(u, c_k, v_j) = \alpha \cdot \text{WRI}(u) + (1 - \alpha) \cdot \text{context}(u, c_k, v_j)$$
Rationale: context is just a factor which can influence the user’s perception of an item.
• If the user did not express any preference in that specific contextual setting, context(u, c_k, v_j) = 0, which leads to a non-contextual recommendation.
• Otherwise, the parameter α is used to tune the weight of each component of the formula.
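The linear combination and its fallback behaviour can be checked with a short sketch. It assumes dense vectors of equal length and, for the fallback argument, a scale-invariant similarity such as cosine; the vectors are invented.

```python
import math


def c_wri(wri, context, alpha):
    """Return alpha * WRI(u) + (1 - alpha) * context(u, c_k, v_j)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * w + (1.0 - alpha) * c for w, c in zip(wri, context)]


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


wri = [1.0, 0.8, 1.0]            # hypothetical non-contextual profile
context = [0.0, 1.0, 0.0]        # hypothetical context vector
no_context = [0.0, 0.0, 0.0]     # user never rated in this context

contextual = c_wri(wri, context, alpha=0.5)
fallback = c_wri(wri, no_context, alpha=0.8)

# With an empty context the profile is only rescaled, so its direction
# (and therefore any cosine-based ranking) is unchanged.
same_direction = cosine(fallback, wri)
```

A higher α keeps the profile closer to the non-contextual preferences, while a lower α lets the context dominate.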
Contextual eVSM :: Context-aware User Profiler :: How do we come to this formula?
$$\text{C-WRI}(u, c_k, v_j) = \alpha \cdot \text{WRI}(u) + (1 - \alpha) \cdot \text{context}(u, c_k, v_j)$$
Insight: there exists a set of terms that is more descriptive of the items relevant in a specific context (e.g. task = dinner, company = girlfriend).
For a romantic dinner, e.g. candlelight, seaview, violin.
Contextual eVSM :: Context-aware User Profiler :: Our formula inherits this insight
$$\text{context}(u, c_k, v_j) = \sum_{i=1}^{|L(c_k, v_j)|} d_i \cdot \frac{r(u, i, c_k, v_j)}{MAX}$$
• Context is represented on the basis of the items the user liked in that specific contextual setting.
• Items are represented on the basis of the co-occurrences between entities.
• The resulting representation of the context is such that a bigger weight is given to the entities which typically occur in the descriptions of the items relevant in that specific context.
• Thanks to Distributional Semantics Models it is possible to build a vector-space representation of the context which emphasizes the importance of those terms, since they are used more (and are therefore more important) in that specific contextual setting.
Contextual eVSM :: Recommendation step
The goal of our context-aware profiling strategy is to perturb the representation of user preferences and to provide the user with context-aware recommendations.
[Figure: the items Skyfall, Austin Powers and Up! in the vector space, first with the profile WRI(u) (non-contextual preferences), then with the profile C-WRI(u) (contextual preferences, e.g. company = friends)]
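To make the effect of the perturbation concrete, the sketch below ranks three items against a non-contextual and a contextual profile. Cosine similarity is assumed as the matching function, and every vector is invented; only the item names come from the slide.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, item_vectors, top_n=2):
    """Return the top_n item names ranked by similarity to the profile."""
    ranked = sorted(
        item_vectors,
        key=lambda item: cosine(profile, item_vectors[item]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical 2-dimensional item vectors.
ITEM_VECTORS = {
    "Skyfall": [1.0, 0.0],
    "Austin Powers": [0.6, 0.8],
    "Up!": [0.0, 1.0],
}

wri_u = [1.0, 0.2]     # hypothetical non-contextual profile
c_wri_u = [0.3, 1.0]   # hypothetical profile after adding the context

non_contextual = recommend(wri_u, ITEM_VECTORS)
contextual = recommend(c_wri_u, ITEM_VECTORS)
```

Moving the profile vector changes which items fall closest to it, which is how the same user can receive different suggestions in different contexts.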
Experimental Evaluation :: Research Hypotheses
1. Does C-eVSM outperform its non-contextual counterpart?
2. Does the novel representation based on entity linking and distributional semantics outperform a simple keyword-based one?
3. How does our model perform with respect to the current literature?
Experimental Evaluation :: Description of the dataset
• Movie recommendation
  • Subset of IMDB data
  • 202 movies (textual features crawled from Wikipedia)
  • 62 users and 1457 ratings
• 4 contextual dimensions
  • TIME (weekend, weekday)
  • PLACE (theater, home)
  • COMPANION (alone, friends, boyfriend, family)
  • MOVIE-RELATED (release week or not)
Experimental Evaluation :: Design of the Experiment
• Dataset and experimental settings replicate Adomavicius’ experiment (*)
• Evaluation over 9 different contextual settings
  • Home, Friends, Non-release, Weekend, Weekday, GBFriends, TheaterWeekend and TheaterFriends
• Metric: F1-Measure
• Experimental protocol: bootstrapping
  • 29/30th of the data as training
  • 1/30th as test
  • Randomly generated, 500 runs
(*) G. Adomavicius et al., Incorporating contextual information in recommender systems using a multi-dimensional approach. ACM Trans. Inf. Systems, 2005
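The protocol above (random 29/30 vs 1/30 splits, repeated 500 times, scored with F1) can be sketched as follows. The set-based F1 definition and the averaging of the scores over the runs are assumptions, since the slides only name the metric and the split; evaluate is a placeholder for whatever trains and scores a recommender.

```python
import random


def f1_score(recommended, relevant):
    """Set-based F1: harmonic mean of precision and recall."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def random_split(ratings, test_fraction=1 / 30, rng=random):
    """Shuffle the ratings and hold out roughly test_fraction as test set."""
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def run_protocol(ratings, evaluate, runs=500, seed=0):
    """Average evaluate(train, test) over randomly generated splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        train, test = random_split(ratings, rng=rng)
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)


# Example with as many dummy ratings as the dataset described above.
train, test = random_split(range(1457), rng=random.Random(0))
```

Fixing the seed makes the 500 splits reproducible, which helps when several configurations must be compared on the same partitions.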
Experimental Evaluation :: eVSM configurations
• Non-contextual baseline: eVSM
  • WRI profiling strategy
  • WQN profiling strategy
• Context-aware framework: C-eVSM
  • C-WRI profiling strategy
  • C-WQN profiling strategy
• Three values for parameter α
  • 0.2, 0.5, 0.8
• 8 configurations for each run
• WQN
  • Alternative profiling strategy (*)
  • Models negative user feedback as well
  • Combines positive and negative preferences by means of a Quantum Negation (**) operator
(*) C. Musto, G. Semeraro, P. Lops, and M. de Gemmis. Random indexing and negative user preferences for enhancing content-based recommender systems. In EC-Web 2011, volume 85 of LNBIP, pages 270–281. 2011.
(**) D. Widdows. Orthogonal negation in vector spaces for modelling word-meanings and document retrieval. In ACL, pages 136–143, 2003.
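For readers unfamiliar with the negation operator cited above, the sketch below shows orthogonal negation for a single negated vector: the positive vector minus its projection onto the negative one. How WQN actually builds its positive and negative vectors is not detailed on the slides, so the two input vectors here are purely illustrative.

```python
def dot(a, b):
    """Dot product of two dense vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def orthogonal_negation(positive, negative):
    """Return `positive NOT negative`.

    The component of `positive` lying along `negative` is removed, so the
    result is orthogonal to `negative`.
    """
    denom = dot(negative, negative)
    if denom == 0.0:
        return list(positive)  # nothing to negate
    scale = dot(positive, negative) / denom
    return [p - scale * n for p, n in zip(positive, negative)]


liked = [1.0, 1.0, 0.0]      # hypothetical vector of positive preferences
disliked = [0.0, 1.0, 0.0]   # hypothetical vector of negative preferences

profile = orthogonal_negation(liked, disliked)
```

The resulting profile has zero similarity with the disliked direction while keeping the rest of the positive preferences.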
Experiment 1
Comparison of C-eVSM vs eVSM (keyword-based)
Selection of Results :: HOME segment
Configuration    F1
WRI (baseline)   47.62
C-WRI-0.2        46.62
C-WRI-0.5        48.23
C-WRI-0.8        50.60
WQN (baseline)   53.62
C-WQN-0.2        54.81
C-WQN-0.5        57.82
C-WQN-0.8        58.80
• Contextual eVSM improves the F1 measure (paired t-test, p<0.05)
• α=0.8 is better than α=0.5
Experiment 1
Selection of Results :: FRIENDS segment
Configuration    F1
WRI              49.43
C-WRI-0.2        44.91
C-WRI-0.5        50.54
C-WRI-0.8        50.11
WQN              53.18
C-WQN-0.2        45.93
C-WQN-0.5        50.04
C-WQN-0.8        54.39
• Similar outcomes: C-eVSM outperforms eVSM (paired t-test, p<0.05)
• α=0.2 does not improve the F1-measure
Experiment 1
Selection of Results :: NON-RELEASE segment
Configuration    F1
WRI              48.95
C-WRI-0.2        48.24
C-WRI-0.5        49.05
C-WRI-0.8        52.18
WQN              55.94
C-WQN-0.2        48.67
C-WQN-0.5        52.55
C-WQN-0.8        56.78
• C-WQN with α=0.8 is typically the best-performing configuration (paired t-test, p<0.05)
• Outcome: context should only slightly influence user preferences
Experiment 1
Outcomes
• Contextual eVSM outperforms eVSM
  • 8 segments out of 9
  • Little statistical significance
• Negation is useful when the dataset is well-balanced
• Higher α values lead to a better F1
  • Best-performing configurations are C-WQN-0.8 (4 times), C-WRI-0.8 (1 time), C-WRI-0.5 (3 times)
Experiment 2
Cataldo Musto, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis. Combining Distributional Semantics and Entity Linking for Context-aware Content-based Recommendations. UMAP 2014, Aalborg (Denmark), July 8, 2014 69
Comparison of entity-based vs keyword-based content representation
Experiment 2
Cataldo Musto, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis. Combining Distributional Semantics and Entity Linking for Context-aware Content-based Recommendations. UMAP 2014, Aalborg (Denmark), July 8, 2014 70
Selection of Results :: HOME segment
Semantic representation improves F1 in all the configurations
[Bar chart: F1 of keyword-based vs. entity-based representation for WRI, C-WRI-0.2, C-WRI-0.5, C-WRI-0.8, WQN, C-WQN-0.2, C-WQN-0.5, C-WQN-0.8. Bar labels in extraction order: 61,30; 61,96; 54,81; 57,53; 56,75; 56,38; 46,62; 56,13; 58,80; 57,82; 53,37; 53,62; 50,60; 48,23; 44,56; 47,62]
Gaps are significant in 5 out of 8 configurations
Again, higher α values lead to the best F1-measure scores
Selection of Results :: FRIEND segment
+6,42% improvement, gap always significant
[Bar chart: F1 of keyword-based vs. entity-based representation for WRI, C-WRI-0.2, C-WRI-0.5, C-WRI-0.8, WQN, C-WQN-0.2, C-WQN-0.5, C-WQN-0.8. Bar labels in extraction order: 58,37; 57,2; 52,82; 58,25; 55,68; 56,24; 49,19; 56,17; 54,39; 50,04; 45,93; 53,18; 50,11; 50,54; 44,91; 49,43]
Negation + higher α values ➝ best configuration
Selection of Results :: THEATER segment
Best-performing segment: +6,49% improvement over keywords
[Bar chart: F1 of keyword-based vs. entity-based representation for WRI, C-WRI-0.2, C-WRI-0.5, C-WRI-0.8, WQN, C-WQN-0.2, C-WQN-0.5, C-WQN-0.8. Bar labels in extraction order: 62,16; 57,81; 54,72; 56,45; 58,11; 57,21; 55,82; 56,34; 52,64; 51,40; 46,65; 50,71; 52,87; 53,95; 52,79; 50,91]
C-WQN is the best-performing configuration: +9,52%
Outcomes
• Novel semantic representation outperforms the keyword-based one
• 7 segments out of 9
• +4% on average, ranging from +1,34% to +6,49%
• Important gaps in terms of F1-measure
• Entity-based outperforms keywords in 65 comparisons out of 90 (72%)
• Statistically significant gap in 52 out of 90 of the comparisons (58%)
• Negation and higher α values lead to a better F1
• Best-performing configurations are C-WQN-0.8 (3 times), C-WQN-0.5 (2 times), C-WRI-0.5 (2 times)
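The percentages quoted in these outcomes follow from simple win counting over the 90 pairwise comparisons. The sketch below is a minimal illustration; the helper names and the three toy score pairs are hypothetical, and only the totals 65, 52 and 90 come from the slide.

```python
def count_wins(pairs):
    """Count pairs where the first score is strictly higher than the second."""
    return sum(1 for first, second in pairs if first > second)


def share(wins, total):
    """Percentage of wins, rounded to the nearest integer."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * wins / total)


# Hypothetical (entity-based, keyword-based) F1 pairs, invented for illustration
toy_pairs = [(55.0, 50.0), (48.0, 49.5), (52.0, 52.0)]
toy_wins = count_wins(toy_pairs)

# Totals reported on the slide
entity_win_share = share(65, 90)   # 72
significant_share = share(52, 90)  # 58
```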
Experiment 3
Comparison to context-aware CF algorithm(*)
(*) G. Adomavicius et al., Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans. Inf. Syst., 2005
[Bar chart: c-eVSM vs. CACF per segment (Home, Friends, Weekend, Theater, Nonrelease, Weekday, GBFriends, Theater-Weekend, Theater-Friends). Bar labels in extraction order: 60,7; 64,1; 48; 37,9; 43,2; 60,8; 54,2; 48,2; 39,19; 55,96; 54,95; 50,72; 48,02; 57,01; 61,16; 60,39; 58,37; 61,96]
Contextual eVSM outperforms CACF in 7 segments out of 9
Gap is statistically significant in 5 segments out of 7
Recap
Contextual eVSM: context-aware recommendation framework
• Content Representation based on Distributional Semantics and Entity Linking
• Profile Learning based on a perturbation of non-contextual preferences with a semantic representation of the context
Experimental session confirmed the effectiveness of the framework as well as of the novel semantic representation
Framework outperforms a context-aware collaborative filtering baseline
Future Research
Evaluation against different datasets and stronger baselines;
Exploitation of Linked Data and Open Knowledge Sources for content representation;
Evaluation of Novelty, Diversity and Serendipity of the Recommendations;
Questions?
Cataldo Musto, Ph.D.