View
257
Download
0
Tags:
Embed Size (px)
Citation preview
Entity Recommendations Using Hierarchical Knowledge Bases
Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data (KNOW@LOD)at ESWC2015, May 31, 2015
Siva Kumar Cheekula, Pavan Kapanipathi, Derek Doran, Prateek Jain, Amit Sheth
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton - OH
• Content Based (Users’ Ratings)– Types• Term based • Entity based • Knowledge based/Graph based
– Provides the most personalized results• Collaborative (Other users’ ratings)• Hybrid (Both content and collaborative)– Works best
Entity Recommendations
6
• Structured representation of relationships among concepts
• Allows better understanding of the user interests
• Cold start problem is addressed well
Knowledge Base Enabled Recommendations
7
Interstellar (film)
Gravity (film)
Relationships
Directe
d by
Starring
Produced by
Distrib
uted by
Knowledge Base Enabled Recommendations
8
Interstellar (film)
Gravity (film)
Relationships
Directe
d by
Starring
Produced by
Distrib
uted by
Knowledge Base Enabled Recommendations
Too many properties to address
9
Ball Games
Football
Games
Is-a
Focus specifically on hierarchical relationships
Hierarchical Relations
10
• Human memory has been argued to be structured as a hierarchy of concepts (Semantic Network)
• Spreading activation theory has been utilized to simulate search on semantic network
• This theory has not been well explored for user interest modeling and entity recommendations
Hierarchical Relations
11
• Domain Coverage• Crowd-sourced knowledge base–Regular updates–Driven by strict guidelines
Wikipedia
14
Spreading Activation
Hierarchical Interest Graphs
1
2 3
3 4
Entity Un-rated by user
23
Reverse Spreading
Wikipedia Category Hierarchy
Approach
15
Hierarchical Interest Graphs
1
2 3
3 4
Entity Un-rated by user
23
Reverse Spreading
Approach
Spreading Activation
Wikipedia Category Hierarchy
16
Internet
Semantic Search Linked Data Metadata
Technology
World Wide Web
Semantic Web
Structured Information
0.8 0.2 0.6Scores for Interests
User Interests
0.7
0.5
0.4
0.3
Activation Function
17
• Wikipedia Category Structure is transformed to a hierarchy (Wikipedia Hierarchy)– 0.8 Million Categories with approx 2 Million relationships
between them – Category “Main Topic Classification” is root of the hierarchy
as it subsumes 98% of the categories– Each Category is assigned a Hierarchy level based on
shortest path from root node• Spreading Activation Theory on Wikipedia Hierarchy• Experimented with different Activation Functions
– Bell, Bell Log, Priority Intersect
Hierarchical Interest Graphs
18
Romance Films
Titanic Anna_Karenina Forever Young
Films
Romantic Drama Films
1990s romantic drama films
Structured Information
5 4 4Ratings of
Movies
Movies
4
3
3
1
19
Spreading Activation
Hierarchical Interest Graphs
1
2 3
3 4
Entity Un-rated by user
23
Reverse Spreading
Wikipedia Category Hierarchy
Approach
20
1990s Romantic
Drama Films
Romantic Drama Films
4
3
Function to rank the entities based on category scores
The Remains
of the DayJude
5 4
Ranking Unrated Entities
21
The Terminator
1984 Films
1980s Action Films
Films Directed by James Cameron
Action Horror Films
Robot Films
4
3
2
3
4
Reverse Spreading
• Sum the activation values of each parent of an entity 22
The Terminator
1984 Films
1980s Action Films
Films Directed by James Cameron
Action Horror Films
Robot Films
4
3
2
3
4
16
Summing
Sum
23
• Ranking is biased towards categories having larger number of entities
• Example:– Category “English Language Films” has good
interesting score for user – Results are topped by the entities connected to
“English Language Films” for user
Summing – Drawbacks
24
• Parent category activation value is normalized by it’s out degree• Bias towards generic categories is addressed
The Terminator
1984 Films
Footloose
3
1 1 1
Ghostbusters
Degree Based Normalization
Out Degree = 3Spread Weight =
3/3 = 1
26
• Priority of a category to an entity as edge weight
The Terminator
1984 Films
Footloose
3
1 2 1
3 1.5 3
Ghostbusters
Category Priority
27
Ai = Σ(Aj / (Oj * Pij))
Reverse Spreading Activation Function
Activation Value of the Entity i (Movie)
Activation Value of the Categories of “i”
Outdegree of Category j
Priority of Category j to Entity “i”
28
• Focus on “5” rated movies for test as per Cremonesi et al. – New approach for evaluation of recommendation systems
• Dataset - Movielens– Movies -- 3158– Users -- 1476– 0.9 m ratings
• Baseline – Ziegler et al.– Hybrid(Content & Collaborative) technique to recommend books
using Amazon book taxonomy– We compared with the content based part of Ziegler et al
Evaluation
29
Summing Summing with Degree Normalization
Summing with Degree Normalization and Priority
Decay BellLog Top 1 -- 0.01 Top 20 -- 0.15
Top 1 -- 0.01 Top 20 -- 0.18
Top 1 -- 0.07 Top 20 -- 0.21
Bell Log Intersect Top 1 -- 0.0 Top 20 -- 0.09
Top 1 -- 0.01Top 20 -- 0.20
Top 1 -- 0.07 Top 20 -- 0.24
Ziegler et al. Top 1 -- 0.0Top 20 -- 0.002
Recall @Top 1 – Top 20
30
• While normalization is necessary, we are looking for better features for normalization
• Priority works, however we find the category-entity priority added by users on Wikipedia needs a quality check
• We are looking into collaborative Hierarchical Interest Graphs
Future Work
Paper at http://knoesis.org/library/resource.php?id=2150
Thank you
ContactSiva Kumar ([email protected])Pavan Kapanipathi ([email protected]) @pavankapsAmit Sheth ([email protected]) @amit_p