Tag & Tag-based Recommenders
IBM Research – China
Presenter: Xiatian Zhang (张夏天)
Team:
Shiwan Zhao (赵石顽), Xiatian Zhang (张夏天), Quan Yuan (袁泉)
About Me
2000-2004, B.S. Math, Central South University
2004-2007, M.S. Computer Science, BUPT
2007-Present, Researcher, Working on Recommender Systems and Data Mining
Agenda
Social Tagging System and Its Features
Tag Recommender
Tag-based Recommender
Social Tagging
A folksonomy is a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content; this practice is also known as collaborative tagging, social classification, social indexing, and social tagging. Folksonomy is a portmanteau of folk and taxonomy.
Social tagging took off in 2004, with the wave of Web 2.0.
– Delicious
– CiteULike
– BibSonomy
– YouTube
– Flickr
– Dogear – an internal social bookmarking system at IBM
– …
Some Insights of Tagging System
Shilad Sen et al., "tagging, communities, vocabulary, evolution", CSCW '06
– Modeling vocabulary evolution
– Tagging system features
– Based on Movielens recommender system
– Personal tendency and community influence
– Tag displaying strategies and their effects
– Tag utility
Modeling vocabulary evolution
Tagging System Features
Design Features
– Tag Sharing
– Tag Selection
– Item Ownership
– Tag Scope
  – Broad
  – Narrow
Tag Class
– Factual Tag
– Subjective Tag
– Personal Tag
Tagging System in Movielens
Personal Tendency
How strongly do investment and habit affect personal tagging behavior?
– 1. Habit and investment influence users' tag applications.
– 2. The influence of habit and investment grows stronger as users apply more tags.
– 3. Habit and investment cannot be the only factors that contribute to vocabulary evolution.
Community Influence
How does the tagging community influence personal vocabulary?
– 1. Community influence affects a user’s personal vocabulary.
– 2. Community influence on a user’s first tag is stronger for users who have seen more tags.
Tag Displaying Strategies Effects
Tag Utility
Tag Recommender
Purpose
– Encourage users to tag more frequently, apply more tags to an individual resource, and reuse common tags.
– Lead users to tags they had not previously considered.
– Eliminate redundant tags.
– Promote a core tag vocabulary, steering the user toward adopting certain tags while not imposing any strict rules.
– Avoid ambiguous tags in favor of tags that offer greater information value.
Tag Recommender – Technologies
Naive Methods
– Most Popular Tags on Resources
– Most Popular Tags on Users
– Most Popular Tags on Resources and Users
Classical Collaborative Filtering
– User-KNN
– Item-KNN
Adapted KNN Methods
– Extend User-Item Matrix
– Degrade User-Item-Tag Relationship
Content-based Method
Tensor Method
– Tensor Factorization
Graph Based
– FolkRank
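The naive baselines listed above can be sketched in a few lines. This is only an illustration under assumptions: posts are taken to be (user, item, tag) triples, the combined variant simply adds the two counts, and `user_weight` is a hypothetical knob that does not come from the slides.

```python
from collections import Counter, defaultdict

def build_counts(posts):
    """Count how often each tag was applied per item and per user."""
    by_item = defaultdict(Counter)
    by_user = defaultdict(Counter)
    for user, item, tag in posts:
        by_item[item][tag] += 1
        by_user[user][tag] += 1
    return by_item, by_user

def recommend_tags(by_item, by_user, user, item, k=3, user_weight=1.0):
    """Most popular tags on the item plus (weighted) most popular tags of the user.

    user_weight=0 gives "most popular tags on resources" only.
    """
    scores = Counter()
    for tag, n in by_item.get(item, {}).items():
        scores[tag] += n
    for tag, n in by_user.get(user, {}).items():
        scores[tag] += user_weight * n
    return [tag for tag, _ in scores.most_common(k)]

# Toy data, invented for illustration.
posts = [
    ("alice", "i1", "photo"), ("bob", "i1", "photo"), ("dave", "i1", "photo"),
    ("bob", "i1", "art"), ("carol", "i2", "code"), ("carol", "i3", "code"),
]
by_item, by_user = build_counts(posts)
print(recommend_tags(by_item, by_user, "carol", "i1", k=2))
```

Setting `user_weight` to 0 reduces the function to the resource-only baseline, which makes it easy to compare the three variants on the same data.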
Our Work
Adapted KNN – Extend UI Matrix
Adapted KNN – Degrade User-Item-Tag relationship
Process
– TF/IDF on UI, UT, IT
– P-Core Processing
  – Remove noisy data
– Extract User Model by Hebbian Deflation
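The P-Core step above can be sketched as follows. The slide does not define it, so this assumes the usual meaning in the tag recommendation literature: keep only data in which every user, item, and tag occurs at least p times. As a simplification, each (user, item, tag) triple is counted as one occurrence.

```python
from collections import Counter

def p_core(posts, p):
    """Iteratively drop triples until every remaining user, item, and tag
    occurs in at least p triples (assumed definition, see above)."""
    posts = set(posts)
    while True:
        users = Counter(u for u, _, _ in posts)
        items = Counter(i for _, i, _ in posts)
        tags = Counter(t for _, _, t in posts)
        kept = {
            (u, i, t) for u, i, t in posts
            if users[u] >= p and items[i] >= p and tags[t] >= p
        }
        if kept == posts:      # fixed point reached: nothing left to remove
            return kept
        posts = kept

# Toy data, invented for illustration.
posts = [
    ("a", "i1", "t1"), ("a", "i2", "t1"),
    ("b", "i1", "t1"), ("b", "i2", "t1"),
    ("c", "i3", "t2"),   # user, item, and tag each occur only once
]
core = p_core(posts, 2)
print(sorted(core))
```

The loop is needed because removing one sparse triple can push another user, item, or tag below the threshold.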
Tensor Factorization
FolkRank
PageRank
\[ PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \tag{1} \]
Personalized PageRank
\[ PR(p_i) = (1-d)\, p[i] + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \tag{2} \]
FolkRank
1. Compute global PageRank by (1).
2. Then, for each <user, item> pair, compute personalized PageRank by (2).
– p[i] = 1, but p[u] = 1 + |U| and p[r] = 1 + |R|.
3. FolkRank = Personalized PageRank - PageRank
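The three FolkRank steps can be sketched on a toy folksonomy. This is a minimal illustration, not the original implementation, and it rests on several assumptions: the graph is an undirected user-item-tag graph built from (user, item, tag) triples, the preference vector is normalised to sum to 1 so that a uniform vector reproduces (1), and the damping factor 0.85 and the fixed iteration count are arbitrary choices.

```python
from collections import defaultdict

def build_graph(posts):
    """Undirected tripartite graph over users ('u'), items ('r'), and tags ('t')."""
    adj = defaultdict(set)
    for user, item, tag in posts:
        u, r, t = ("u", user), ("r", item), ("t", tag)
        for a, b in ((u, r), (u, t), (r, t)):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def pagerank(adj, pref, d=0.85, iters=100):
    """Iterate PR(i) = (1 - d) * pref[i] + d * sum over neighbours j of PR(j) / L(j)."""
    total = sum(pref[n] for n in adj)
    pref = {n: pref[n] / total for n in adj}   # assumption: normalise to sum 1
    rank = {n: 1.0 / len(adj) for n in adj}
    for _ in range(iters):
        rank = {
            n: (1 - d) * pref[n] + d * sum(rank[m] / len(adj[m]) for m in adj[n])
            for n in adj
        }
    return rank

def folkrank(posts, user, item, d=0.85):
    """Rank tags by personalized PageRank minus global PageRank."""
    adj = build_graph(posts)
    n_users = sum(1 for kind, _ in adj if kind == "u")
    n_items = sum(1 for kind, _ in adj if kind == "r")
    uniform = {n: 1.0 for n in adj}
    pref = dict(uniform)
    pref[("u", user)] = 1 + n_users            # p[u] = 1 + |U|
    pref[("r", item)] = 1 + n_items            # p[r] = 1 + |R|
    base = pagerank(adj, uniform, d)           # step 1
    personal = pagerank(adj, pref, d)          # step 2
    scores = {n[1]: personal[n] - base[n] for n in adj if n[0] == "t"}  # step 3
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data, invented for illustration.
posts = [
    ("alice", "i1", "photo"), ("alice", "i2", "art"),
    ("bob", "i1", "photo"), ("bob", "i3", "code"), ("carol", "i3", "code"),
]
for tag, score in folkrank(posts, "alice", "i1"):
    print(tag, round(score, 4))
```

Subtracting the global ranking removes the advantage that globally popular tags would otherwise have for every <user, item> pair.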
Our Work
Explored and Exploring Methods
– Non-classical Tensor Fusion Factorization
– Multi-label Classification by Random Decision Trees, High Speed
– The performance of both methods is close to that of FolkRank
Current Progress
– Shiwan developed a simple graph model
– Best precision and recall on several datasets compared to other methods
– We are writing a paper targeting ACM RecSys 2010
Tag-based Recommender
Our Work
– IUI 2008 Paper, Improved Recommendation based on Collaborative Tagging Behaviors
– Explored Methods
  – Tensor Factorization
  – Non-classical Tensor and Matrix Fusion Factorization
Other Works
– Shilad Sen, Jesse Vig, and John Riedl, Tagommenders: Connecting Users to Items through Tags, WWW 2009
IUI 2008 Paper Overview
We propose a new collaborative filtering approach, TBCF (Tag-based Collaborative Filtering), which uses the semantic distance among tags assigned by different users to improve the effectiveness of neighbor selection.
That is, two users are considered similar not only if they rate items similarly, but also if they perceive those items similarly.
Example
– Both Bob and Tom may rate the movie Avatar with 5 stars, which indicates they both like this movie very much.
– Nevertheless, as a 3D fan, Bob appreciates this movie for its high-quality 3D animations, while Tom may think that it is a wonderful action movie.
Tag-based Collaborative Filtering
1. Calculate the semantic similarity of tags based on WordNet (for tags not included in WordNet, calculate the edit distance instead).
2. Calculate the similarity between tag sets.
3. Calculate the similarity between users u and v by summing up the similarity of tag sets on common pages (tagged by both u and v).
4. Find the top-N nearest neighbors of the active user to make the prediction.
5. Return the top-M predicted items to the active user.
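Steps 1 to 4 above can be sketched with standard-library Python. Several parts are stand-ins rather than the method from the paper: tag similarity uses only the edit-distance fallback (no WordNet), tag set similarity is an average best-match score rather than an optimal matching, and the prediction in steps 4 and 5 is left out. The toy data are the tags from the tag-based user-item matrix in this deck, with empty cells omitted.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def tag_sim(a, b):
    """Step 1 (fallback only): similarity in [0, 1] from normalised edit distance."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

def tag_set_sim(s, t):
    """Step 2 (simplified): symmetric average of best-match similarities."""
    if not s or not t:
        return 0.0
    fwd = sum(max(tag_sim(x, y) for y in t) for x in s) / len(s)
    bwd = sum(max(tag_sim(x, y) for x in s) for y in t) / len(t)
    return (fwd + bwd) / 2

def user_sim(tags_u, tags_v):
    """Step 3: sum tag set similarity over items tagged by both users."""
    common = tags_u.keys() & tags_v.keys()
    return sum(tag_set_sim(tags_u[i], tags_v[i]) for i in common)

def nearest_neighbors(matrix, active, n):
    """Step 4 (neighbour selection only): top-n most similar users."""
    scored = [(user_sim(matrix[active], tags), name)
              for name, tags in matrix.items() if name != active]
    return [name for _, name in sorted(scored, reverse=True)[:n]]

matrix = {
    "Alice": {"Item1": {"Art", "photo"}, "Item2": {"Home", "Products"},
              "Item3": {"Writing", "Design"}, "Item4": {"Learning", "Education"}},
    "Daniel": {"Item1": {"Photo", "Album", "Image"}, "Item3": {"Typewriter"},
               "Item4": {"Tutorial", "Training"}},
    "Sherry": {"Item2": {"Cleaning"}, "Item4": {"Language", "Study"}},
    "Maggie": {"Item1": {"Photography"}, "Item3": {"Ovens"}},
}
print(nearest_neighbors(matrix, "Alice", 1))  # Daniel shares the most similar tags with Alice
```

With edit distance alone, "photo" and "image" look unrelated, which is the gap the WordNet-based similarity in step 1 is meant to close.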
Tag-based User-Item Matrix
User     Item1                 Item2            Item3            Item4
Alice    Art, photo            Home, Products   Writing, Design  Learning, Education
Daniel   Photo, Album, Image   Ø                Typewriter       Tutorial, Training
Sherry   Ø                     Cleaning         Ø                Language, Study
Maggie   Photography           Ø                Ovens            Ø
Steps
Tag Similarity Calculation
Tag similarity
– WordNet
– LSA/PLSA
Tag set similarity
– Hungarian method
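The Hungarian method solves an assignment problem: pair up the tags of two sets so that the total similarity of the pairs is as large as possible. The sketch below finds the same optimum by brute force over permutations, which is only practical for small tag sets. Dividing by the size of the larger set is an assumption, since the slide does not say how the score is normalised, and the pairwise similarities in `toy` are invented.

```python
from itertools import permutations

def best_assignment_similarity(set_a, set_b, sim):
    """Maximum-weight one-to-one matching between two tag sets (brute force)."""
    small, large = sorted((list(set_a), list(set_b)), key=len)
    if not small:
        return 0.0
    best = max(
        sum(sim(x, y) for x, y in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    return best / len(large)   # assumption: unmatched tags lower the score

# Hypothetical pairwise similarities, for illustration only.
toy = {
    frozenset(("photo", "image")): 0.8,
    frozenset(("photo", "album")): 0.6,
    frozenset(("art", "image")): 0.7,
    frozenset(("art", "album")): 0.1,
}

def toy_sim(x, y):
    return 1.0 if x == y else toy.get(frozenset((x, y)), 0.0)

score = best_assignment_similarity({"art", "photo"}, {"image", "album"}, toy_sim)
print(round(score, 2))  # 0.65: photo-album plus art-image beats the greedy photo-image pairing
```

A greedy matcher would take photo-image (0.8) first and be left with art-album (0.1), so the optimal assignment matters even on tiny sets.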
WordNet Concept Tree
If x and y are contained in WordNet, dis(x,y) is the shortest path length between x and y.
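The distance dis(x, y) can be illustrated with a breadth-first search over a concept tree. The tree below is a hypothetical toy hierarchy, not real WordNet data; a real system would query WordNet instead.

```python
from collections import deque

def shortest_path_length(edges, x, y):
    """Number of edges on the shortest path between x and y, or None if
    either word is missing from the tree or no path exists."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    if x not in adj or y not in adj:
        return None
    seen = {x}
    queue = deque([(x, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == y:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical concept tree (parent, child), for illustration only.
edges = [
    ("entity", "animal"), ("entity", "plant"),
    ("animal", "dog"), ("animal", "cat"), ("plant", "tree"),
]
print(shortest_path_length(edges, "dog", "cat"))    # 2, via "animal"
print(shortest_path_length(edges, "dog", "tree"))   # 4, via "animal", "entity", "plant"
```

A `None` result marks a word outside the tree, the case in which the approach described earlier falls back to edit distance.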
Word similarity in WordNet
Experimental Evaluation
Randomly generated subset   Average Precision (TBCF)   Average Precision (cosine)
500                         0.208                      0.121
2000                        0.182                      0.118
4000                        0.202                      0.173
6000                        0.209                      0.180

Algorithm   Average Precision   Average Ranking
TBCF        0.27                2.8
cosine      0.13                1.5
Data Set: 8000 users, 5315 pages, and 7670 tags in total, extracted from web logs.
Tagommenders: Connecting Users to Items through Tags
Q & A