Information Filtering &
Recommender Systems
(Lecture for CS410 Text Info Systems)
ChengXiang Zhai
Department of Computer Science
University of Illinois, Urbana-Champaign
Short vs. Long Term Info Need
• Short-term information need (Ad hoc retrieval)
– “Temporary need”, e.g., info about fixing a bug
– Information source is relatively static
– User “pulls” information
– Application example: library search, Web search,…
• Long-term information need (Filtering)
– “Stable need”, e.g., your hobby
– Information source is dynamic
– System “pushes” information to user
– Applications: news filter, movie recommender, …
Part I: Content-Based Filtering (Adaptive Information Filtering)
Adaptive Information Filtering
• Stable & long term interest, dynamic info source
• System must make a delivery decision immediately as a document “arrives”
[Diagram: as documents arrive from a stream, the Filtering System matches each one against the user's profile ("my interest: …") and decides whether to deliver it]
A Typical AIF System
[Diagram: docs from a Doc Source flow into a Binary Classifier, which delivers Accepted Docs to the User. The classifier consults a User Interest Profile, built by Initialization from the user's profile text and updated by Learning from user Feedback and Accumulated Docs, guided by a utility function]
Linear Utility = 3 * #good − 2 * #bad
#good (#bad): number of good (bad) documents delivered to the user
Are the coefficients (3, −2) reasonable? What about (10, −1) or (1, −10)?
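To make the question concrete, here is a tiny sketch (the delivery counts and alternative coefficients are illustrative) of how the coefficient choice changes what the linear utility rewards:

```python
def linear_utility(n_good, n_bad, c_good=3, c_bad=2):
    """Linear utility = c_good * #good - c_bad * #bad."""
    return c_good * n_good - c_bad * n_bad

# Delivering 10 good and 5 bad documents under different coefficients:
print(linear_utility(10, 5))            # (3, -2): 20
print(linear_utility(10, 5, 10, 1))     # (10, -1): 95, lenient toward bad docs
print(linear_utility(10, 5, 1, 10))     # (1, -10): -40, harsh toward bad docs
```

Under (1, −10) the system should deliver only when very confident (a high threshold); under (10, −1) it can afford a much lower one.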
Three Basic Problems in AIF
• Making the filtering decision (binary classifier)
– Doc text + profile text → yes/no
• Initialization
– Initialize the filter based on only the profile text or very few examples
• Learning from
– Limited relevance judgments (only on “yes” docs)
– Accumulated documents
• All trying to maximize the utility
Extend a Retrieval System for Information Filtering
• “Reuse” retrieval techniques to score documents
• Use a score threshold for filtering decision
• Learn to improve scoring with traditional feedback
• New approaches to threshold setting and learning
A General Vector-Space Approach
[Diagram: a doc vector is scored against the profile vector; thresholding on the score yields a yes/no decision. Feedback information drives Vector Learning (updating the profile vector) and Threshold Learning, with Utility Evaluation guiding the threshold]
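A minimal sketch of this scoring-plus-thresholding pipeline (the vectors, term weights, and threshold value below are invented for illustration; a real system would use learned weights):

```python
import math

def cosine(doc, profile):
    """Score a document vector against the profile vector."""
    dot = sum(w * profile.get(term, 0.0) for term, w in doc.items())
    norm = (math.sqrt(sum(w * w for w in doc.values()))
            * math.sqrt(sum(w * w for w in profile.values())))
    return dot / norm if norm else 0.0

def deliver(doc, profile, threshold):
    """Thresholding turns the score into a yes/no filtering decision."""
    return cosine(doc, profile) >= threshold

profile = {"recommender": 2.0, "filtering": 1.5}
doc = {"filtering": 1.0, "systems": 0.5}
print(deliver(doc, profile, threshold=0.3))  # True: score ~0.54 clears 0.3
```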
Difficulties in Threshold Learning
[Example: with threshold θ = 30.0, docs scoring 36.5 (Rel), 33.4 (NonRel), and 32.1 (Rel) are delivered and judged; docs scoring 29.9, 27.3, … fall below the threshold, so no judgments are available for them]
• Censored data (judgments only available on delivered documents)
• Little/no labeled data
• Exploration vs. exploitation
Empirical Utility Optimization
• Basic idea
– Compute the utility on the training data for each candidate score threshold
– Choose the threshold that gives the maximum utility on the training data set
• Difficulty: biased training sample!
– We can only get an upper bound for the true optimal threshold
– Could a discarded item possibly have been interesting to the user?
• Solution:
– Heuristic adjustment (lowering) of the threshold
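A sketch of the idea under the linear utility from earlier (the judged scores and the size of the heuristic adjustment are illustrative assumptions):

```python
def best_threshold(judged, c_good=3, c_bad=2, adjustment=0.5):
    """judged: (score, is_relevant) pairs for previously delivered docs.
    Try each cutoff in descending score order, keep the threshold with
    maximum training utility, then lower it heuristically to explore."""
    ranked = sorted(judged, reverse=True)
    best_u, best_t = float("-inf"), None
    for cutoff in range(1, len(ranked) + 1):
        good = sum(1 for _, rel in ranked[:cutoff] if rel)
        bad = cutoff - good
        u = c_good * good - c_bad * bad
        if u > best_u:
            best_u, best_t = u, ranked[cutoff - 1][0]
    return best_t - adjustment  # heuristic lowering of the threshold

judged = [(36.5, True), (33.4, False), (32.1, True), (29.9, False)]
print(best_threshold(judged))  # 31.6: training optimum at 32.1, lowered by 0.5
```

Because the training sample only contains delivered (high-scoring) documents, the learned optimum is biased upward, hence the lowering step.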
Beta-Gamma Threshold Learning
[Plot: utility vs. cutoff position 0, 1, 2, 3, …, K, … (docs in descending order of score), with θ_zero at the zero-utility cutoff and θ_optimal at the utility-maximizing cutoff]
θ = α · θ_zero + (1 − α) · θ_optimal
α = β + (1 − β) · e^(−N·γ), with β, γ ∈ [0, 1]
N = # training examples
Encourage exploration up to θ_zero
The more examples, the less exploration (closer to θ_optimal)
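The interpolation can be sketched directly (the values of β, γ, and the two thresholds below are illustrative, not from the lecture):

```python
import math

def beta_gamma_threshold(theta_zero, theta_optimal, n_examples,
                         beta=0.1, gamma=0.05):
    """Interpolate between the zero-utility and optimal thresholds.
    alpha -> 1 with no training data (start at theta_zero: explore);
    alpha -> beta as N grows (move toward theta_optimal: exploit)."""
    alpha = beta + (1 - beta) * math.exp(-n_examples * gamma)
    return alpha * theta_zero + (1 - alpha) * theta_optimal

print(beta_gamma_threshold(25.0, 32.0, n_examples=0))     # 25.0, full exploration
print(beta_gamma_threshold(25.0, 32.0, n_examples=1000))  # ~31.3, near optimal
```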
Beta-Gamma Threshold Learning (cont.)
• Pros
– Explicitly addresses exploration-exploitation tradeoff (“Safe” exploration)
– Arbitrary utility (with appropriate lower bound)
– Empirically effective
• Cons
– Purely heuristic
– Zero utility lower bound often too conservative
Part II: Collaborative Filtering
What is Collaborative Filtering (CF)?
• Making filtering decisions for an individual user based on the judgments of other users
• Inferring individual’s interest/preferences from that of other similar users
• General idea
– Given a user u, find similar users {u1, …, um}
– Predict u’s preferences based on the preferences of u1, …, um
CF: Assumptions
• Users with a common interest will have similar preferences
• Users with similar preferences probably share the same interest
• Examples– “interest is IR” => “favor SIGIR papers”
– “favor SIGIR papers” => “interest is IR”
• A sufficiently large number of user preferences is available
CF: Intuitions
• User similarity (Kevin Chang vs. Jiawei Han)
– If Kevin liked the paper, Jiawei will like the paper
– ? If Kevin liked the movie, Jiawei will like the movie
– Suppose Kevin and Jiawei viewed similar movies in the past six months …
• Item similarity
– Since 90% of those who liked Star Wars also liked Independence Day, and you liked Star Wars,
– you may also like Independence Day
The content of items “didn’t matter”!
The Collaborative Filtering Problem
[Ratings matrix: rows are users u1, u2, …, ui, …, um (Users: U); columns are objects o1, o2, …, oj, …, on (Objects: O); a few entries hold known ratings (e.g., 3, 1.5, 2), while most entries X_ij = f(u_i, o_j) = ? are unknown]
The task
• Unknown function f: U × O → R
• Assume known f values for some (u, o)’s
• Predict f values for other (u, o)’s
• Essentially function approximation, like other learning problems
Memory-based Approaches
• General ideas:
– Xij: rating of object oj by user ui
– ni: average rating of all objects by user ui
– Normalized ratings: Vij = Xij – ni
– Memory-based prediction of rating of object oj by user ua
• Specific approaches differ in w(a,i) -- the distance/similarity between user ua and ui
v̂_aj = k · Σ_{i=1..m} w(a, i) · v_ij,  x̂_aj = v̂_aj + n_a,  where k = 1 / Σ_{i=1..m} w(a, i)
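The prediction rule can be sketched as follows (the tiny ratings table and the similarity weights w(a, i) are invented; computing w(a, i) is the next slide's topic, and this sketch normalizes by Σ|w(a, i)|, a common choice that also handles negative correlations):

```python
def predict_rating(ratings, means, weights, a, item):
    """x_hat_aj = n_a + k * sum_i w(a, i) * (x_ij - n_i),
    with k = 1 / sum_i |w(a, i)| over neighbors who rated the item."""
    raters = [i for i in weights if item in ratings[i]]
    num = sum(weights[i] * (ratings[i][item] - means[i]) for i in raters)
    den = sum(abs(weights[i]) for i in raters)
    return means[a] + (num / den if den else 0.0)

ratings = {"u1": {"o1": 4.0}, "u2": {"o1": 2.0}, "ua": {}}
means = {"u1": 3.0, "u2": 3.0, "ua": 3.5}       # n_i: each user's mean rating
weights = {"u1": 0.9, "u2": 0.1}                # w(a, i) for active user "ua"
print(predict_rating(ratings, means, weights, "ua", "o1"))  # 4.3
```

Normalizing each user's ratings by their own mean compensates for users who rate everything high or low before their deviations are combined.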
User Similarity Measures
• Pearson correlation coefficient (sum over commonly rated items j):
w_p(a, i) = Σ_j (x_aj − n_a)(x_ij − n_i) / √( Σ_j (x_aj − n_a)² · Σ_j (x_ij − n_i)² )
• Cosine measure:
w_c(a, i) = Σ_{j=1..n} x_aj · x_ij / √( Σ_{j=1..n} x_aj² · Σ_{j=1..n} x_ij² )
• Many other possibilities!
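Both measures can be sketched over per-user rating dictionaries (the sample ratings are made up; the Pearson sums run only over commonly rated items, implemented here via a set intersection):

```python
import math

def pearson(xa, xi, na, ni):
    """w_p(a, i): correlation of mean-centered ratings on common items."""
    common = set(xa) & set(xi)
    num = sum((xa[j] - na) * (xi[j] - ni) for j in common)
    den = (math.sqrt(sum((xa[j] - na) ** 2 for j in common))
           * math.sqrt(sum((xi[j] - ni) ** 2 for j in common)))
    return num / den if den else 0.0

def cosine_sim(xa, xi):
    """w_c(a, i): treat the users' rating rows as vectors."""
    num = sum(xa[j] * xi[j] for j in set(xa) & set(xi))
    den = (math.sqrt(sum(v ** 2 for v in xa.values()))
           * math.sqrt(sum(v ** 2 for v in xi.values())))
    return num / den if den else 0.0

xa = {"o1": 4.0, "o2": 2.0}   # user a's ratings
xi = {"o1": 5.0, "o2": 1.0}   # user i's ratings
print(pearson(xa, xi, na=3.0, ni=3.0))  # ~1.0: same pattern after mean-centering
print(cosine_sim(xa, xi))
```

Note the difference: the raw ratings disagree in magnitude, so the cosine is below 1, but after subtracting each user's mean the deviation patterns match exactly and Pearson reports perfect agreement.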
Improving User Similarity Measures
• Dealing with missing values: set to default ratings (e.g., average ratings)
• Inverse User Frequency (IUF): similar to IDF
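A sketch of the IDF analogy (the counts are illustrative): down-weight items that nearly everyone has rated, since agreeing on a universally rated item says little about how similar two users are.

```python
import math

def iuf(total_users, users_who_rated):
    """Inverse User Frequency, analogous to IDF: log(N / n_j)."""
    return math.log(total_users / users_who_rated)

print(iuf(1000, 1000))  # 0.0: an item everyone rated carries no weight
print(iuf(1000, 10))    # ~4.6: a rarely rated item counts heavily
```

Each x_aj can then be multiplied by iuf(j) before the similarity is computed, so agreement on obscure items dominates the weight.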
Summary
• Filtering is “easy”
– The user’s expectation is low
– Any recommendation is better than none
– Making it practically useful
• Filtering is “hard”
– Must make a binary decision, though ranking is also possible
– Data sparseness (limited feedback information)
– “Cold start” (little information about users at the beginning)
What you should know
• Filtering, retrieval, and browsing are three basic ways of accessing information
• How to extend a retrieval system to do adaptive filtering (adding threshold learning)
• What is exploration-exploitation tradeoff
• Know how memory-based collaborative filtering works