Information Filtering &
Recommender Systems
(Lecture for CS410 Text Info Systems)
ChengXiang Zhai
Department of Computer Science
University of Illinois, Urbana-Champaign
Short vs. Long Term Info Need
• Short-term information need (Ad hoc retrieval)
– “Temporary need”, e.g., info about fixing a bug
– Information source is relatively static
– User “pulls” information
– Application example: library search, Web search,…
• Long-term information need (Filtering)
– “Stable need”, e.g., your hobby
– Information source is dynamic
– System “pushes” information to user
– Applications: news filter, movie recommender, …
Part I: Content-Based Filtering (Adaptive Information Filtering)
Adaptive Information Filtering
• Stable & long term interest, dynamic info source
• System must make a delivery decision immediately as a document “arrives”
[Diagram: as documents arrive from a stream, the Filtering System matches each one against the user's profile ("my interest: …") and decides whether to deliver it]
A Typical AIF System
[Diagram: docs from a Doc Source flow into a Binary Classifier, which delivers Accepted Docs to the User. The classifier consults a User Interest Profile, built by Initialization from the user's profile text and updated by Learning from user Feedback and Accumulated Docs, guided by a utility function]
Linear Utility = 3 * #good − 2 * #bad
#good (#bad): number of good (bad) documents delivered to the user
Are the coefficients (3, −2) reasonable? What about (10, −1) or (1, −10)?
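To make the question concrete, here is a tiny sketch (the delivery counts and alternative coefficients are illustrative) of how the coefficient choice changes what the linear utility rewards:

```python
def linear_utility(n_good, n_bad, c_good=3, c_bad=2):
    """Linear utility = c_good * #good - c_bad * #bad."""
    return c_good * n_good - c_bad * n_bad

# Delivering 10 good and 5 bad documents under different coefficients:
print(linear_utility(10, 5))            # (3, -2): 20
print(linear_utility(10, 5, 10, 1))     # (10, -1): 95, lenient toward bad docs
print(linear_utility(10, 5, 1, 10))     # (1, -10): -40, harsh toward bad docs
```

Under (1, −10) the system should deliver only when very confident (a high threshold); under (10, −1) it can afford a much lower one.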
Three Basic Problems in AIF
• Making the filtering decision (binary classifier)
– Doc text + profile text → yes/no
• Initialization
– Initialize the filter based on only the profile text or very few examples
• Learning from
– Limited relevance judgments (only on “yes” docs)
– Accumulated documents
• All trying to maximize the utility
Extend a Retrieval System for Information Filtering
• “Reuse” retrieval techniques to score documents
• Use a score threshold for filtering decision
• Learn to improve scoring with traditional feedback
• New approaches to threshold setting and learning
A General Vector-Space Approach
[Diagram: a doc vector is scored against the profile vector; thresholding on the score yields a yes/no decision. Feedback information drives Vector Learning (updating the profile vector) and Threshold Learning, with Utility Evaluation guiding the threshold]
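A minimal sketch of this scoring-plus-thresholding pipeline (the vectors, term weights, and threshold value below are invented for illustration; a real system would use learned weights):

```python
import math

def cosine(doc, profile):
    """Score a document vector against the profile vector."""
    dot = sum(w * profile.get(term, 0.0) for term, w in doc.items())
    norm = (math.sqrt(sum(w * w for w in doc.values()))
            * math.sqrt(sum(w * w for w in profile.values())))
    return dot / norm if norm else 0.0

def deliver(doc, profile, threshold):
    """Thresholding turns the score into a yes/no filtering decision."""
    return cosine(doc, profile) >= threshold

profile = {"recommender": 2.0, "filtering": 1.5}
doc = {"filtering": 1.0, "systems": 0.5}
print(deliver(doc, profile, threshold=0.3))  # True: score ~0.54 clears 0.3
```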
Difficulties in Threshold Learning
[Example: with threshold θ = 30.0, docs scoring 36.5 (Rel), 33.4 (NonRel), and 32.1 (Rel) are delivered and judged; docs scoring 29.9, 27.3, … fall below the threshold, so no judgments are available for them]
• Censored data (judgments only available on delivered documents)
• Little/no labeled data
• Exploration vs. exploitation
Empirical Utility Optimization
• Basic idea
– Compute the utility on the training data for each candidate score threshold
– Choose the threshold that gives the maximum utility on the training data set
• Difficulty: biased training sample!
– We can only get an upper bound for the true optimal threshold
– Could a discarded item possibly have been interesting to the user?
• Solution:
– Heuristic adjustment (lowering) of the threshold
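A sketch of the idea under the linear utility from earlier (the judged scores and the size of the heuristic adjustment are illustrative assumptions):

```python
def best_threshold(judged, c_good=3, c_bad=2, adjustment=0.5):
    """judged: (score, is_relevant) pairs for previously delivered docs.
    Try each cutoff in descending score order, keep the threshold with
    maximum training utility, then lower it heuristically to explore."""
    ranked = sorted(judged, reverse=True)
    best_u, best_t = float("-inf"), None
    for cutoff in range(1, len(ranked) + 1):
        good = sum(1 for _, rel in ranked[:cutoff] if rel)
        bad = cutoff - good
        u = c_good * good - c_bad * bad
        if u > best_u:
            best_u, best_t = u, ranked[cutoff - 1][0]
    return best_t - adjustment  # heuristic lowering of the threshold

judged = [(36.5, True), (33.4, False), (32.1, True), (29.9, False)]
print(best_threshold(judged))  # 31.6: training optimum at 32.1, lowered by 0.5
```

Because the training sample only contains delivered (high-scoring) documents, the learned optimum is biased upward, hence the lowering step.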
Beta-Gamma Threshold Learning
[Plot: utility vs. cutoff position 0, 1, 2, 3, …, K, … (docs in descending order of score), with θ_zero at the zero-utility cutoff and θ_optimal at the utility-maximizing cutoff]
θ = α · θ_zero + (1 − α) · θ_optimal
α = β + (1 − β) · e^(−N·γ), with β, γ ∈ [0, 1]
N = # training examples
Encourage exploration up to θ_zero
The more examples, the less exploration (closer to θ_optimal)
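The interpolation can be sketched directly (the values of β, γ, and the two thresholds below are illustrative, not from the lecture):

```python
import math

def beta_gamma_threshold(theta_zero, theta_optimal, n_examples,
                         beta=0.1, gamma=0.05):
    """Interpolate between the zero-utility and optimal thresholds.
    alpha -> 1 with no training data (start at theta_zero: explore);
    alpha -> beta as N grows (move toward theta_optimal: exploit)."""
    alpha = beta + (1 - beta) * math.exp(-n_examples * gamma)
    return alpha * theta_zero + (1 - alpha) * theta_optimal

print(beta_gamma_threshold(25.0, 32.0, n_examples=0))     # 25.0, full exploration
print(beta_gamma_threshold(25.0, 32.0, n_examples=1000))  # ~31.3, near optimal
```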
Beta-Gamma Threshold Learning (cont.)
• Pros
– Explicitly addresses exploration-exploitation tradeoff (“Safe” exploration)
– Arbitrary utility (with appropriate lower bound)
– Empirically effective
• Cons
– Purely heuristic
– Zero utility lower bound often too conservative
Part II: Collaborative Filtering
What is Collaborative Filtering (CF)?
• Making filtering decisions for an individual user based on the judgments of other users
• Inferring individual’s interest/preferences from that of other similar users
• General idea
– Given a user u, find similar users {u1, …, um}
– Predict u’s preferences based on the preferences of u1, …, um
CF: Assumptions
• Users with a common interest will have similar preferences
• Users with similar preferences probably share the same interest
• Examples– “interest is IR” => “favor SIGIR papers”
– “favor SIGIR papers” => “interest is IR”
• A sufficiently large number of user preferences is available
CF: Intuitions
• User similarity (Kevin Chang vs. Jiawei Han)
– If Kevin liked the paper, Jiawei will like the paper
– ? If Kevin liked the movie, Jiawei will like the movie
– Suppose Kevin and Jiawei viewed similar movies in the past six months …
• Item similarity
– Since 90% of those who liked Star Wars also liked Independence Day, and you liked Star Wars,
– you may also like Independence Day
The content of items “didn’t matter”!
The Collaborative Filtering Problem
[Ratings matrix: rows are users u1, u2, …, ui, …, um (Users: U); columns are objects o1, o2, …, oj, …, on (Objects: O); a few entries hold known ratings (e.g., 3, 1.5, 2), while most entries X_ij = f(u_i, o_j) = ? are unknown]
The task
• Unknown function f: U × O → R
• Assume known f values for some (u, o)’s
• Predict f values for other (u, o)’s
• Essentially function approximation, like other learning problems
Memory-based Approaches
• General ideas:
– Xij: rating of object oj by user ui
– ni: average rating of all objects by user ui
– Normalized ratings: Vij = Xij – ni
– Memory-based prediction of rating of object oj by user ua
• Specific approaches differ in w(a,i) -- the distance/similarity between user ua and ui
v̂_aj = k · Σ_{i=1..m} w(a, i) · v_ij,  x̂_aj = v̂_aj + n_a,  where k = 1 / Σ_{i=1..m} w(a, i)
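The prediction rule can be sketched as follows (the tiny ratings table and the similarity weights w(a, i) are invented; computing w(a, i) is the next slide's topic, and this sketch normalizes by Σ|w(a, i)|, a common choice that also handles negative correlations):

```python
def predict_rating(ratings, means, weights, a, item):
    """x_hat_aj = n_a + k * sum_i w(a, i) * (x_ij - n_i),
    with k = 1 / sum_i |w(a, i)| over neighbors who rated the item."""
    raters = [i for i in weights if item in ratings[i]]
    num = sum(weights[i] * (ratings[i][item] - means[i]) for i in raters)
    den = sum(abs(weights[i]) for i in raters)
    return means[a] + (num / den if den else 0.0)

ratings = {"u1": {"o1": 4.0}, "u2": {"o1": 2.0}, "ua": {}}
means = {"u1": 3.0, "u2": 3.0, "ua": 3.5}       # n_i: each user's mean rating
weights = {"u1": 0.9, "u2": 0.1}                # w(a, i) for active user "ua"
print(predict_rating(ratings, means, weights, "ua", "o1"))  # 4.3
```

Normalizing each user's ratings by their own mean compensates for users who rate everything high or low before their deviations are combined.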
User Similarity Measures
• Pearson correlation coefficient (sum over commonly rated items j):
w_p(a, i) = Σ_j (x_aj − n_a)(x_ij − n_i) / √( Σ_j (x_aj − n_a)² · Σ_j (x_ij − n_i)² )
• Cosine measure:
w_c(a, i) = Σ_{j=1..n} x_aj · x_ij / √( Σ_{j=1..n} x_aj² · Σ_{j=1..n} x_ij² )
• Many other possibilities!
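Both measures can be sketched over per-user rating dictionaries (the sample ratings are made up; the Pearson sums run only over commonly rated items, implemented here via a set intersection):

```python
import math

def pearson(xa, xi, na, ni):
    """w_p(a, i): correlation of mean-centered ratings on common items."""
    common = set(xa) & set(xi)
    num = sum((xa[j] - na) * (xi[j] - ni) for j in common)
    den = (math.sqrt(sum((xa[j] - na) ** 2 for j in common))
           * math.sqrt(sum((xi[j] - ni) ** 2 for j in common)))
    return num / den if den else 0.0

def cosine_sim(xa, xi):
    """w_c(a, i): treat the users' rating rows as vectors."""
    num = sum(xa[j] * xi[j] for j in set(xa) & set(xi))
    den = (math.sqrt(sum(v ** 2 for v in xa.values()))
           * math.sqrt(sum(v ** 2 for v in xi.values())))
    return num / den if den else 0.0

xa = {"o1": 4.0, "o2": 2.0}   # user a's ratings
xi = {"o1": 5.0, "o2": 1.0}   # user i's ratings
print(pearson(xa, xi, na=3.0, ni=3.0))  # ~1.0: same pattern after mean-centering
print(cosine_sim(xa, xi))
```

Note the difference: the raw ratings disagree in magnitude, so the cosine is below 1, but after subtracting each user's mean the deviation patterns match exactly and Pearson reports perfect agreement.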
Improving User Similarity Measures
• Dealing with missing values: set to default ratings (e.g., average ratings)
• Inverse User Frequency (IUF): similar to IDF
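A sketch of the IDF analogy (the counts are illustrative): down-weight items that nearly everyone has rated, since agreeing on a universally rated item says little about how similar two users are.

```python
import math

def iuf(total_users, users_who_rated):
    """Inverse User Frequency, analogous to IDF: log(N / n_j)."""
    return math.log(total_users / users_who_rated)

print(iuf(1000, 1000))  # 0.0: an item everyone rated carries no weight
print(iuf(1000, 10))    # ~4.6: a rarely rated item counts heavily
```

Each x_aj can then be multiplied by iuf(j) before the similarity is computed, so agreement on obscure items dominates the weight.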
Summary
• Filtering is “easy”
– The user’s expectation is low
– Any recommendation is better than none
– Making it practically useful
• Filtering is “hard”
– Must make a binary decision, though ranking is also possible
– Data sparseness (limited feedback information)
– “Cold start” (little information about users at the beginning)
What you should know
• Filtering, retrieval, and browsing are three basic ways of accessing information
• How to extend a retrieval system to do adaptive filtering (adding threshold learning)
• What is exploration-exploitation tradeoff
• Know how memory-based collaborative filtering works