Is593-Lecture04 Recommendation Systems


  • 7/30/2019 Is593-Lecture04 Recommendation Systems


    King Saud University

    College of Computer & Information Sciences

    IS 593 Selected Topics in E-Commerce

    Lecture 4: Recommendation Systems

    Dr. Mourad YKHLEF

    The slides' content is derived and adapted from many references


    Content

      Introduction
      Collaborative Filtering
      Content-based Filtering
      Other directions


    Definition

    A system that predicts a user's rating of, or preference for, an item.

    Helps people discover interesting or informative items that they
    wouldn't have thought to search for.


    Recommender Systems

    E-Commerce: Amazon.com; eBay; Levi's; Ski-europe.com

    News: Digg; GroupLens

    File sharing: Credence

    Movie: Netflix.com; Moviefinder.com


    Recommender Algorithms

    We categorize the existing recommender algorithms into:

      Content-based Recommendation
      Collaborative Filtering Recommendation
        Heuristic-based (Memory-based)
        Model-based
      Knowledge-based Recommendation
      Hybrid Recommendation


    Content

      Introduction
      Collaborative Filtering
      Content-based Filtering
      Other directions


    What is collaborative filtering (CF)?

    The most popular technique for recommender systems:
    making filtering decisions for an individual user based on the
    judgments of other users.

    General idea:

      Given an active user u, find friends {u1, …, um}

      Recommend new items to the active user based on the
      opinions of his/her friends


    Collaborative Filtering Matrix

    A user-item rating matrix for users Jack, Oliver, Susan, and Bob.
    Each row lists the known ratings for an item; the remaining cells are
    missing and must be predicted (the slide shows the predicted values
    2, 4, 3.5, and 1.5 for them):

      Item 1: 5, 2, 4
      Item 2: 4, 2, 2
      Item 3: 4, 2
      Item 4: 2, 4
      Item 5: 5, 2, 4


    CF: Intuitions

    User similarity

      Suppose Jamie and I liked similar items in the past six months.
      If Jamie liked Item 4, I will also like it.

    Item similarity

      Since 90% of those who liked Item 4 also liked Item 7,
      and you liked Item 4, you may also like Item 7.


    Short History of CF

      1992-1994  GroupLens project (Usenet news)
      1995       Ringo: music (later Firefly, purchased by Microsoft);
                 Bellcore: Video Recommender
      1996       Recommender Systems Workshop at SIGIR
      After 1996 Substantial integration with machine learning and
                 information filtering; increasing commercial application
      After 2000 Web recommender systems


    Commercial applications: Amazon.com

    Input: one artist/author name


    Commercial applications: Amazon.com

    Output: list of recommendations

      Explore / refine recommendations
      Search using recommendations


    Commercial applications: Sleeper

    Input: ratings of 10 books for all users

      Use of a continuous rating bar


    Collaborative Filtering

      Non-probabilistic algorithms
        User-based nearest neighbor
        Item-based nearest neighbor
        Reducing dimensionality

      Probabilistic algorithms
        Bayesian-network models
        EM algorithm


    User-based Collaborative Filtering Approaches

    General idea for user-based CF:

      Find the top K friends for active user a based on user
      similarity, Sim(a, i)

      Make a prediction for the active user based on the ratings of
      those top K friends: Pre(a, i) = f(ratings of the K users on item i)

    Specific approaches differ in:

      Sim(a, i) -- the distance/similarity between two users,
      such as the Pearson correlation coefficient and cosine similarity
      (many other possibilities)

      Pre(a, i) = f(ratings of the K users on item i) -- the prediction function


    Step 1: Find K nearest friends by Pearson coefficient

    $$
    W_{k,L} \;=\; \frac{\operatorname{cov}(k,L)}{\sigma_k \sigma_L}
    \;=\; \frac{\sum_{i \in N} (R_{k,i} - \bar{R}_k)(R_{L,i} - \bar{R}_L)}
               {\sqrt{\sum_{i \in N} (R_{k,i} - \bar{R}_k)^2}\;\sqrt{\sum_{i \in N} (R_{L,i} - \bar{R}_L)^2}}
    $$

    where N is the set of items co-rated by user k and user L;
    R_{k,i} and R_{L,i} are user k's and user L's ratings on item i,
    respectively; and \bar{R}_k and \bar{R}_L are the average ratings of
    user k and user L, respectively.
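As a minimal sketch (function name and dict-based data layout are mine, not from the slides), the Pearson weight above can be computed like this; averages are taken over the co-rated items, which reproduces the worked example on the following slides:

```python
from math import sqrt

def pearson(ratings_k, ratings_l):
    """Pearson correlation between two users over their co-rated items.

    ratings_k, ratings_l: dicts mapping item -> rating. Averages are
    taken over the co-rated items (an assumption; some variants average
    over each user's full rating history instead).
    """
    co_rated = sorted(set(ratings_k) & set(ratings_l))
    if not co_rated:
        return 0.0
    avg_k = sum(ratings_k[i] for i in co_rated) / len(co_rated)
    avg_l = sum(ratings_l[i] for i in co_rated) / len(co_rated)
    num = sum((ratings_k[i] - avg_k) * (ratings_l[i] - avg_l) for i in co_rated)
    den = sqrt(sum((ratings_k[i] - avg_k) ** 2 for i in co_rated)) * \
          sqrt(sum((ratings_l[i] - avg_l) ** 2 for i in co_rated))
    return num / den if den else 0.0

# Ratings taken from the Ken/Lee/Meg worked example (items 1..6).
ken = {1: 1, 2: 5, 4: 2, 5: 4}
lee = {1: 4, 2: 2, 4: 5, 5: 1, 6: 2}
meg = {1: 2, 2: 4, 3: 3, 6: 5}

print(round(pearson(ken, lee), 2))  # -0.8
print(round(pearson(ken, meg), 2))  # 1.0
```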


    Step 2: Predict the user's rating on an item

    $$
    \mathrm{Pre}_{k,m} \;=\; \frac{\sum_{i \in \text{friends}} R_{i,m}\, W_{k,i}}
                                  {\sum_{i \in \text{friends}} W_{k,i}}
    \qquad \text{or} \qquad
    \mathrm{Pre}_{k,m} \;=\; \bar{R}_k + \frac{\sum_{i \in \text{friends}} (R_{i,m} - \bar{R}_i)\, W_{k,i}}
                                              {\sum_{i \in \text{friends}} \lvert W_{k,i} \rvert}
    $$

    where R_{i,m} is user i's rating on item m; \bar{R}_k is the average
    rating of user k; and W_{k,i} is the relationship (similarity)
    between user k and user i.


    Pearson correlation coefficient: worked example

    User-item ratings (blank = not rated):

      Item   Ken   Lee   Meg   Nan
       1      1     4     2
       2      5     2     4
       3                  3     5
       4      2     5
       5      4     1
       6      ?     2     5     ?

    Average ratings:

      R̄_ken = (1 + 5 + 2 + 4)/4 = 3
      R̄_lee = (4 + 2 + 5 + 1)/4 = 3
      R̄_meg = (2 + 4 + 3)/3 = 3

    Similarity between Ken and Lee, over their co-rated items 1, 2, 4, 5:

    $$
    W_{\text{ken-lee}}
    = \frac{(1-3)(4-3) + (5-3)(2-3) + (2-3)(5-3) + (4-3)(1-3)}
           {\sqrt{(1-3)^2 + (5-3)^2 + (2-3)^2 + (4-3)^2}\;\sqrt{(4-3)^2 + (2-3)^2 + (5-3)^2 + (1-3)^2}}
    = \frac{-8}{10} = -0.8
    $$

    Using the same method we get W_ken-meg = +1.0. We do not need to
    calculate W_ken-nan, because Nan did not rate item 6.


    Pearson CF Algorithm

    Predict Ken's preference on Item 6, using the same rating matrix as
    above and the mean-centered prediction formula:

    $$
    \mathrm{Pre}_{\text{ken},6}
    = \bar{R}_{\text{ken}}
      + \frac{(R_{\text{meg},6} - \bar{R}_{\text{meg}})\,W_{\text{ken-meg}}
              + (R_{\text{lee},6} - \bar{R}_{\text{lee}})\,W_{\text{ken-lee}}}
             {\lvert W_{\text{ken-meg}} \rvert + \lvert W_{\text{ken-lee}} \rvert}
    = 3 + \frac{(5-3)(1.0) + (2-3)(-0.8)}{1 + 0.8}
    = 3 + \frac{2.8}{1.8} \approx 4.56
    $$
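A minimal sketch of this calculation (names are mine; each neighbor triple encodes the neighbor's rating on the target item, the neighbor's average rating, and the similarity, and the denominator uses absolute similarities, as the slide's arithmetic does):

```python
def predict(avg_active, neighbors):
    """Mean-centered weighted prediction for the active user.

    neighbors: list of (rating_on_target_item, neighbor_avg, similarity).
    """
    num = sum((r - avg) * w for r, avg, w in neighbors)
    den = sum(abs(w) for _, _, w in neighbors)
    if den == 0:
        return avg_active
    return avg_active + num / den

# Ken's friends for item 6: Lee (rating 2, avg 3, sim -0.8) and
# Meg (rating 5, avg 3, sim +1.0); Ken's own average is 3.
pred = predict(3.0, [(2, 3.0, -0.8), (5, 3.0, 1.0)])
print(round(pred, 2))  # 4.56
```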


    Problems of Pearson CF

      Data sparseness: users co-rate very few items, so similarity
      weights are computed from tiny overlaps, or cannot be computed at all.

      Misleading friends: a user can look highly similar on the few
      co-rated items yet disagree with the active user elsewhere,
      producing poor predictions.

    (The original slide illustrates each problem with a small rating
    matrix over Ken, Lee, Meg, and Nan.)



    Item-based collaborative filtering

    Basic idea:

      Use the similarity between items (and not users) to make predictions

    Example:

      Look for items that are similar to Item5

      Take Alice's ratings for these items to predict the rating for Item5

               Item1  Item2  Item3  Item4  Item5
      Alice      5      3      4      4      ?
      User1      3      1      2      3      3
      User2      4      3      4      3      5
      User3      3      3      1      5      4
      User4      1      5      5      2      1


    The cosine similarity measure
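In its basic form, the cosine similarity between the rating vectors $\vec{a}$ and $\vec{b}$ of two items is:

$$
\mathrm{sim}(\vec{a}, \vec{b}) \;=\; \frac{\vec{a} \cdot \vec{b}}{\lvert \vec{a} \rvert \cdot \lvert \vec{b} \rvert}
$$

A commonly used variant in item-based CF is the adjusted cosine, which first subtracts each user's mean rating from that user's entries before applying the same formula.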


    Making predictions

    A common prediction function:

    Neighborhood size is typically also limited to a specific size

    Not all neighbors are taken into account for the prediction

    An analysis of the MovieLens dataset indicates that "in most real-world

    situations, a neighborhood of 20 to 50 neighbors seems reasonable"

    (Herlocker et al. 2002)
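One common prediction function of this kind is a similarity-weighted average of the active user's ratings of the neighboring items. A sketch over the Alice example above, assuming plain (non-adjusted) cosine similarity and using all of Alice's rated items as the neighborhood; helper names are mine:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_item_based(user_ratings, item_vectors, target):
    """Similarity-weighted average of the user's ratings of other items."""
    num = den = 0.0
    for item, rating in user_ratings.items():
        sim = cosine(item_vectors[item], item_vectors[target])
        num += sim * rating
        den += sim
    return num / den if den else 0.0

# Rating vectors over User1..User4 (Alice excluded: she has no Item5 rating).
items = {
    "Item1": [3, 4, 3, 1],
    "Item2": [1, 3, 3, 5],
    "Item3": [2, 4, 1, 5],
    "Item4": [3, 3, 5, 2],
    "Item5": [3, 5, 4, 1],
}
alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}
pred = predict_item_based(alice, items, "Item5")
print(round(pred, 1))  # 4.1
```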


    Another example for

    user-based and

    item-based


    User-User recommendations

      Users who bought X like Y.

      Each user is represented by a vector indicating his ratings for
      each product. Users with a small distance between each other are
      similar.

      Find a similar user and recommend things they like that you
      haven't rated.


    Step 1: Represent input data

              u1   u2   u3   u4   u5   u6
      item1    5    1    5    4    0    3
      item2    3    3    1    1    5    1
      item3    0    1    ?    2    1    4
      item4    1    1    4    1    1    2
      item5    3    2    5    0    0    3
      item6    4    3    0    0    4    0
      item7    0    1    5    1    1    1

      Rating matrix


    Step 2: Find nearest neighbours

    Step 2.1. Calculate similarity between vectors

      Cosine formula
      Pearson correlation


    User-User collaborative filtering

    Using the rating matrix from Step 1, compute the similarity between
    u3 (whose rating for item3 is unknown) and every other user:

      sim(u3, u1) = 0.63   sim(u3, u2) = 0.76   sim(u3, u4) = 0.71
      sim(u3, u5) = 0.22   sim(u3, u6) = 0.93

    For example, with the cosine formula (skipping item3, where u3's
    rating is unknown):

      sim(u3, u1) = (5·5 + 3·1 + 1·4 + 3·5 + 4·0 + 0·5)
                    / (√(25+9+1+9+16+0) · √(25+1+16+25+0+25))
                  = 47 / (√60 · √92) ≈ 0.63
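A sketch of this computation, treating the `?` as a missing value and the stored zeros as ordinary ratings, as the slide's arithmetic does (function name and list layout are mine):

```python
from math import sqrt

def cosine_known(a, b):
    """Cosine similarity over the positions where both ratings are known.

    Vectors are lists with None marking unknown ratings; zeros are kept
    as ratings, matching the slide's computation.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    dot = sum(x * y for x, y in pairs)
    na = sqrt(sum(x * x for x, _ in pairs))
    nb = sqrt(sum(y * y for _, y in pairs))
    return dot / (na * nb) if na and nb else 0.0

u1 = [5, 3, 0, 1, 3, 4, 0]      # items 1..7
u3 = [5, 1, None, 4, 5, 0, 5]   # item3 is the unknown "?"
print(round(cosine_known(u1, u3), 2))  # 0.63
```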


    Item-Item collaborative filtering

    Using the same rating matrix, compute the similarity between item3
    (the item with the unknown rating) and every other item. The
    similarity values on the slide are 0.62, 0.9, 0.85, 0.64, 0.23, and
    0.44; the three nearest neighbours of item3 have similarities
    0.9, 0.85, and 0.64.


    Step 2: Find nearest neighbours

    Step 2.2. Define neighbourhood (size L)

      sort and take the first L, or

      aggregate neighbourhood (at each step take the one closest to the
      centroid)


    User-based prediction: u3's missing rating for item3 is the
    similarity-weighted sum of the neighbours' item3 ratings. The
    nearest neighbours u2, u4, and u6 rated item3 as 1, 2, and 4, with
    similarities 0.76, 0.71, and 0.93 to u3:

      (1×0.76 + 2×0.71 + 4×0.93) / (0.76 + 0.71 + 0.93) = 2.5
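Both weighted sums (this user-based one and the item-based one on the next slide) can be sketched with a single helper; names are illustrative:

```python
def weighted_sum(ratings_and_sims):
    """Similarity-weighted average of (rating, similarity) pairs."""
    num = sum(r * s for r, s in ratings_and_sims)
    den = sum(s for _, s in ratings_and_sims)
    return num / den if den else 0.0

# User-based: u2, u4, u6 rated item3 as 1, 2, 4 with similarities
# 0.76, 0.71, 0.93 to u3.
user_based = weighted_sum([(1, 0.76), (2, 0.71), (4, 0.93)])
# Item-based: u3 rated the three items most similar to item3 as 4, 5, 5,
# with similarities 0.9, 0.85, 0.64.
item_based = weighted_sum([(4, 0.9), (5, 0.85), (5, 0.64)])
print(round(user_based, 1), round(item_based, 1))  # 2.5 4.6
```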


    Item-based prediction: u3's missing rating for item3 is the
    similarity-weighted sum of u3's ratings of item3's nearest-neighbour
    items (similarities 0.9, 0.85, and 0.64, which u3 rated 4, 5, and 5):

      (4×0.9 + 5×0.85 + 5×0.64) / (0.9 + 0.85 + 0.64) = 4.6

    Lora Aroyo

    Step 3: Generate recommendations

      Most frequent items: scan the neighbourhood and calculate the
      frequency of each item; can be combined with the rating value.

      Association rules recommendation: expand the set of items using
      association rules over what has been recommended by the neighbours.


    Item-Item: Challenges

      Getting the users to tell you what they like
        Both financial and time reasons not to.

      Getting enough data to make novel predictions
        What users really want are recommendations for things they're
        not aware of.

      Most effective when you have metadata that lets you automatically
      relate items
        Genre, actors, director, etc.

      Also best when decoupled from payment
        Users should have an incentive to rate items truthfully.


    User-User Recommendations

      Advantages:
        Users don't need to rate much.
        No info about products needed.
        Easy to implement.

      Disadvantages:
        Pushes users toward the middle: products with more ratings
        carry more weight.
        How to deal with new products?
        Many products and few users -> lots of things don't get
        recommended.


    Netflix Prize Contest


    Model Based - Probabilistic methods


    Calculation of probabilities in simplistic approach

             Item1  Item2  Item3  Item4  Item5
      Alice    1      3      3      2      ?
      User1    2      4      2      2      4
      User2    1      3      3      5      1
      User3    4      5      2      3      3
      User4    1      1      5      2      1

    X = (Item1 = 1, Item2 = 3, Item3 = 3, Item4 = 2) is Alice's
    observed rating vector.

    More to consider:

      Zeros (smoothing required)
      like/dislike simplification possible
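One way to read this simplistic approach is as a naive Bayes classifier over the Item5 rating: pick the value v maximizing P(Item5 = v) multiplied by the per-item likelihoods P(Item_j = x_j | Item5 = v). A sketch without the smoothing the slide calls for; function and variable names are mine:

```python
def naive_bayes_rating(users, active, target):
    """Return (best_value, scores): the target-item rating value with the
    highest prior-times-likelihood score, naive Bayes style, no smoothing."""
    scores = {}
    for v in {u[target] for u in users}:
        group = [u for u in users if u[target] == v]
        score = len(group) / len(users)            # prior P(Item5 = v)
        for item, rating in active.items():        # naive likelihoods
            score *= sum(u[item] == rating for u in group) / len(group)
        scores[v] = score
    return max(scores, key=scores.get), scores

users = [
    {"Item1": 2, "Item2": 4, "Item3": 2, "Item4": 2, "Item5": 4},
    {"Item1": 1, "Item2": 3, "Item3": 3, "Item4": 5, "Item5": 1},
    {"Item1": 4, "Item2": 5, "Item3": 2, "Item4": 3, "Item5": 3},
    {"Item1": 1, "Item2": 1, "Item3": 5, "Item4": 2, "Item5": 1},
]
alice = {"Item1": 1, "Item2": 3, "Item3": 3, "Item4": 2}
best, scores = naive_bayes_rating(users, alice, "Item5")
print(best, round(scores[1], 4))  # 1 0.0625
```

Without smoothing, any unseen item-rating combination zeroes out a candidate value, which is exactly why the slide flags zeros as needing smoothing.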


    Content

      Introduction
      Collaborative Filtering
      Content-based Filtering
      Other directions


    Content-Based Recommending

      Recommendations are based on information about the content of
      items rather than on other users' opinions.

      Uses machine learning algorithms to induce a profile of the
      user's preferences from examples, based on a featural description
      of content.

      Lots of systems
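A toy sketch of such a profile learner, under strong assumptions: binary genre features, the user profile is simply the mean feature vector of liked items, and candidates are scored by dot product. All item names and features here are hypothetical, not from the slides:

```python
def build_profile(liked_items, features):
    """User profile = average of the feature vectors of liked items."""
    n = len(liked_items)
    dim = len(next(iter(features.values())))
    profile = [0.0] * dim
    for item in liked_items:
        for j, f in enumerate(features[item]):
            profile[j] += f / n
    return profile

def score(profile, vec):
    """Dot product between the user profile and an item's features."""
    return sum(p * f for p, f in zip(profile, vec))

# Hypothetical items described by binary features (action, comedy, sci-fi).
features = {
    "m1": [1, 0, 1],
    "m2": [1, 0, 0],
    "m3": [0, 1, 0],
    "m4": [0, 0, 1],
}
profile = build_profile(["m1", "m2"], features)  # user likes action/sci-fi
ranked = sorted(["m3", "m4"],
                key=lambda m: score(profile, features[m]), reverse=True)
print(ranked)  # ['m4', 'm3'] -- the sci-fi item outranks the comedy
```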


    Advantages of Content-Based Approach

      No need for data on other users.
        No cold-start or sparsity problems.

      Able to recommend to users with unique tastes.

      Able to recommend new and unpopular items.
        No first-rater problem.

      Can provide explanations of recommended items by listing the
      content features that caused an item to be recommended.

      Well-known technology: the entire field of classification
      learning is at (y)our disposal!




    Disadvantages of Content-Based Method

      Requires content that can be encoded as meaningful features.

      Users' tastes must be represented as a learnable function of
      these content features.

      Unable to exploit quality judgments of other users.
        Unless these are somehow included in the content features.


    Content

      Introduction
      Collaborative Filtering
      Content-based Filtering
      Other directions

    Temporal Dynamics in Recommendations

    Item-side effects:

      Product perception and popularity are constantly changing
      Seasonal patterns influence items' popularity

    User-side effects:

      Customers continually redefine their tastes
      Transient, short-term bias; anchoring
      Drifting rating scale
      Change of rater within a household


    Time Sensitive Recommenders

    Koren (2009), "Collaborative Filtering with Temporal Dynamics"

    He uses factor models to separate different aspects of the ratings,
    in order to observe changes in:

      1. Rating scale of individual users
      2. Popularity of individual items
      3. User preferences


    Extending Capabilities of Recommender Systems

      Comprehensive understanding of users and items

      Multidimensionality of recommendations: add additional contextual
      information to the User × Item space

      Multicriteria ratings: find Pareto-optimal solutions, take a
      linear combination of multiple criteria, optimize the most
      important criterion, or consecutively optimize one criterion at
      a time


    Social Recommender System

      Motivation: social overload
        Information overload
        Interaction overload

      Target the social media domain

      Aim at coping with the challenge of social overload

      Aim at increasing adoption and engagement

      Often apply personalization techniques

      Utilize social networks and content; incorporate short-term and
      long-term interests; accuracy vs. serendipity trade-off


    Social Recommender System

      Tag recommendation
      People recommendation
      Community recommendation
      Recommendation for groups
      Recommenders in the enterprise
      Recommenders in activity streams

    Problems:

      Cold start
      Trust and distrust (reputation), explanation
      Temporal aspects in social recommendation


    Mobile Recommender System

      Offer personalized, context-sensitive recommendations

      Models: context-dependent recommendations; distributed models,
      e.g. P2P

      Proactive recommendations

      Difficulties:
        Data is more complex
        Transplantation problem: recommendations may not apply in all
        regions