Recommender system

The presentation discusses how a recommender system works, for example how Amazon recommends books to customers when they log in to the Amazon website.

Recommender System

How does it work?

Group 4: Nguyen Dao Tan Bao, Nguyen Thi Ngoc Phu, Cao Dinh Qui, Pham Huy Thanh

Instructed by Dr. Tim Reichert

Outline

• Introduction
• Collaborative filtering algorithms
  o User-based
  o Item-based
• Similarity algorithms
• Apache Mahout demo
• Case study – Amazon
• Demo – Music recommender system

Introduction

Source: http://www.cguru.info/information_technology_branch.htm

Technology helps people with many tasks ...

... including searching for information

Go to web directories


Or use search engines

So what is the problem?

Find: you know what you need

Search: you know the keywords

?: RECOMMENDER SYSTEM

Recommender system

Predicts and produces the most relevant recommendations for its audience, based on their tastes.

Where is it applied?

Breaking news

Source: http://blogs.wsj.com/digits/2014/01/17/amazon-wants-to-ship-your-package-before-you-buy-it/

Amazon Wants to Ship Your Package Before You Buy It

What is the secret behind it?

Recommender Approaches

Item Hierarchy

Attribute-based recommendations

Collaborative filtering – User-user Similarity

Collaborative filtering – Item-item Similarity

Social + Interest Graph Based

Model Based


What is collaborative filtering?

A method of making automatic predictions about the interests of a user by collecting preference or taste information from many users.

User-based CF

1/25/2014

• General idea
• Algorithm
• K-Nearest Neighbors

I'm gonna rent a film to watch with my boyfriend this week. Do you have any suggestions?

What kind of film does he like?

I don't know, but his best friend really likes insect collecting.

Maybe your boyfriend is similar to him. You guys can watch Spider-Man!?

NOT IN OUR SCOPE

I'm gonna rent a film to watch with my boyfriend this week. Do you have any suggestions?

What kind of film does he like?

I don't know, but he really enjoyed our last film, Iron Man.

Really? My boyfriend also likes it, and his favourite one is The Amazing Spider-Man, so maybe you guys can try it.

General Idea

Similar

Recommend

Algorithm

[Figure: step-by-step illustration of the user-based CF algorithm]

K-Nearest Neighbors

The neighborhood of the most similar users is computed first.

Only items known to those users are considered.
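The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the talk: the toy ratings, the function names, and the choice of cosine similarity over co-rated items are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"Iron Man": 5.0, "Spider-Man": 4.0, "Titanic": 1.0},
    "bob":   {"Iron Man": 4.0, "Spider-Man": 5.0, "Avatar": 4.0},
    "carol": {"Iron Man": 1.0, "Titanic": 5.0, "Notebook": 4.0},
    "dave":  {"Iron Man": 5.0, "Titanic": 2.0},
}

def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def nearest_neighbors(ratings, user, k):
    """Step 1: compute the neighborhood of the k most similar users."""
    scored = [
        (cosine_similarity(ratings[user], prefs), name)
        for name, prefs in ratings.items()
        if name != user
    ]
    scored.sort(reverse=True)
    return scored[:k]

def recommend(ratings, user, k=2):
    """Step 2: score only the items known to those neighbors."""
    totals, weights = {}, {}
    for sim, name in nearest_neighbors(ratings, user, k):
        if sim <= 0:
            continue
        for item, rating in ratings[name].items():
            if item in ratings[user]:
                continue  # the user already has a preference for this item
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average rating, best first
    return sorted(((totals[i] / weights[i], i) for i in totals), reverse=True)

print(recommend(ratings, "dave", k=2))
```

With k=2, only items rated by the two users closest to "dave" can be recommended; an item known only to a user outside the neighborhood is never scored.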


Item-Based CF

• General idea
• Why do we need it?
• Algorithm

I'm gonna rent a film to watch with my boyfriend this week. Do you have any suggestions?

What kind of film does he like?

I don't know, but he really enjoyed our last film, Iron Man.

Really? If you enjoyed Iron Man, then you should try Iron Man 2.

General Idea

Item-based recommendation is derived from how similar items are to other items, instead of how similar users are to other users.

Similar

Recommend

Why do we need it?

So MANY users!!!

Humans are COMPLEX

10 000 users like

Like

8 000 users like

Algorithm

Check every item for which the user has no preference.

For each of them, calculate the similarity between it and every item for which the user has a preference.
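A minimal Python sketch of these two steps follows. It is an illustration under stated assumptions rather than the talk's own code: the toy data and function names are hypothetical, item similarity is taken to be the Tanimoto coefficient over the sets of users who rated each item, and the predicted preference is assumed to be a similarity-weighted average of the user's existing ratings.

```python
# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"Iron Man": 5.0, "Iron Man 2": 5.0, "Titanic": 1.0},
    "bob":   {"Iron Man": 4.0, "Iron Man 2": 4.0},
    "carol": {"Titanic": 5.0, "Notebook": 4.0},
    "dave":  {"Iron Man": 5.0, "Titanic": 2.0},
}

def users_by_item(ratings):
    """Invert the data: item -> set of users who expressed a preference."""
    index = {}
    for user, prefs in ratings.items():
        for item in prefs:
            index.setdefault(item, set()).add(user)
    return index

def item_similarity(users_a, users_b):
    """Tanimoto coefficient between two items' user sets (assumed metric)."""
    union = users_a | users_b
    return len(users_a & users_b) / len(union) if union else 0.0

def recommend_items(ratings, user):
    index = users_by_item(ratings)
    known = ratings[user]
    scored = []
    # Check every item for which the user has no preference ...
    for candidate in index:
        if candidate in known:
            continue
        numerator = denominator = 0.0
        # ... and compare it with every item the user has a preference for.
        for item, rating in known.items():
            sim = item_similarity(index[candidate], index[item])
            numerator += sim * rating
            denominator += sim
        if denominator > 0:
            scored.append((numerator / denominator, candidate))
    return sorted(scored, reverse=True)

print(recommend_items(ratings, "dave"))
```

Because the similarities depend only on items, they could be precomputed once and reused for every user, which is one reason the item-based variant copes better with very large user bases.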


Similarities

• Pearson correlation
• Euclidean distance
• Tanimoto

Pearson Correlation

Example hypothesis: how tall you are affects your self-esteem.

The Pearson correlation is a number between –1 and 1

Measure of the strength of a linear association between two variables
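As a rough illustration of this definition, the coefficient can be computed as the covariance of the two variables divided by the product of their standard deviations. The function name and the sample values below are assumptions made for this sketch.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)

# Hypothetical ratings by two users on the same three items
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # perfectly linear, same direction
print(pearson([1.0, 2.0, 3.0], [6.0, 4.0, 2.0]))  # perfectly linear, opposite direction
```

The sketch raises an error when fewer than two data points are available or when one sequence is constant, because the coefficient is undefined in those cases.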

• It doesn’t take into account the number of items on which two users’ preferences overlap

• If two users overlap on only one item, no correlation can be computed

Tanimoto

Ignores preference values entirely.

• It’s the ratio of the size of the intersection to the size of the union of their preferred items

• When two users’ items completely overlap, the result is 1.0

• When they have nothing in common, it’s 0.0

$$T(A, B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$$

where $A$ and $B$ are the sets of items preferred by the two users.
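A small Python sketch of the coefficient, treating each user's preferences as a set of item identifiers. The function name and the example items are assumptions made for illustration.

```python
def tanimoto(items_a, items_b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0  # neither user has any preferences
    return len(a & b) / len(union)

# Hypothetical sets of liked items
print(tanimoto({"Iron Man", "Avatar"}, {"Iron Man", "Avatar"}))  # complete overlap
print(tanimoto({"Iron Man"}, {"Titanic"}))                       # nothing in common
print(tanimoto({"a", "b", "c"}, {"b", "c", "d"}))                # 2 shared out of 4
```

Only membership matters here: a rating of 1 and a rating of 5 for the same item count identically.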

Use it when:

• the underlying data contains only Boolean preferences

• there is too much noise in the preference values

Mahout Basic Demo


What is Apache Mahout?

• Open source from Apache
• Mahout is a Java library
  o Implementing machine learning techniques
    • Recommendation
    • Clustering
    • Classification

Why do we prefer Mahout?

• Apache License
• Good community and documentation
• Scalable
  o Based on Hadoop (not mandatory!)

Physical Storage (database, files …)

Data Model

Recommender

Application


Recommendation in Mahout

• Input: raw data (user preferences)
• Output: preference estimation
• Step 1
  o Mapping raw data into a Mahout-compliant DataModel
• Step 2
  o Tuning recommender components
    • Similarity measure, neighborhood, …
• Step 3
  o Recommend

Recommendation Components

• Five Java interfaces
  o DataModel interface
    • MySQLJDBCDataModel, FileDataModel …
  o UserSimilarity interface
    • Methods to calculate the degree of correlation between two users
  o ItemSimilarity interface
    • Methods to calculate the degree of correlation between two items
  o UserNeighborhood interface
    • Methods to define the concept of ‘neighborhood’
  o Recommender interface
    • Methods to implement the recommendation step itself

Similarity Metrics

• SIMILARITY_COOCCURRENCE
• SIMILARITY_LOGLIKELIHOOD
• SIMILARITY_TANIMOTO_COEFFICIENT
• SIMILARITY_CITY_BLOCK
• SIMILARITY_COSINE
• SIMILARITY_PEARSON_CORRELATION
• SIMILARITY_EUCLIDEAN_DISTANCE

CASE STUDY

“Much is made of what the likes of Facebook, Google and Apple know about users. Truth is, Amazon may know more. And the massive retailer proves it every day.” - JP Mangalindan, Writer [*]

References: http://tech.fortune.cnn.com/2012/07/30/amazon-5/

• Amazon's recommendation system is based on a number of simple elements:
  o what a user has bought in the past and recently viewed
  o which items a user has in their virtual shopping cart
  o items the user has rated and liked
  o what other customers have viewed and purchased
• The retail giant calls this “item-to-item collaborative filtering”
• It has used this algorithm to heavily customize the browsing experience for returning customers

• The recommendation system worked, and Amazon reported strong results:
  o a 29% sales increase to $12.83 billion during its second fiscal quarter (as of July 26, 2012)
  o compared to $9.9 billion during the same period the previous year
• Amazon has integrated recommendations into nearly every part of the purchasing process, from product discovery to checkout
• “Our mission is to delight our customers by allowing them to serendipitously discover great products,” said an Amazon spokesperson

CASE STUDY – Amazon recommendations services

References http://www.google.com/patents/US7921042


CASE STUDY - Generation of Similar Items Table

References: http://www.google.com/patents/US7921042 (Fig. 1), http://www.google.com/patents/US7113917 (Fig. 3, 4)

The recommendation service components include:
- a recommendation process
- an off-line table generation process
- a similar items table

CASE STUDY - Generation of Recommendation

References: http://www.google.com/patents/US7921042 (Fig. 1), http://www.google.com/patents/US7113917 (Fig. 2, 5, 7)

EVALUATION

Experimental Settings

• Offline experiments
  - Performed using a pre-collected data set of users choosing or rating items
  - Simulate the behavior of users who interact with a recommendation system
  - Assume that user behavior when the data was collected will be similar enough to user behavior when the recommender system is deployed
  - Make reliable decisions based on the simulation
• User studies
  - Conducted by recruiting a set of test subjects, asking them to perform several tasks that require interaction with the recommendation system, and observing them
  - We can then check whether the recommendations are used and whether people read different stories with and without recommendations, then ask them whether the recommendations were relevant
• Online experiments
  - Measure the change in user behavior when interacting with different recommendation systems
  - If users of one system follow the recommendations more often, or if some utility gathered from users of one system exceeds the utility gathered from users of the other system, then we can conclude that one system is superior to the other

References: Microsoft Research: Evaluating Recommendation Systems - Guy Shani and Asela Gunawardana

Reliable conclusions
1. Confidence and p-values
2. Multiple tests

Measurement metrics
3. User preference and prediction accuracy: voting from users
  o Root Mean Squared Error (RMSE)
  o Measuring usage prediction
4. Coverage: item space and user space
5. Novelty: recommendations for items that the user did not know about
6. Utility: the recommendation engine can be judged by the revenue that it generates for the website
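For the prediction-accuracy metric named in item 3, RMSE is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name and the sample ratings are assumptions made for illustration.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical held-out ratings and the recommender's estimates for them
actual = [3.0, 3.0]
predicted = [4.0, 2.0]
print(rmse(predicted, actual))  # each estimate is off by exactly one star
```

Lower values mean the estimated preferences are closer to the held-out ratings; squaring the errors penalizes large misses more heavily than small ones.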


References

• Personalized recommendations of items represented within a database http://www.google.com/patents/US7113917

• Computer processes for identifying related items and generating personalized item recommendations http://www.google.com/patents/US7921042

• Microsoft Research: Evaluating Recommendation Systems - Guy Shani and Asela Gunawardana

• Amazon Recommendation – Industry report

Summary

• Recommender systems
• User-based vs item-based
• Similarity metrics
  o Choose the most suitable one depending on the data
• Evaluation and challenges
• Apache Mahout
  o A Java library that implements machine learning techniques

Mahout Music Recommend Demo

IF YOU LIKE BRITNEY, YOU WILL LOVE ….

Architecture of Recommender

DemoFriendLikes.csv

DataModel

FacebookRecommender

Recommender

FacebookRecommenderSOAP

Glassfish Java 6 EE Server

facebook-recommender-demo.war

SOAP


Q&A

THANK YOU!
