2015/2/27 Scaling-up Item-based Collaborative Filtering Recommendation Algorithm based on Hadoop Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long 2011 IEEE World Congress Services



Page 1: Collaborative Filtering Recommendation Algorithm based on Hadoop

2015/2/27

Scaling-up Item-based Collaborative Filtering Recommendation Algorithm based on Hadoop

Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long 2011 IEEE World Congress Services

Page 2: Collaborative Filtering Recommendation Algorithm based on Hadoop

outline

✤ Collaborative Filtering

✤ scaling-up item-based CF

✤ experimentation and evaluation

Page 3: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Collaborative filtering (CF) techniques are now widely and successfully used in e-commerce.

Page 4: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). (Source: Wikipedia)

Page 5: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

1. Weight all users with respect to similarity with active user

2. Select a subset of users to use as a set of predictors

3. Compute a prediction from a weighted combination of selected neighbors’ ratings

Page 6: Collaborative Filtering Recommendation Algorithm based on Hadoop

Simple computation

Ratings:

Nathan [5, 1, 5]
Joe    [5, 2, 5]
John   [2, 5, 2.5]
Al     [2, 2, 4]

Cosine similarity to the active user (Nathan):

cos(Nathan, Joe) = 0.99

cos(Nathan, John) = 0.64

cos(Nathan, Al) = 0.91
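The similarities on this slide can be reproduced with a few lines of Python. This is a minimal sketch; the function name `cosine` and the dictionary layout are illustrative choices, not taken from the slides.

```python
import math

# Rating vectors from the slide (three items per user).
ratings = {
    "Nathan": [5, 1, 5],
    "Joe": [5, 2, 5],
    "John": [2, 5, 2.5],
    "Al": [2, 2, 4],
}

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Similarity of every other user to the active user, Nathan.
similarities = {
    name: cosine(ratings["Nathan"], vec)
    for name, vec in ratings.items()
    if name != "Nathan"
}
for name, sim in similarities.items():
    print(f"cos(Nathan, {name}) = {sim:.4f}")
```

The unrounded value for John is about 0.6486, which the slide shows as 0.64.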

Page 7: Collaborative Filtering Recommendation Algorithm based on Hadoop

Simple computation

Using the similarities above and the neighbors' ratings of the target item, 4 (Joe), 3 (John), and 2 (Al), Nathan's predicted rating is:

? = (0.99*4 + 0.64*3 + 0.91*2) / (0.99 + 0.64 + 0.91) = 3.03
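The weighted combination in step 3 can be sketched as a small function. The name `predict` and the list-of-pairs input are assumptions made for illustration; the numbers are the ones on the slide.

```python
def predict(neighbors):
    """Similarity-weighted average of neighbor ratings.

    neighbors: list of (similarity, rating) pairs.
    """
    weight_sum = sum(sim for sim, _ in neighbors)
    if weight_sum == 0:
        raise ValueError("no neighbor with non-zero similarity")
    return sum(sim * rating for sim, rating in neighbors) / weight_sum

# (similarity to Nathan, neighbor's rating of the target item), as on the slide.
neighbors = [(0.99, 4), (0.64, 3), (0.91, 2)]
prediction = predict(neighbors)
print(round(prediction, 2))
```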

Page 8: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ User-Based CF: compute similarity based on users

✤ Item-Based CF: compute similarity based on items

Page 9: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ User-Based CF: compute similarity based on users

To predict user A's rating of item4, given that user B rated item4 as 5 and user F rated item4 as 1:

user A to item4 = (5 * similarities(user A, user B) + 1 * similarities(user A, user F)) / (similarities(user A, user B) + similarities(user A, user F))

Page 10: Collaborative Filtering Recommendation Algorithm based on Hadoop

Collaborative Filtering

✤ Item-Based CF: compute similarity based on items

To predict user A's rating of item4, given that user A rated item2 as 1 and item3 as 1:

user A to item4 = (1 * similarities(item2, item4) + 1 * similarities(item3, item4)) / (similarities(item2, item4) + similarities(item3, item4))
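A minimal sketch of the item-based prediction follows. The function name `predict_item_based`, the dictionary layout, and the two similarity values are hypothetical; the slides give no similarity values for this example.

```python
def predict_item_based(user_ratings, item_sims, target):
    """Predict one user's rating of `target` from items that user already rated.

    user_ratings: {item: rating} for a single user.
    item_sims: {(item, target): similarity}.
    """
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        sim = item_sims.get((item, target))
        if sim is None:
            continue  # no similarity known for this item pair
        num += rating * sim
        den += sim
    if den == 0:
        return None  # nothing to base a prediction on
    return num / den

user_a = {"item2": 1, "item3": 1}  # ratings from the slide
sims = {("item2", "item4"): 0.8, ("item3", "item4"): 0.4}  # made-up values
predicted = predict_item_based(user_a, sims, "item4")
print(predicted)
```

Because user A gave both item2 and item3 a rating of 1, the weighted average is 1 whatever the similarity values are, as long as their sum is non-zero.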

Page 11: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Divide the CF algorithm into two steps:

1. Similarity computation

2. Prediction and recommendation

Similarity measure: Pearson correlation, with values in [-1, 1]

Page 12: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Pearson correlation, with values in [-1, 1]

Covariance: the numerator of the Pearson correlation (up to normalization)
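For reference, a standard form of the Pearson correlation between two items i and j is given below. The notation is an assumption chosen to match the Ri and Rj averages on the following slides: U_{ij} is the set of users who rated both items, R_{u,i} is user u's rating of item i, and \bar{R}_i is the average rating of item i. The deck's own formula may differ in detail.

```latex
\[
\mathrm{sim}(i,j) =
\frac{\sum_{u \in U_{ij}} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}
     {\sqrt{\sum_{u \in U_{ij}} (R_{u,i} - \bar{R}_i)^2}\,
      \sqrt{\sum_{u \in U_{ij}} (R_{u,j} - \bar{R}_j)^2}}
\]
```

The numerator is the covariance term; the denominator rescales it so the result lies between -1 and 1.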

Page 13: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Similarity computation

        apple  milk  toast
sam       2     0     4
john      5     5     3
tim       2     4     ?

Rows are users (u). Here item i is apple and item j is toast.

Ri = (2+5+2)/3 = 3
Rj = (4+3)/2 = 3.5
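The item averages and the resulting Pearson similarity for this table can be checked with a short script. This is a sketch under assumptions: the helper names are invented, sums run over users who rated both items, and an undefined correlation is returned as 0.

```python
import math

# Ratings from the slide; None marks the unknown rating to predict.
ratings = {
    "sam":  {"apple": 2, "milk": 0, "toast": 4},
    "john": {"apple": 5, "milk": 5, "toast": 3},
    "tim":  {"apple": 2, "milk": 4, "toast": None},
}

def item_mean(item):
    """Average of all known ratings for one item."""
    known = [r[item] for r in ratings.values() if r[item] is not None]
    return sum(known) / len(known)

def pearson(i, j):
    """Pearson correlation of items i and j over users who rated both."""
    mean_i, mean_j = item_mean(i), item_mean(j)
    co_raters = [r for r in ratings.values()
                 if r[i] is not None and r[j] is not None]
    num = sum((r[i] - mean_i) * (r[j] - mean_j) for r in co_raters)
    den_i = math.sqrt(sum((r[i] - mean_i) ** 2 for r in co_raters))
    den_j = math.sqrt(sum((r[j] - mean_j) ** 2 for r in co_raters))
    if den_i == 0 or den_j == 0:
        return 0.0  # assumption: treat an undefined correlation as 0
    return num / (den_i * den_j)

mean_apple = item_mean("apple")
mean_toast = item_mean("toast")
sim_apple_toast = pearson("apple", "toast")
print(mean_apple, mean_toast, round(sim_apple_toast, 4))
```

Only sam and john rated both apple and toast, so tim does not contribute to the sums even though his apple rating counts toward the apple average.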

Page 14: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

Similarity computation

        apple  milk  toast
sam       2     0     4
john      5     5     3
tim       2     4     ?

Rows are users (u). Here item j is apple and item i is toast.

Ru(sam) = (2+0+4)/3 = 2

Rj = (2+5+2)/3 = 3
Ri = (4+3)/2 = 3.5

Page 15: Collaborative Filtering Recommendation Algorithm based on Hadoop

scaling-up item-based CF

The three parts of intensive computation are:

(1) computing the average rating for each item

(2) computing the similarity between item pairs

(3) computing predicted items for the target user
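Part (1) fits the map/reduce pattern naturally: key each rating by its item, then average per key. The sketch below simulates that flow in memory with plain Python; it does not use the Hadoop API, and the function names and record layout are assumptions. The ratings are taken from the apple/milk/toast example.

```python
from collections import defaultdict

# (user, item, rating) records from the apple/milk/toast example.
records = [
    ("sam", "apple", 2), ("sam", "milk", 0), ("sam", "toast", 4),
    ("john", "apple", 5), ("john", "milk", 5), ("john", "toast", 3),
    ("tim", "apple", 2), ("tim", "milk", 4),
]

def map_by_item(record):
    """Map: emit (item, rating) so all ratings of one item reach one reducer."""
    user, item, rating = record
    yield item, rating

def reduce_average(item, item_ratings):
    """Reduce: average the ratings collected for one item."""
    return item, sum(item_ratings) / len(item_ratings)

# Shuffle phase, simulated in memory: group mapped values by key.
groups = defaultdict(list)
for record in records:
    for key, value in map_by_item(record):
        groups[key].append(value)

item_averages = dict(reduce_average(k, v) for k, v in groups.items())
print(item_averages)
```

Since each item's average depends only on that item's ratings, the reducers are independent of one another, which is what lets this step spread across nodes.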

Page 16: Collaborative Filtering Recommendation Algorithm based on Hadoop

[Figure: MapReduce flow in three numbered stages (1, 2, 3). Recoverable labels: "item i by user j", "map item i".]

Page 17: Collaborative Filtering Recommendation Algorithm based on Hadoop

[Figure: stage 1 of the MapReduce flow, with a formula that is not recoverable here. Its set symbol denotes the users who rated both item k and item l.]
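The set of users who rated both item k and item l can be built by grouping ratings per user and emitting every pair of items that user rated. The following is a sketch of that idea in plain Python, not the deck's implementation; `user_items` and `co_raters` are illustrative names, and the data comes from the apple/milk/toast example.

```python
from collections import defaultdict
from itertools import combinations

# Items each user has rated (tim has not rated toast).
user_items = {
    "sam": ["apple", "milk", "toast"],
    "john": ["apple", "milk", "toast"],
    "tim": ["apple", "milk"],
}

# For each unordered item pair (k, l), collect the users who rated both.
co_raters = defaultdict(set)
for user, items in user_items.items():
    # Sorting gives each pair one canonical key, e.g. ("apple", "toast").
    for k, l in combinations(sorted(items), 2):
        co_raters[(k, l)].add(user)

for pair, users in sorted(co_raters.items()):
    print(pair, sorted(users))
```

The number of pairs emitted per user grows quadratically with the number of items that user rated, which is one reason the similarity step is listed as computation-intensive.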

Page 18: Collaborative Filtering Recommendation Algorithm based on Hadoop

[Figure: stages 2 and 3 of the MapReduce flow. Recoverable labels: "similarity" (stage 2), "map user j" (stage 3).]

Page 19: Collaborative Filtering Recommendation Algorithm based on Hadoop

experimentation and evaluation

Cluster of 3 nodes

Nodes with Intel P4 CPU, 1 GB RAM, 80 GB disk

All the machines were connected with one 100 Mbps switch.

Page 20: Collaborative Filtering Recommendation Algorithm based on Hadoop

experimentation and evaluation
