Allen Sussman
Find movies for two
Can we find a movie we’ll both actually like?
Each person enters movies they like, and twolu finds movies they’ll both like.
Ratings Table (rows: Users, columns: Movies; None marks a missing rating)

User  Clue  Kids  Jaws  Babe  Big
1     5     3     4     5     None
2     None  5     1     5     None
3     3     None  1     4     3
4     1     5     1     4     3
5     2     4     1     4     5
Movies Similarity Matrix

      Clue  Kids  Jaws  Babe  Big
Clue  1     0.2   0.3   0.4   0.5
Kids  0.2   1     0.3   0.2   0.3
Jaws  0.3   0.3   1     0.2   0.3
Babe  0.4   0.2   0.2   1     0.5
Big   0.5   0.3   0.3   0.5   1
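The slides do not say which similarity measure turns the Ratings Table into the Movies Similarity Matrix. As a minimal sketch, assuming cosine similarity computed over the users who rated both movies (one common choice in item-based collaborative filtering), the construction could look like the following. The names `RATINGS`, `cosine_similarity`, and `similarity_matrix` are illustrative, and the slide's matrix values look like round illustrative numbers, so this sketch should not be expected to reproduce them.

```python
import math

# Ratings Table from the slide; None marks a movie the user has not rated.
MOVIES = ["Clue", "Kids", "Jaws", "Babe", "Big"]
RATINGS = {
    1: [5, 3, 4, 5, None],
    2: [None, 5, 1, 5, None],
    3: [3, None, 1, 4, 3],
    4: [1, 5, 1, 4, 3],
    5: [2, 4, 1, 4, 5],
}


def cosine_similarity(i, j, ratings):
    """Cosine similarity of movie columns i and j over users who rated both."""
    pairs = [
        (row[i], row[j])
        for row in ratings.values()
        if row[i] is not None and row[j] is not None
    ]
    if not pairs:
        return 0.0  # assumption: no co-rating users means no evidence of similarity
    dot = sum(a * b for a, b in pairs)
    norm_i = math.sqrt(sum(a * a for a, _ in pairs))
    norm_j = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_i * norm_j)


def similarity_matrix(ratings, n_movies):
    """Square movie-by-movie matrix of pairwise similarities."""
    return [
        [cosine_similarity(i, j, ratings) for j in range(n_movies)]
        for i in range(n_movies)
    ]


SIM = similarity_matrix(RATINGS, len(MOVIES))
for name, row in zip(MOVIES, SIM):
    print(name, [round(value, 2) for value in row])
```

Any measure that yields a symmetric matrix with 1 on the diagonal would slot into the rest of the pipeline in the same way.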
Algorithm: Collaborative Filtering

Say User 1 likes Clue and User 2 likes Babe. Take the rows of the Movies Similarity Matrix for those two movies:

      Clue  Kids  Jaws  Babe  Big
Clue  1     0.2   0.3   0.4   0.5
Babe  0.4   0.2   0.2   1     0.5

f(Clue row, Babe row) =

Clue  Kids  Jaws   Babe  Big
0.6   0.2   0.225  0.6   0.5

Among the movies neither user entered, the largest number is for the movie Big. Users should watch it!
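The worked example can be sketched in code using the pair formula the slides give, f(s1, s2) = mean(s1, s2) - α*diff(s1, s2), applied column by column to the Clue and Babe rows. This is a sketch under stated assumptions: `diff` is taken to be the absolute difference, and α = 0.25 is chosen because it reproduces the slide's 0.225 for Jaws (the slides do not state α). With that value Clue and Babe come out at 0.55 rather than the 0.6 shown, which may be a rounding on the slide. Excluding the two input movies from the candidates is also an assumption, consistent with the slide picking Big.

```python
MOVIES = ["Clue", "Kids", "Jaws", "Babe", "Big"]

# Movies Similarity Matrix from the slide, one row per movie.
SIM = {
    "Clue": [1.0, 0.2, 0.3, 0.4, 0.5],
    "Kids": [0.2, 1.0, 0.3, 0.2, 0.3],
    "Jaws": [0.3, 0.3, 1.0, 0.2, 0.3],
    "Babe": [0.4, 0.2, 0.2, 1.0, 0.5],
    "Big": [0.5, 0.3, 0.3, 0.5, 1.0],
}

ALPHA = 0.25  # assumed value; not stated in the slides


def pair_score(s1, s2, alpha=ALPHA):
    """mean(s1, s2) - alpha * |s1 - s2|: reward similarity, penalize disagreement."""
    return (s1 + s2) / 2 - alpha * abs(s1 - s2)


def recommend(liked_1, liked_2):
    """Score every movie for the pair and pick the best one neither user entered."""
    scores = {
        movie: pair_score(s1, s2)
        for movie, s1, s2 in zip(MOVIES, SIM[liked_1], SIM[liked_2])
    }
    candidates = {m: s for m, s in scores.items() if m not in (liked_1, liked_2)}
    return scores, max(candidates, key=candidates.get)


scores, best = recommend("Clue", "Babe")
print({movie: round(score, 3) for movie, score in scores.items()})
print("Recommendation:", best)
```

The penalty term is what separates this from a plain average: a movie that one user would love and the other would only tolerate scores lower than one both would moderately enjoy.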
Cross-Validation
• For each pair of users in test set, compare recommendations to combined ratings
Allen Sussman, Ph.D.
[Diagram: the Ratings Table (Users x Movies) is split into a Training Set and a Test Set. Other labels in the diagram: Movies Similarity Matrix, Test Set Features, Ground Truth.]

Consider two users in the test set, User 1 and User 2:

User  Clue  Kids  Jaws  Babe  Big
1     4     3     1     5     2
2     4     5     1     5     2

Use the algorithm and the similarity matrix on the Features to predict, then compare the predictions with the Ground Truth.

Ground Truth

Clue  Kids  Jaws  Babe  Big
Y     Y           Y

My Recommendations (P/N: recommended or not; T/F: whether that agrees with the Ground Truth)

   P     N
T  Clue  Big
F  Jaws  Babe
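The comparison step can be sketched as filling in a confusion matrix for one pair of test users. The example below uses the slide's "My Recommendations" grid, read here as true positive Clue, false positive Jaws, false negative Babe, and true negative Big. The function name `confusion_cells` and the precision and recall summary are illustrative additions; the slides do not say which metric was reported.

```python
def confusion_cells(recommended, liked, candidates):
    """Sort each candidate movie into TP, FP, FN, or TN for one pair of users."""
    cells = {"TP": [], "FP": [], "FN": [], "TN": []}
    for movie in candidates:
        predicted = movie in recommended
        actual = movie in liked
        if predicted and actual:
            key = "TP"
        elif predicted:
            key = "FP"
        elif actual:
            key = "FN"
        else:
            key = "TN"
        cells[key].append(movie)
    return cells


# Illustrative inputs matching the grid on the slide.
candidates = ["Clue", "Jaws", "Babe", "Big"]
recommended = {"Clue", "Jaws"}  # movies the algorithm marked positive
liked = {"Clue", "Babe"}        # ground truth: movies both users rated highly

cells = confusion_cells(recommended, liked, candidates)
true_pos, false_pos, false_neg = (len(cells[k]) for k in ("TP", "FP", "FN"))
precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
print(cells, precision, recall)
```

Repeating this over every pair of users in the test set and summing the cells would give an overall picture of how often the recommendations agree with the combined ratings.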
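For one input movie per user, each candidate movie is scored from s1, its similarity to User 1's movie, and s2, its similarity to User 2's movie:

f(s1, s2) = mean(s1, s2) - α * diff(s1, s2)

For multiple input movies, where s_{1,k} is the similarity to User 1's k-th movie and s_{2,k} is the similarity to User 2's k-th movie:

f(s_{1,1}, s_{1,2}, …, s_{2,1}, s_{2,2}, …) = mean(s_{1,1}, s_{1,2}, …, s_{2,1}, s_{2,2}, …) - α * std(s_{1,1}, s_{1,2}, …, s_{2,1}, s_{2,2}, …) - β * diff(mean(s_{1,1}, s_{1,2}, …), mean(s_{2,1}, s_{2,2}, …))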
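A minimal sketch of the multiple-input formula follows, scoring one candidate movie from its similarities to each user's liked movies. The values α = 0.25 and β = 0.25 are placeholders, not values from the slides. Using the population standard deviation and the absolute difference of the two per-user means are also assumptions, since the slides only write `std` and `diff`.

```python
from statistics import mean, pstdev


def group_score(sims_1, sims_2, alpha=0.25, beta=0.25):
    """Score one candidate movie for two users with several liked movies each.

    sims_1 / sims_2: similarities of the candidate to each movie liked by
    User 1 / User 2. alpha and beta are assumed placeholder weights.
    """
    all_sims = list(sims_1) + list(sims_2)
    return (
        mean(all_sims)                                # overall appeal
        - alpha * pstdev(all_sims)                    # penalize spread across all inputs
        - beta * abs(mean(sims_1) - mean(sims_2))     # penalize one user being favored
    )


# Candidate Big, with User 1 liking Clue and Kids and User 2 liking Babe and Jaws
# (similarities to Big taken from the slide's matrix: 0.5, 0.3 and 0.5, 0.3).
print(round(group_score([0.5, 0.3], [0.5, 0.3]), 3))
```

When both users agree perfectly and every similarity is equal, both penalty terms vanish and the score is just the shared similarity.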