Factorization Machine

Weike Pan

College of Computer Science and Software Engineering, Shenzhen University

Outline

1 Notations and Problem Definition

2 Method
    Factorization Machine for Ratings
    Factorization Machine for Ratings and Content

3 Discussion

4 Conclusion

5 References

Notations and Problem Definition

Reference

libFM [Rendle, 2012]

Notations (1/4)

Table: Some notations.

$n$                                                   user number
$m$                                                   item number
$u \in \{1, 2, \ldots, n\}$                           user ID
$i, i' \in \{1, 2, \ldots, m\}$                       item ID
$r_{ui}$                                              observed rating of user $u$ on item $i$
$\mathbb{G}$, e.g., $\mathbb{G} = \{1, 2, 3, 4, 5\}$  grade score set (or rating range)
$\mathbf{R} \in \{\mathbb{G} \cup ?\}^{n \times m}$   rating matrix (or explicit feedback)
$y_{ui} \in \{0, 1\}$                                 $y_{ui} = 1$ if $(u, i, r_{ui})$ is observed, and $y_{ui} = 0$ if $(u, i, r_{ui})$ is not observed
$\mathcal{R} = \{(u, i, r_{ui})\}$                    observed rating records (training data)
$p = \sum_{u,i} y_{ui} = |\mathcal{R}|$               number of observed ratings
$p/n/m$                                               density (or sometimes called sparsity)
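As a small illustration of the last two rows, the sketch below counts the observed ratings and computes the density $p/n/m$. The function name `density` and the toy records are made up for this example and do not come from the slides.

```python
def density(records, n, m):
    """Return p / n / m, where p is the number of observed (u, i) pairs."""
    observed = {(u, i) for u, i, _ in records}   # pairs with y_ui = 1
    p = len(observed)
    return p / n / m


# Toy data (hypothetical): n = 3 users, m = 2 items, p = 3 observed ratings.
records = [(1, 2, 5), (3, 1, 2), (2, 2, 4)]
print(density(records, n=3, m=2))
```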

Notations (2/4)

Table: Some notations.

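$\mu \in \mathbb{R}$                      global average rating value
$b_u \in \mathbb{R}$                      user bias
$b_i \in \mathbb{R}$                      item bias
$d \in \mathbb{R}$                        number of latent dimensions
$U_{u\cdot} \in \mathbb{R}^{1 \times d}$  user-specific latent feature vector
$\mathbf{U} \in \mathbb{R}^{n \times d}$  user-specific latent feature matrix
$V_{i\cdot} \in \mathbb{R}^{1 \times d}$  item-specific latent feature vector
$\mathbf{V} \in \mathbb{R}^{m \times d}$  item-specific latent feature matrix
$\mathcal{R}^{te} = \{(u, i, r_{ui})\}$   rating records of test data
$\hat{r}_{ui}$                            predicted rating of user $u$ on item $i$
$T$                                       iteration number in the algorithm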

Notations (3/4)

Table: Some notations.

$\mathbf{f}_i \in \mathbb{R}^{f \times 1}$  description of item $i$

Notations (4/4)

Table: Some notations.

$w_0 \in \mathbb{R}$                      model parameter for zero-order interaction
$\mathbf{w} \in \mathbb{R}^{z \times 1}$  model parameter for first-order interaction
$\mathbf{P} \in \mathbb{R}^{z \times d}$  model parameter for second-order interaction

Method Factorization Machine for Ratings

Representation (1/3)

Representation (2/3)

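The original rating matrix $\mathbf{R} \in \mathbb{G}^{n \times m}$ is represented by a design matrix $\mathbf{X}$ and a rating vector $\mathbf{r}$,

$$\mathbf{X} \in \{0, 1\}^{p \times (n+m)}, \quad \mathbf{r} \in \mathbb{G}^{p \times 1} \tag{1}$$

where $p = \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui}$ is the number of ratings in $\mathbf{R}$.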

Representation (3/3)

For a rating record $(u, i, r_{ui})$:

The corresponding row of $\mathbf{X}$ is

$$\mathbf{x} = [\ldots 0 \ldots \underbrace{1}_{u} \ldots 0 \ldots \underbrace{1}_{n+i} \ldots 0 \ldots] \in \{0, 1\}^{1 \times (n+m)}$$

where the $u$th and $(n+i)$th entries are 1s, i.e., $x_u = x_{n+i} = 1$, and all other entries are 0s.

The corresponding entry of $\mathbf{r}$ is $r_{ui}$.
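The representation above can be sketched in a few lines of Python. This is only an illustration: the helper names `design_row` and `build_design` and the toy records are assumptions, and user and item IDs are taken to be 1-based as in the notation table.

```python
def design_row(u, i, n, m):
    """Return the one-hot row x in {0,1}^(n+m) for user u and item i (1-based)."""
    if not (1 <= u <= n and 1 <= i <= m):
        raise ValueError("user or item ID out of range")
    x = [0] * (n + m)
    x[u - 1] = 1          # the u-th entry
    x[n + i - 1] = 1      # the (n+i)-th entry
    return x


def build_design(records, n, m):
    """Turn rating records [(u, i, r_ui), ...] into the pair (X, r)."""
    X = [design_row(u, i, n, m) for u, i, _ in records]
    r = [r_ui for _, _, r_ui in records]
    return X, r


# Toy data (hypothetical): n = 3 users, m = 2 items, p = 3 ratings.
records = [(1, 2, 5), (3, 1, 2), (2, 2, 4)]
X, r = build_design(records, n=3, m=2)
for row, rating in zip(X, r):
    print(row, rating)
```

Each row has exactly two non-zero entries, so in practice a sparse format (storing only the two active indices) would usually be preferred over dense lists.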

Prediction Rule (1/3)

The prediction rule of user $u$ on item $i$ is

$$\hat{r}_{ui} = w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} w_{jj'} \tag{2}$$

where $w_0, w_j, w_{jj'} \in \mathbb{R}$.

The above prediction rule includes zero-, first- and second-order interactions.

Prediction Rule (2/3)

The second-order interaction is usually approximated via the inner product of two vectors,

$$w_{jj'} = P_{j\cdot} P_{j'\cdot}^T \tag{3}$$

where $P_{j\cdot}, P_{j'\cdot} \in \mathbb{R}^{1 \times d}$.

Prediction Rule (3/3)

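For the rating data only (without content information), we have

$$\begin{aligned}
\hat{r}_{ui} &= w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \\
&= w_0 + w_u + w_{n+i} + P_{u\cdot} P_{n+i,\cdot}^T \\
&\Rightarrow \mu + b_u + b_i + U_{u\cdot} V_{i\cdot}^T
\end{aligned}$$

where $\mathbf{P} = [\mathbf{U}^T \ \mathbf{V}^T]^T \in \mathbb{R}^{(n+m) \times d}$.

Observation: for FM with rating only, it is equivalent to RSVD.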
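The observation can be checked numerically. The sketch below evaluates the FM double sum on a one-hot row and compares it with $\mu + b_u + b_i + U_{u\cdot} V_{i\cdot}^T$ under the mapping $w_0 = \mu$, $\mathbf{w} = [b_u; b_i]$, $\mathbf{P} = [\mathbf{U}; \mathbf{V}]$. All parameter values and function names are hypothetical and chosen only for the example.

```python
def fm_predict(x, w0, w, P):
    """FM prediction with w_jj' = P_j. P_j'.^T, written as the plain double sum."""
    z = len(x)
    pred = w0 + sum(w[j] * x[j] for j in range(z))
    for j in range(z):
        for k in range(j + 1, z):
            if x[j] != 0 and x[k] != 0:
                pred += x[j] * x[k] * sum(a * b for a, b in zip(P[j], P[k]))
    return pred


def rsvd_predict(u, i, mu, b_user, b_item, U, V):
    """mu + b_u + b_i + U_u. V_i.^T, with 1-based user and item IDs."""
    dot = sum(a * b for a, b in zip(U[u - 1], V[i - 1]))
    return mu + b_user[u - 1] + b_item[i - 1] + dot


# Hypothetical toy parameters: n = 2 users, m = 3 items, d = 2.
n, m = 2, 3
mu = 3.5
b_user = [0.1, -0.2]
b_item = [0.3, 0.0, -0.4]
U = [[0.5, 1.0], [-1.0, 0.25]]
V = [[1.0, 0.5], [0.0, 2.0], [0.5, -0.5]]

# Map the RSVD parameters onto FM: w0 = mu, w = [b_user; b_item], P = [U; V].
w0, w, P = mu, b_user + b_item, U + V

u, i = 2, 3
x = [0] * (n + m)
x[u - 1] = 1
x[n + i - 1] = 1
print(fm_predict(x, w0, w, P), rsvd_predict(u, i, mu, b_user, b_item, U, V))
```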

Question

Why do we need FM?

Method Factorization Machine for Ratings and Content

Representation

When we have some content information of each item or each user (e.g., an item’s description or a user’s profile), the row $\mathbf{x}$ of the design matrix is extended with the corresponding content features.

Prediction Rule

The prediction rule of user $u$ on item $i$ is

$$\hat{r}_{ui} = w_0 + \sum_{j=1}^{z} w_j x_j + \sum_{j=1}^{z} \sum_{j'=j+1}^{z} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \tag{4}$$

where $z = n + m + f$.
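A minimal sketch of Eq. (4), assuming the row is laid out as [one-hot user | one-hot item | item description], which gives length $z = n + m + f$. The literal double sum is compared with the algebraically equivalent form $\frac{1}{2} \sum_k [(\sum_j x_j P_{jk})^2 - \sum_j x_j^2 P_{jk}^2]$, a standard identity from the FM literature that is not stated on the slides. All names and numbers are made up for illustration.

```python
def content_row(u, i, n, m, item_features):
    """x = [one-hot user | one-hot item | f_i], so len(x) = z = n + m + f."""
    x = [0.0] * (n + m)
    x[u - 1] = 1.0
    x[n + i - 1] = 1.0
    return x + list(item_features[i - 1])


def fm_predict_naive(x, w0, w, P):
    """Eq. (4) evaluated literally: O(z^2 d) operations."""
    z = len(x)
    pred = w0 + sum(w[j] * x[j] for j in range(z))
    for j in range(z):
        for k in range(j + 1, z):
            pred += x[j] * x[k] * sum(a * b for a, b in zip(P[j], P[k]))
    return pred


def fm_predict_fast(x, w0, w, P):
    """Same value via 0.5 * sum_k [(sum_j x_j P_jk)^2 - sum_j (x_j P_jk)^2]."""
    nz = [j for j, xj in enumerate(x) if xj != 0]   # only non-zero features matter
    pred = w0 + sum(w[j] * x[j] for j in nz)
    for k in range(len(P[0])):
        s = sum(x[j] * P[j][k] for j in nz)
        s_sq = sum((x[j] * P[j][k]) ** 2 for j in nz)
        pred += 0.5 * (s * s - s_sq)
    return pred


# Hypothetical toy setting: n = 2, m = 2, f = 3 content features per item, d = 2.
n, m, f = 2, 2, 3
item_features = [[1.0, 0.0, 0.5], [0.0, 2.0, 0.0]]   # f_1 and f_2 (made up)
z = n + m + f
w0 = 0.5
w = [0.1 * (j + 1) for j in range(z)]
P = [[0.2 * (j + 1), -0.1 * j] for j in range(z)]

x = content_row(1, 1, n, m, item_features)
print(fm_predict_naive(x, w0, w, P), fm_predict_fast(x, w0, w, P))
```

The second form touches only the non-zero entries of $\mathbf{x}$, which matters because the rows are very sparse.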

Objective Function

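We have the objective function,

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \left[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(\mathbf{w}, \mathbf{P}) \right]$$

where $\mathrm{reg}(\mathbf{w}, \mathbf{P}) = \frac{\alpha_w}{2} \sum_{j=1}^{z} \delta(x_j \neq 0) w_j^2 + \frac{\alpha_p}{2} \sum_{j=1}^{z} \delta(x_j \neq 0) \|P_{j\cdot}\|^2$ is the regularization term used to avoid overfitting, and $\Theta = \{w_0, \mathbf{w}, \mathbf{P}\}$ are the model parameters to be learned.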

Gradient

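Denoting $f_{ui} = \frac{1}{2}(r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(\mathbf{w}, \mathbf{P})$, for each $(u, i, r_{ui}) \in \mathcal{R}$, we have the gradients,

$$\nabla w_0 = \frac{\partial f_{ui}}{\partial w_0} = -e_{ui} \tag{5}$$

$$\nabla w_j = \frac{\partial f_{ui}}{\partial w_j} = -e_{ui} x_j + \alpha_w w_j, \quad \forall x_j \neq 0 \tag{6}$$

$$\nabla P_{j\cdot} = \frac{\partial f_{ui}}{\partial P_{j\cdot}} = -e_{ui} x_j \sum_{j' \neq j}^{z} x_{j'} P_{j'\cdot} + \alpha_p P_{j\cdot}, \quad \forall x_j \neq 0 \tag{7}$$

where $e_{ui} = r_{ui} - \hat{r}_{ui}$.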
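The gradients in Eqs. (5)-(7) can be written down directly and sanity-checked against finite differences of $f_{ui}$. The sketch below is one possible implementation, not the libFM code; the function names, the toy record and the values of $\alpha_w$ and $\alpha_p$ are assumptions.

```python
def predict(x, w0, w, P):
    """Prediction rule of Eq. (4), looping only over the non-zero entries of x."""
    nz = [j for j, xj in enumerate(x) if xj != 0]
    pred = w0 + sum(w[j] * x[j] for j in nz)
    for a, j in enumerate(nz):
        for k in nz[a + 1:]:
            pred += x[j] * x[k] * sum(p * q for p, q in zip(P[j], P[k]))
    return pred


def loss(x, r, w0, w, P, alpha_w, alpha_p):
    """f_ui = 0.5 * (r - r_hat)^2 + reg(w, P), regularizing active features only."""
    nz = [j for j, xj in enumerate(x) if xj != 0]
    reg = 0.5 * alpha_w * sum(w[j] ** 2 for j in nz)
    reg += 0.5 * alpha_p * sum(v * v for j in nz for v in P[j])
    return 0.5 * (r - predict(x, w0, w, P)) ** 2 + reg


def gradients(x, r, w0, w, P, alpha_w, alpha_p):
    """Eqs. (5)-(7): returns (grad_w0, {j: grad_w_j}, {j: grad_P_j}) for x_j != 0."""
    nz = [j for j, xj in enumerate(x) if xj != 0]
    d = len(P[0])
    e = r - predict(x, w0, w, P)                       # e_ui
    g_w0 = -e                                          # Eq. (5)
    g_w = {j: -e * x[j] + alpha_w * w[j] for j in nz}  # Eq. (6)
    # total[k] = sum_j x_j P_jk, so the "j' != j" sum is total minus the j-th term.
    total = [sum(x[j] * P[j][k] for j in nz) for k in range(d)]
    g_P = {
        j: [-e * x[j] * (total[k] - x[j] * P[j][k]) + alpha_p * P[j][k]
            for k in range(d)]
        for j in nz
    }                                                  # Eq. (7)
    return g_w0, g_w, g_P


# Hypothetical toy record: z = 5 features, d = 2, rating r = 4.
x = [1.0, 0.0, 1.0, 0.0, 0.5]
w0, w = 0.1, [0.2, -0.1, 0.3, 0.0, 0.4]
P = [[0.1, 0.2], [0.3, -0.2], [-0.4, 0.5], [0.0, 0.1], [0.6, -0.3]]
g_w0, g_w, g_P = gradients(x, 4.0, w0, w, P, alpha_w=0.01, alpha_p=0.01)
print(g_w0, g_w, g_P)
```

Only the parameters of features with $x_j \neq 0$ receive a gradient, which is why the two dictionaries are keyed by the active indices.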

Update Rule

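For each $(u, i, r_{ui}) \in \mathcal{R}$, we have the update rules,

$$w_0 = w_0 - \gamma \nabla w_0 \tag{8}$$

$$w_j = w_j - \gamma \nabla w_j, \quad \forall x_j \neq 0 \tag{9}$$

$$P_{j\cdot} = P_{j\cdot} - \gamma \nabla P_{j\cdot}, \quad \forall x_j \neq 0 \tag{10}$$

where $\gamma$ is the learning rate.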

Algorithm

Initialize model parameters $\Theta$
for $t = 1, \ldots, T$ do
    for $t_2 = 1, \ldots, p$ do
        Randomly pick up a rating from $\mathcal{R}$
        Calculate the gradients in Eq.(5-7)
        Update the parameters in Eq.(8-10)
    end for
    Decrease the learning rate $\gamma \leftarrow \gamma \times 0.9$
end for

Figure: The SGD algorithm for FM.

Note that the above algorithm is slightly different from that in libFM [Rendle, 2012].
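The SGD loop above could be sketched in plain Python as follows. This is an illustrative reading of the figure, not libFM itself: the initialization scale, the default values of `T`, `gamma`, `alpha_w` and `alpha_p`, the helper names and the toy ratings are all assumptions.

```python
import random


def predict(x, w0, w, P):
    """Prediction rule of Eq. (4), looping only over the non-zero entries of x."""
    nz = [j for j, xj in enumerate(x) if xj != 0]
    pred = w0 + sum(w[j] * x[j] for j in nz)
    for a, j in enumerate(nz):
        for k in nz[a + 1:]:
            pred += x[j] * x[k] * sum(p * q for p, q in zip(P[j], P[k]))
    return pred


def sgd_fm(data, z, d, T=50, gamma=0.05, alpha_w=0.01, alpha_p=0.01, seed=0):
    """data: list of (x, r) pairs with len(x) == z. Returns (w0, w, P)."""
    rng = random.Random(seed)
    # Initialize Theta (small random P; the 0.01 scale is an assumption).
    w0, w = 0.0, [0.0] * z
    P = [[(rng.random() - 0.5) * 0.01 for _ in range(d)] for _ in range(z)]
    for _ in range(T):
        for _ in range(len(data)):
            x, r = rng.choice(data)                     # randomly pick a rating
            nz = [j for j, xj in enumerate(x) if xj != 0]
            e = r - predict(x, w0, w, P)                # e_ui
            total = [sum(x[j] * P[j][k] for j in nz) for k in range(d)]
            # Gradients, Eqs. (5)-(7), all computed before any parameter changes.
            g_w0 = -e
            g_w = {j: -e * x[j] + alpha_w * w[j] for j in nz}
            g_P = {j: [-e * x[j] * (total[k] - x[j] * P[j][k]) + alpha_p * P[j][k]
                       for k in range(d)] for j in nz}
            # Updates, Eqs. (8)-(10).
            w0 -= gamma * g_w0
            for j in nz:
                w[j] -= gamma * g_w[j]
                P[j] = [P[j][k] - gamma * g_P[j][k] for k in range(d)]
        gamma *= 0.9                                    # decrease the learning rate
    return w0, w, P


def rmse(data, w0, w, P):
    """Root mean squared error of the predictions on data."""
    return (sum((r - predict(x, w0, w, P)) ** 2 for x, r in data) / len(data)) ** 0.5


# Toy ratings (made up): n = 3 users, m = 2 items, one-hot rows for ratings only.
n, m = 3, 2
records = [(1, 1, 5), (1, 2, 3), (2, 1, 4), (3, 2, 1), (2, 2, 2)]
data = []
for u, i, r in records:
    x = [0.0] * (n + m)
    x[u - 1] = x[n + i - 1] = 1.0
    data.append((x, float(r)))

before = rmse(data, 0.0, [0.0] * (n + m), [[0.0, 0.0] for _ in range(n + m)])
w0, w, P = sgd_fm(data, z=n + m, d=2)
after = rmse(data, w0, w, P)
print(before, after)
```

Content features would enter simply as extra columns of each `x`; the training loop itself does not change.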

Discussion

Can FM incorporate an item’s taxonomy information?

Can FM incorporate a user’s profile?

Can FM incorporate a user’s social connections?

Can FM incorporate the context information such as location and time?

Conclusion

FM can incorporate auxiliary data seamlessly.

Homework

Use the libFM software at http://www.libfm.org/

Read the libFM paper [Rendle, 2012]

Read chapter 3 of Recommender Systems: An Introduction

References

Rendle, S. (2012). Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology, 3(3):57:1–57:22.
