
Factorization Machine

Weike Pan

College of Computer Science and Software Engineering
Shenzhen University

W.K. Pan (CSSE, SZU) FM 1 / 23

Outline

1 Notations and Problem Definition

2 Method
    Factorization Machine for Ratings
    Factorization Machine for Ratings and Content

3 Discussion

4 Conclusion

5 References


Notations and Problem Definition

Reference

libFM [Rendle, 2012]


Notations (1/4)

Table: Some notations.

$n$: number of users
$m$: number of items
$u \in \{1, 2, \ldots, n\}$: user ID
$i, i' \in \{1, 2, \ldots, m\}$: item ID
$r_{ui}$: observed rating of user $u$ on item $i$
$\mathbb{G}$ (e.g., $\mathbb{G} = \{1, 2, 3, 4, 5\}$): grade score set (or rating range)
$\mathbf{R} \in \{\mathbb{G} \cup ?\}^{n \times m}$: rating matrix (or explicit feedback)
$y_{ui} \in \{0, 1\}$: indicator, with $y_{ui} = 1$ if $(u, i, r_{ui})$ is observed and $y_{ui} = 0$ if $(u, i, r_{ui})$ is not observed
$\mathcal{R} = \{(u, i, r_{ui})\}$: observed rating records (training data)
$p = \sum_{u,i} y_{ui} = |\mathcal{R}|$: number of observed ratings
$p/n/m$: density (or sometimes called sparsity)


Notations (2/4)

Table: Some notations.

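$\mu \in \mathbb{R}$: global average rating value
$b_u \in \mathbb{R}$: user bias
$b_i \in \mathbb{R}$: item bias
$d \in \mathbb{N}$: number of latent dimensions
$U_{u\cdot} \in \mathbb{R}^{1 \times d}$: user-specific latent feature vector
$\mathbf{U} \in \mathbb{R}^{n \times d}$: user-specific latent feature matrix
$V_{i\cdot} \in \mathbb{R}^{1 \times d}$: item-specific latent feature vector
$\mathbf{V} \in \mathbb{R}^{m \times d}$: item-specific latent feature matrix
$\mathcal{R}^{te} = \{(u, i, r_{ui})\}$: rating records of test data
$\hat{r}_{ui}$: predicted rating of user $u$ on item $i$
$T$: iteration number in the algorithm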


Notations (3/4)

Table: Some notations.

$\mathbf{f}_i \in \mathbb{R}^{f \times 1}$: description of item $i$


Notations (4/4)

Table: Some notations.

$w_0 \in \mathbb{R}$: model parameter for zero-order interaction
$\mathbf{w} \in \mathbb{R}^{z \times 1}$: model parameter for first-order interaction
$\mathbf{P} \in \mathbb{R}^{z \times d}$: model parameter for second-order interaction


Method Factorization Machine for Ratings

Representation (1/3)


Representation (2/3)

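The original rating matrix $\mathbf{R} \in \mathbb{G}^{n \times m}$ is represented by a design matrix $\mathbf{X}$ and a rating vector $\mathbf{r}$,

$$\mathbf{X} \in \{0,1\}^{p \times (n+m)}, \quad \mathbf{r} \in \mathbb{G}^{p \times 1} \tag{1}$$

where $p = \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui}$ is the number of ratings in $\mathbf{R}$.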


Representation (3/3)

For a rating record $(u, i, r_{ui})$:

The corresponding row of $\mathbf{X}$ is

$$\mathbf{x} = [\, \ldots 0 \ldots \underbrace{1}_{u} \ldots 0 \ldots \underbrace{1}_{n+i} \ldots 0 \ldots \,] \in \{0,1\}^{1 \times (n+m)}$$

where the $u$th and $(n+i)$th entries are 1s, i.e., $x_u = x_{n+i} = 1$, and all other entries are 0s.

The corresponding entry of $\mathbf{r}$ is $r_{ui}$.
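The representation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_design_matrix` and the toy records are hypothetical, and since Python lists are 0-based, user $u$ is mapped to column $u - 1$ and item $i$ to column $n + i - 1$.

```python
def build_design_matrix(records, n, m):
    """Turn rating records (u, i, r_ui) into a design matrix X and a rating vector r.

    User and item IDs are 1-based as in the slides; Python columns are
    0-based, so user u maps to column u - 1 and item i to column n + i - 1.
    """
    X, r = [], []
    for u, i, r_ui in records:
        row = [0] * (n + m)
        row[u - 1] = 1        # x_u = 1
        row[n + i - 1] = 1    # x_{n+i} = 1
        X.append(row)
        r.append(r_ui)
    return X, r


# Toy data (hypothetical): n = 3 users, m = 4 items, p = 3 observed ratings.
records = [(1, 2, 5), (3, 1, 2), (2, 4, 4)]
X, r = build_design_matrix(records, n=3, m=4)
for row, rating in zip(X, r):
    print(row, rating)
```

Every row contains exactly two 1s, so a practical implementation would likely store only the two column indices per record instead of a dense row.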


Prediction Rule (1/3)

The prediction rule of user $u$ on item $i$ is

$$\hat{r}_{ui} = w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} w_{jj'} \tag{2}$$

where $w_0, w_j, w_{jj'} \in \mathbb{R}$.

The above prediction rule includes zero-, first- and second-order interactions.


Prediction Rule (2/3)

The second-order interaction is usually approximated via the inner product of two vectors,

$$w_{jj'} = P_{j\cdot} P_{j'\cdot}^T \tag{3}$$

where $P_{j\cdot}, P_{j'\cdot} \in \mathbb{R}^{1 \times d}$.
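As a rough illustration, the prediction rule in Eq. (2) with the factorized weights of Eq. (3) can be written as a literal double sum. The second function below is not from the slides: it uses the well-known identity $\sum_{j<j'} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T = \frac{1}{2} \sum_{f=1}^{d} \big[ (\sum_j P_{jf} x_j)^2 - \sum_j P_{jf}^2 x_j^2 \big]$ to get the same value in time linear in the number of features. All function names and numbers are hypothetical.

```python
def fm_predict_naive(x, w0, w, P):
    """Eq. (2) with w_jj' = P_j. P_j'.^T from Eq. (3), as a literal double sum."""
    z = len(x)
    y = w0 + sum(w[j] * x[j] for j in range(z))
    for j in range(z):
        for k in range(j + 1, z):
            dot = sum(a * b for a, b in zip(P[j], P[k]))
            y += x[j] * x[k] * dot
    return y


def fm_predict_fast(x, w0, w, P):
    """Same value via 0.5 * sum_f [(sum_j P_jf x_j)^2 - sum_j (P_jf x_j)^2]."""
    z, d = len(x), len(P[0])
    y = w0 + sum(w[j] * x[j] for j in range(z))
    for f in range(d):
        s = sum(P[j][f] * x[j] for j in range(z))
        s_sq = sum((P[j][f] * x[j]) ** 2 for j in range(z))
        y += 0.5 * (s * s - s_sq)
    return y


# Hypothetical parameters for z = 4 features and d = 2 latent dimensions.
x = [1.0, 0.0, 1.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.4]
P = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]

naive = fm_predict_naive(x, w0, w, P)
fast = fm_predict_fast(x, w0, w, P)
print(naive, fast)
```

The factorization replaces the $z(z-1)/2$ free weights $w_{jj'}$ by $z \times d$ parameters, which is what makes the model trainable on sparse data.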


Prediction Rule (3/3)

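For the rating data only (without content information), we have

$$\begin{aligned}
\hat{r}_{ui} &= w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \\
&= w_0 + w_u + w_{n+i} + P_{u\cdot} P_{n+i,\cdot}^T \\
&\Rightarrow \mu + b_u + b_i + U_{u\cdot} V_{i\cdot}^T
\end{aligned}$$

where $\mathbf{P} = [\mathbf{U}^T \; \mathbf{V}^T]^T \in \mathbb{R}^{(n+m) \times d}$.

Observation: FM with ratings only is equivalent to RSVD.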
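The step from the double sum to the closed form can be spelled out as follows. This is a short worked derivation, assuming the one-hot row with $x_u = x_{n+i} = 1$ and all other entries equal to 0, so that every term containing another $x_j$ vanishes:

```latex
\begin{aligned}
\sum_{j=1}^{n+m} w_j x_j
  &= w_u x_u + w_{n+i} x_{n+i} = w_u + w_{n+i}, \\
\sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T
  &= x_u x_{n+i} P_{u\cdot} P_{n+i,\cdot}^T = P_{u\cdot} P_{n+i,\cdot}^T .
\end{aligned}
```

Only the pair $(j, j') = (u, n+i)$ survives, and it appears exactly once because $u \le n < n+i$. Identifying $w_0 = \mu$, $w_u = b_u$, $w_{n+i} = b_i$, $P_{u\cdot} = U_{u\cdot}$ and $P_{n+i,\cdot} = V_{i\cdot}$ then gives the RSVD prediction rule.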


Question

Why do we need FM?


Method Factorization Machine for Ratings and Content

Representation

When we have some content information of each item or each user (e.g., an item's description or a user's profile), it can be included in the row $\mathbf{x}$ as additional features.


Prediction Rule

The prediction rule of user $u$ on item $i$ is

$$\hat{r}_{ui} = w_0 + \sum_{j=1}^{z} w_j x_j + \sum_{j=1}^{z} \sum_{j'=j+1}^{z} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \tag{4}$$

where $z = n + m + f$.
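One way to obtain a row of length $z = n + m + f$ is sketched below. The layout is an assumption, since the slides do not show it: the item description $\mathbf{f}_i$ is simply appended after the one-hot user and item blocks. The function name `build_row_with_content` and the dictionary `item_features` are hypothetical.

```python
def build_row_with_content(u, i, n, m, item_features):
    """Row x of length z = n + m + f: one-hot user, one-hot item, then f_i.

    Assumed layout (not specified in the slides): the item description f_i
    is appended after the n user columns and the m item columns.
    """
    f_i = item_features[i]                 # hypothetical lookup: item ID -> list of f values
    row = [0.0] * (n + m) + list(f_i)
    row[u - 1] = 1.0                       # x_u = 1 (1-based user ID)
    row[n + i - 1] = 1.0                   # x_{n+i} = 1 (1-based item ID)
    return row


# Toy example: n = 2 users, m = 3 items, f = 2 content features per item.
item_features = {1: [0.5, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
row = build_row_with_content(u=2, i=1, n=2, m=3, item_features=item_features)
print(row)
```

With such a row, the same prediction rule applies unchanged; the content features just add extra terms to the first- and second-order sums.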


Objective Function

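We have the objective function,

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \left[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(\mathbf{w}, \mathbf{P}) \right]$$

where $\mathrm{reg}(\mathbf{w}, \mathbf{P}) = \frac{\alpha_w}{2} \sum_{j=1}^{z} \delta(x_j \neq 0) w_j^2 + \frac{\alpha_p}{2} \sum_{j=1}^{z} \delta(x_j \neq 0) \|P_{j\cdot}\|^2$ is the regularization term used to avoid overfitting, and $\Theta = \{w_0, \mathbf{w}, \mathbf{P}\}$ are the model parameters to be learned.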
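The per-record term of this objective can be sketched as follows. The point of the sketch is the indicator $\delta(x_j \neq 0)$: only parameters of features that are active in the record are penalized. The function names and the numbers are hypothetical.

```python
def reg(x, w, P, alpha_w, alpha_p):
    """reg(w, P) for one record: only features with x_j != 0 are penalized."""
    total = 0.0
    for j, x_j in enumerate(x):
        if x_j != 0:
            total += 0.5 * alpha_w * w[j] ** 2
            total += 0.5 * alpha_p * sum(v * v for v in P[j])
    return total


def record_loss(r_ui, r_hat, x, w, P, alpha_w, alpha_p):
    """f_ui = 0.5 * (r_ui - r_hat)^2 + reg(w, P)."""
    return 0.5 * (r_ui - r_hat) ** 2 + reg(x, w, P, alpha_w, alpha_p)


# Hypothetical values: feature 1 is inactive (x_1 = 0), so its large
# parameters do not contribute to the penalty.
x = [1, 0, 1]
w = [1.0, 5.0, 2.0]
P = [[1.0, 0.0], [9.0, 9.0], [0.0, 2.0]]
penalty = reg(x, w, P, alpha_w=0.1, alpha_p=0.2)
f_ui = record_loss(4.0, 3.0, x, w, P, alpha_w=0.1, alpha_p=0.2)
print(penalty, f_ui)
```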


Gradient

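Denoting $f_{ui} = \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(\mathbf{w}, \mathbf{P})$, for each $(u, i, r_{ui}) \in \mathcal{R}$, we have the gradients,

$$\nabla w_0 = \frac{\partial f_{ui}}{\partial w_0} = -e_{ui} \tag{5}$$

$$\nabla w_j = \frac{\partial f_{ui}}{\partial w_j} = -e_{ui} x_j + \alpha_w w_j, \quad \forall x_j \neq 0 \tag{6}$$

$$\nabla P_{j\cdot} = \frac{\partial f_{ui}}{\partial P_{j\cdot}} = -e_{ui} x_j \sum_{j' \neq j}^{z} x_{j'} P_{j'\cdot} + \alpha_p P_{j\cdot}, \quad \forall x_j \neq 0 \tag{7}$$

where $e_{ui} = r_{ui} - \hat{r}_{ui}$.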
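The gradients in Eqs. (5)-(7) can be checked numerically. The sketch below implements them directly and compares one entry of $\nabla P_{j\cdot}$ with a central finite difference of $f_{ui}$. All names, parameter values and the step size `eps` are hypothetical choices for illustration.

```python
def predict(x, w0, w, P):
    """FM prediction with factorized second-order weights."""
    z, d = len(x), len(P[0])
    y = w0 + sum(w[j] * x[j] for j in range(z))
    for j in range(z):
        for k in range(j + 1, z):
            y += x[j] * x[k] * sum(P[j][f] * P[k][f] for f in range(d))
    return y


def loss(r_ui, x, w0, w, P, alpha_w, alpha_p):
    """f_ui = 0.5 * e^2 + reg(w, P), penalizing only features with x_j != 0."""
    e = r_ui - predict(x, w0, w, P)
    penalty = sum(0.5 * alpha_w * w[j] ** 2 + 0.5 * alpha_p * sum(v * v for v in P[j])
                  for j in range(len(x)) if x[j] != 0)
    return 0.5 * e * e + penalty


def gradients(r_ui, x, w0, w, P, alpha_w, alpha_p):
    """Eqs. (5)-(7); entries for features with x_j == 0 stay at zero."""
    z, d = len(x), len(P[0])
    e = r_ui - predict(x, w0, w, P)
    g_w0 = -e
    g_w = [0.0] * z
    g_P = [[0.0] * d for _ in range(z)]
    for j in range(z):
        if x[j] == 0:
            continue
        g_w[j] = -e * x[j] + alpha_w * w[j]
        for f in range(d):
            others = sum(x[k] * P[k][f] for k in range(z) if k != j)
            g_P[j][f] = -e * x[j] * others + alpha_p * P[j][f]
    return g_w0, g_w, g_P


def numeric_grad_P(r_ui, x, w0, w, P, alpha_w, alpha_p, j, f, eps=1e-6):
    """Central finite difference of the loss with respect to P[j][f]."""
    P_plus = [row[:] for row in P]
    P_minus = [row[:] for row in P]
    P_plus[j][f] += eps
    P_minus[j][f] -= eps
    up = loss(r_ui, x, w0, w, P_plus, alpha_w, alpha_p)
    down = loss(r_ui, x, w0, w, P_minus, alpha_w, alpha_p)
    return (up - down) / (2 * eps)


# Hypothetical record and parameters (z = 4, d = 2).
x = [1.0, 0.0, 1.0, 0.5]
w0, w = 0.1, [0.2, -0.1, 0.3, 0.4]
P = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
r_ui, alpha_w, alpha_p = 4.0, 0.01, 0.01

g_w0, g_w, g_P = gradients(r_ui, x, w0, w, P, alpha_w, alpha_p)
numeric = numeric_grad_P(r_ui, x, w0, w, P, alpha_w, alpha_p, j=2, f=0)
print(g_P[2][0], numeric)
```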


Update Rule

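For each $(u, i, r_{ui}) \in \mathcal{R}$, we have the update rules,

$$w_0 = w_0 - \gamma \nabla w_0 \tag{8}$$

$$w_j = w_j - \gamma \nabla w_j, \quad \forall x_j \neq 0 \tag{9}$$

$$P_{j\cdot} = P_{j\cdot} - \gamma \nabla P_{j\cdot}, \quad \forall x_j \neq 0 \tag{10}$$

where $\gamma$ is the learning rate.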


Algorithm

Initialize model parameters Θ
for t = 1, . . . , T do
    for t2 = 1, . . . , p do
        Randomly pick up a rating from R
        Calculate the gradients in Eq.(5-7)
        Update the parameters in Eq.(8-10)
    end for
    Decrease the learning rate γ ← γ × 0.9
end for

Figure: The SGD algorithm for FM.

Note that the above algorithm is slightly different from that in libFM [Rendle, 2012].


Discussion

Can FM incorporate an item’s taxonomy information?

Can FM incorporate a user’s profile?

Can FM incorporate a user’s social connections?

Can FM incorporate the context information such as location and time?


Conclusion

FM can incorporate auxiliary data seamlessly.


Homework

Use the libFM software at http://www.libfm.org/

Read the libFM paper [Rendle, 2012]

Read chapter 3 of Recommender Systems: An Introduction


References

Rendle, S. (2012). Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology, 3(3):57:1–57:22.
