Cold Start Solutions for Recommender Systems
Amin Mantrach, amantrac@yahoo-inc.com
Research Scientist – Yahoo Labs Barcelona


Outline

§ Recommending at cold start;
  › Learning representations for the item cold start:
    • Recommending cold articles to users;
  › Enrich user profiles by using users’ implicit feedback:
    • Learning representations for completing the user profile;
  › Enrich user profiles by using query logs;
§ Discussions: Matrix factorization and skip gram.


RECOMMENDING AT COLD START


Item Cold Start Problem on Yahoo Properties

§ The majority of the items (~80%) are never shown or clicked;

§ Personalization uses content as the main signal (CTR cannot be used on cold items);

§ Motivations: why recommend cold-start items?
  › Diversify the offer;
  › Avoid the “Kim Kardashian” effect;
  › Avoid advertising inventory selling out quickly.


Weakly-engaged users on Yahoo Properties


§ User engagement is power-law distributed;
  → ~80% of the users have sparse profiles: on Netflix, Amazon, Yahoo News and Yahoo Search.
§ In other words, we are facing a coverage problem;
§ Recommendations cannot be effective for the majority of the users due to the sparsity of their profiles.

[Figure: user coverage (y-axis, 0 to 1) against number of clicks (x-axis, 4 to 300).]
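A coverage curve of this kind can be computed from per-user click counts. The sketch below is only illustrative: it assumes that coverage means the cumulative share of users with at most a given number of clicks, and it uses synthetic heavy-tailed counts (`clicks_per_user` is a stand-in, not Yahoo data).

```python
import numpy as np

def coverage_curve(clicks_per_user, thresholds):
    """Fraction of users whose click count is at most each threshold."""
    clicks = np.asarray(clicks_per_user)
    return np.array([(clicks <= t).mean() for t in thresholds])

# Synthetic heavy-tailed click counts (an assumption, standing in for real logs).
rng = np.random.default_rng(0)
clicks_per_user = np.floor(rng.pareto(a=1.2, size=10_000)).astype(int) + 1

thresholds = [4, 50, 100, 150, 200, 250, 300]
coverage = coverage_curve(clicks_per_user, thresholds)
for t, c in zip(thresholds, coverage):
    print(f"<= {t:3d} clicks: {c:.1%} of users")
```

With a heavy-tailed distribution, most of the mass sits at the lowest thresholds, which is the sparsity issue raised on this slide.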


Drawbacks of state-of-the-art in cold-start recommendations

Item cold start
§ State-of-the-art collaborative filtering approaches cannot be applied [Koren et al. 2009, Matrix factorization techniques for recommender systems; Rendle et al., UAI 2009, Bayesian personalized ranking from implicit feedback];
§ Basic approaches rely on content-based (CB) models;
§ State-of-the-art for item cold start consists of hybrid methods [Agarwal et al., WSDM 2010, fLDA; Gantner, ICDM 2010, learning mappings…].

Weakly-engaged user
§ 80% of the users are weakly engaged and thus have sparse profiles;
§ State-of-the-art user profile enrichment techniques rely on
  › kNN to enrich the user profile (this does not work for the weakly-engaged user);
  › external information, which has low coverage.


Our contributions to the cold start


§ Novel collective learning representation framework:
  › Common framework:
    • We solve both the item cold start and the user cold start;
    • Our representations are interpretable (non-negative) and can be used to reconstruct the user profile;
    • Our implementation relies on simple alternating least squares (ALS) or multiplicative updates (MU).

§ Weakly-engaged users:
    • We complete user profiles better than the state of the art.


The cold start: Research Questions


Content + User feedback → Collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information to outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]

Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using implicit user feedback? [ongoing work]
3. Can we use query logs as external information to improve recommendations on Homerun? [patent filed]


The cold start: Research Questions


Content + User feedback + Matrix factorization → Collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information to outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]

Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using internal user feedback?
3. Can we use external information, such as query logs, to improve recommendations? [patent filed]


1. RECOMMEND FOR THE ITEM COLD START


Collective Representation Learning

§ Why collective?
  › It allows learning from multiple sources: users’ feedback + items’ features.

§ Why representation?
  › By learning embeddings we extract latent factors that capture the essence of the data.

§ Why collective representation for cold start?
  › When observing just one view we can reconstruct the missing one.

By projecting items’ features onto the joint representation, we can reconstruct the user’s missing items.
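The projection step can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the factor matrices `H_s` (latent factors to features) and `H_u` (latent factors to users) are taken as already learned, the function name is hypothetical, and a plain least-squares solve followed by clipping stands in for whatever projection the actual system uses.

```python
import numpy as np

def predict_users_for_cold_item(x_s, H_s, H_u):
    """Project a cold item's features onto the shared space, then map to users.

    x_s : (n_features,) feature vector of the new item
    H_s : (k, n_features) latent factors -> features
    H_u : (k, n_users)    latent factors -> users
    """
    # Least-squares solve of x_s ≈ w @ H_s for the item's latent vector w.
    w, *_ = np.linalg.lstsq(H_s.T, x_s, rcond=None)
    w = np.clip(w, 0.0, None)   # keep the representation non-negative
    return w @ H_u              # predicted affinity of every user for the item

# Toy factors (random stand-ins for learned ones).
rng = np.random.default_rng(0)
k, n_features, n_users = 3, 8, 5
H_s = rng.random((k, n_features))
H_u = rng.random((k, n_users))

# A cold item whose features are generated from a known latent vector.
w_true = np.array([0.5, 0.0, 1.5])
x_s = w_true @ H_s

scores = predict_users_for_cold_item(x_s, H_s, H_u)
ranking = np.argsort(-scores)   # users ordered by predicted interest
```

Only the content view of the item is observed here; the user view is filled in through the shared latent vector.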


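Collective Representation Learning

[Diagram: the item-feature matrix Xs (#items × #features) is factorized as Xs ≈ W Hs, a global topic model (TOPICS: Topic 1 … Topic k); the item-user matrix Xu (#items × #users) is factorized as Xu ≈ W Hu, a personalized model per user (COMMUNITY: C1 … Ck). The factor W appears in both factorizations.]

Non-negative representations + locality constraints: LCE
Two similar items should share similar representations.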


Optimization Problem


§  We implemented an alternating least squares algorithm and a multiplicative update algorithm to learn the decomposition.

[https://github.com/amantrac/JNMF]
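A multiplicative-update scheme for a joint non-negative factorization can be sketched as below. This is not the code from the repository above: it is a simplified illustration that assumes a squared-error objective alpha·‖Xs − W Hs‖² + (1 − alpha)·‖Xu − W Hu‖², omits the locality (graph) constraint and any norm regularization, and uses hypothetical names (`collective_nmf`, `alpha`).

```python
import numpy as np

def collective_nmf(X_s, X_u, k, alpha=0.5, n_iter=200, eps=1e-9, seed=0):
    """Jointly factorize X_s ≈ W @ H_s and X_u ≈ W @ H_u with W, H_s, H_u >= 0.

    X_s : (n_items, n_features) item content matrix
    X_u : (n_items, n_users)    item-user feedback matrix
    alpha weights the content term; (1 - alpha) weights the feedback term.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X_s.shape[0], k))
    H_s = rng.random((k, X_s.shape[1]))
    H_u = rng.random((k, X_u.shape[1]))
    losses = []
    for _ in range(n_iter):
        # Shared item factor: both views contribute to the update.
        num = alpha * X_s @ H_s.T + (1 - alpha) * X_u @ H_u.T
        den = W @ (alpha * H_s @ H_s.T + (1 - alpha) * H_u @ H_u.T) + eps
        W *= num / den
        # View-specific factors: standard NMF updates given W.
        WtW = W.T @ W
        H_s *= (W.T @ X_s) / (WtW @ H_s + eps)
        H_u *= (W.T @ X_u) / (WtW @ H_u + eps)
        losses.append(alpha * np.linalg.norm(X_s - W @ H_s) ** 2
                      + (1 - alpha) * np.linalg.norm(X_u - W @ H_u) ** 2)
    return W, H_s, H_u, losses

# Toy data: 30 items, 12 content features, 20 users with sparse binary feedback.
rng = np.random.default_rng(1)
X_s = rng.random((30, 12))
X_u = (rng.random((30, 20)) < 0.2).astype(float)
W, H_s, H_u, losses = collective_nmf(X_s, X_u, k=4)
```

Because every update multiplies by a non-negative ratio, the factors stay non-negative without an explicit projection step, which is one reason multiplicative updates are a convenient choice here.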


Item Cold-Start Recommendations

Offline evaluation:
§ Enron: 10 mailboxes, 36K emails, 5K users, explicit feedback;
§ Yahoo News articles: 40 days, random sample of 41K articles, 650K users, implicit user feedback (3.5M comments).

A/B testing:
  › Average number of items surfaced per day;
  › Dwell time of the items [Yi et al., RecSys 2014, Beyond clicks: dwell time for personalization].


Item Cold Start: Baselines


1. Content-Based Recommender (CB)
2. Content Topic-Based Recommender
3. Latent Semantic Indexing on user profiles [Soboroff’99]
4. Author Topic Model [M. Rosen-Zvi’04]
5. Bayesian Personalized Ranking + kNN (BPR-kNN) [Gantner’10]
6. fLDA [Agarwal’10]


Offline Evaluation: Email Recipients Recommendation

[Figure: bar chart of performance (y-axis, 0.00 to 0.50) on MicroF1, MacroF1, MAP and NDCG for BPR-kNN, CB, LCE (No GR) and LCE.]
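For reference, NDCG, one of the metrics reported here, can be computed as in the sketch below. It uses the common log2 discount with binary relevance; the exact variant used in the evaluation is not stated on the slide, so this is an assumption.

```python
import numpy as np

def dcg_at_k(relevance, k):
    """Discounted cumulative gain of the first k entries of a ranked list."""
    rel = np.asarray(relevance, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1)
    return float(np.sum(rel / discounts))

def ndcg_at_k(relevance, k):
    """DCG normalised by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0

# Relevance of items in the order the recommender ranked them (1 = relevant).
ranked_relevance = [1, 0, 1, 0, 0]
score = ndcg_at_k(ranked_relevance, k=5)
```

A perfect ranking scores 1, and relevant items placed lower in the list are penalised by the logarithmic discount.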


Offline Evaluation: Cold News Articles Recommendation

[Figure: bar chart of ranking accuracy (y-axis, 0.00 to 0.40) at RA@3, RA@5, RA@7 and RA@10 for CB, BPR-kNN, LCE (No GR) and LCE.]


Next directions…

§  Use dwell time/duration (i.e. proportion of the video watched) instead of intentional plays;

§  Incorporate a profile enrichment strategy based on representation learning to diversify recommendations for the weakly-engaged user.


The cold start: Research Questions


Content + User feedback + Matrix factorization → Collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information to outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]

Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using internal user feedback?
3. Can we use external information, such as query logs, to improve recommendations? [1 patent, submitted to Techpulse 2014]


Recommending in the Long Tail: User Profile Completion

Why is it important?
§ Current recommender systems are only effective for the 20% of loyal users who have a dense profile.

Why enrich weakly-engaged users?
§ Improving recommendations for the remaining 80% of users;
§ Encouraging weakly-engaged users to become loyal;
§ Easy to integrate: we feed the existing system with enriched user profiles and do not need to change existing algorithms;
§ Advertising can also benefit from enriched profiles.


Endogenous vs Exogenous Profile Enrichment


A. Endogenous: using implicit feedback
§ We have this information for loyal users for free;
§ We do not need to rely on any external source.
Our solution:
§ Learning embedding spaces designed to reconstruct user profiles to improve news recommendation.

B. Exogenous
§ Many external sources of information are available inside Yahoo. They can be used to enrich user profiles.
Our solution:
§ Using search query logs to enrich user profiles for news recommendation.


The cold start: Research Questions


Content + User feedback + Matrix factorization → Collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information to outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]

Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using implicit user feedback?
3. Can we use external information, such as query logs, to improve recommendations? [1 patent, submitted to Techpulse 2014]

2. USING IMPLICIT USER FEEDBACK

User Coverage Against Click Count for News Data Set

[Figure: user coverage (y-axis, 0 to 1) against number of clicks (x-axis, 4 to 300) for the news data set.]

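Collective Representation Learning for User Profiles Reconstruction

[Diagram: Xs (#items × #features) ≈ W Hs and Xu (#items × #users) ≈ W Hu share the item factor W; the user profile matrix Xp (#users × #features) is approximated by Hu^T Hs.]

User Profile Reconstruction: Xp = Xu^T · Xs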
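The profile matrix and its low-rank reconstruction can be illustrated on toy data. The sketch below assumes Xu is items × users and Xs is items × features, so that Xp is users × features; the factors `H_u` and `H_s` are hypothetical hand-picked values standing in for learned ones.

```python
import numpy as np

# Toy data: 3 items, 2 users, 4 content features.
X_u = np.array([[1, 0],
                [1, 1],
                [0, 0]], dtype=float)        # items x users (1 = clicked)
X_s = np.array([[1, 0, 0, 1],
                [0, 1, 0, 0],
                [0, 0, 1, 0]], dtype=float)  # items x features

# Observed profile: sum of the features of the items each user clicked.
X_p = X_u.T @ X_s                            # users x features

def reconstruct_profiles(H_u, H_s):
    """Dense low-rank estimate of the profiles from the factors."""
    return H_u.T @ H_s                       # (users x k) @ (k x features)

# Hypothetical non-negative factors with k = 2 (not learned from the data).
H_u = np.array([[1.0, 0.2],
                [0.5, 1.0]])                 # k x users
H_s = np.array([[0.8, 0.1, 0.0, 0.7],
                [0.1, 0.9, 0.3, 0.0]])       # k x features
X_p_hat = reconstruct_profiles(H_u, H_s)
```

The second user clicked a single item, so the observed profile has one non-zero feature, while the reconstructed row is dense: this is the enrichment that a sparse profile can benefit from.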


Optimization Problem


User Profile Reconstruction Regularization

[Figure: NDCG (y-axis, 0.25 to 0.55) against the parameter a (x-axis, 0 to 0.7), one curve per click count: clicks=1, clicks=2, clicks=3, clicks=4, clicks=5.]

Performance in Terms of Sparsity

[Figure: NDCG (y-axis, 0.1 to 0.6) against number of clicks (x-axis, 1 to 6) for CEUP-ACLS, CEUP-MU, kNN and WITHOUT ENRICHMENT.]

The cold start: Research Questions


Content + User Feedback + Matrix Factorization → Collective Factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information to outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]

Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user profile using internal user feedback? [submitted to Techpulse 2014]
3. Can query logs be used as external information to improve recommendations? [1 patent]

B. USING EXTERNAL SIGNAL: QUERY LOGS

News Personalization

§ Reading profile (endogenous): Aggregated clicked news (implicit feedback) or skipped news.

§ Search profiles (exogenous): Aggregated queries submitted by users (explicit feedback).


Search profiles

§ Motivations: using other sources of available information to improve news personalization.

§ Why search?
  › More familiar;
  › Explicit user intent.


Query


Titles


Abstracts


Coverage

66% of the Homerun users on a specific target day also used “Search” during the last month.

[Figure: overlap in page views (%, y-axis 0 to 100) across Yahoo! properties (Others, Finance, FrontPage, Mail, News, Search, Sports), shown for unique yuids and unique bcookies.]

Coverage


§ Considering users who clicked at least once on a Homerun recommendation on a target day, how many queries did each of them submit during the last 3 months?

[Figure: number of users (y-axis, 0 to 1.8M) against the number of queries per user during 90 days (x-axis, log scale, 10^0 to 10^4).]

Data Set

§ Users who clicked at least once during a target day on a recommended article;

§ We consider only users who submitted at least 1000 queries during the last 3 months (~10 queries/day);

→ 70K users with 140K recommendations.


User Query


User QTitles


User QAbstracts


[Diagram: User Query 1, 2, 3, …, N, submitted over days 1 to 90, are combined by AGGREGATION into a Query User Profile.]
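The aggregation of a user's queries into a single profile can be sketched as below. The slides do not specify the tokenization or how a profile is matched against news, so the whitespace tokenizer, the term-count representation and the cosine scoring are all assumptions made for illustration; the queries and articles are invented examples.

```python
from collections import Counter
import math

def build_profile(texts):
    """Aggregate a list of texts into one bag-of-words term-count profile."""
    profile = Counter()
    for text in texts:
        profile.update(text.lower().split())
    return profile

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented queries standing in for one user's search history.
queries = ["barcelona football results",
           "champions league final",
           "football transfers"]
profile = build_profile(queries)

# Invented candidate articles, scored against the query profile.
articles = {
    "sports": "football league results and transfers",
    "finance": "stock market results",
}
scores = {name: cosine(profile, build_profile([text]))
          for name, text in articles.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

The same aggregation applies unchanged if the inputs are result titles or abstracts rather than raw query strings.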


[Diagram: User QTitles 1, 2, 3, …, N, over days 1 to 90, are combined by AGGREGATION into a QTitle User Profile.]

[Diagram: User QAbstracts 1, 2, 3, …, N, over days 1 to 90, are combined by AGGREGATION into a QAbstracts User Profile.]

I. Do search profiles help improve the quality of news personalization?


II. What are the important features to be considered in a search profile?


III. How many queries do we need?

Limitation: 400 queries correspond to a coverage of ~200K users.

IV. Which period should the historical search information span in order to produce high-quality recommendations?


V. How does the recency of search profiles affect the quality of news personalization?


Status and Limitations

§ The main limitation is the coverage:
  › Scales up to 200K users.

§ Further work:
  › Improve coverage;
  › Complete the user profile by learning collective representations from (1) implicit feedback, (2) query logs and (3) items’ features.


Discussions: Matrix Factorizations and skip-gram based Representations


§  Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013

§ “We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs (shifted by a global constant).” [Omer Levy and Yoav Goldberg, Neural Word Embeddings as Implicit Matrix Factorization, NIPS, 2014]
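The connection can be made concrete by factorizing a shifted PMI matrix explicitly. The sketch below builds a shifted positive PMI matrix (PMI minus log k, clipped at zero, where k plays the role of the number of negative samples) from a toy co-occurrence table and factorizes it with a truncated SVD. It illustrates the idea in the quotation rather than reproducing SGNS itself; the counts and the function names are invented.

```python
import numpy as np

def shifted_ppmi(counts, k=1):
    """Shifted positive PMI: max(PMI(w, c) - log k, 0) from co-occurrence counts."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((counts / total) / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = -np.inf   # unseen pairs and empty rows/columns
    return np.maximum(pmi - np.log(k), 0.0)

def svd_embeddings(matrix, dim):
    """Rank-`dim` word vectors from a truncated SVD, split symmetrically."""
    U, S, _ = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :dim] * np.sqrt(S[:dim])

# Toy word-context co-occurrence counts (3 words x 3 contexts).
counts = np.array([[4, 1, 0],
                   [1, 3, 1],
                   [0, 1, 5]])
M = shifted_ppmi(counts, k=2)
word_vectors = svd_embeddings(M, dim=2)
```

Replacing the word-context counts with item co-occurrence or user-item counts gives the same construction in a recommendation setting, which is what links this discussion to the matrix factorizations earlier in the talk.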

Questions?