Dynamic Covering for Recommendation Systems Ioannis Antonellis Anish Das Sarma Shaddin Dughmi

Page 1: Dynamic Covering for Recommendation Systems

Dynamic Covering for Recommendation Systems

Ioannis Antonellis, Anish Das Sarma, Shaddin Dughmi

Page 2: Dynamic Covering for Recommendation Systems

Outline

• Covering & Recommendations
• Succinct Dynamic Covering
• Results:
  o Upper Bounds
  o Lower Bounds

Page 3: Dynamic Covering for Recommendation Systems

Max k-cover Problem
• Input:
  o integer k
  o items: X = {1, 2, ..., n}
  o sets: I = {S1, ..., Sm}, each Si a subset of X
• Output: a subset of I of size at most k that maximizes the number of items covered

[Figure: graph linking sets A, B, C to items 1-5]
k=1, Solution: A (size=3)
k=2, Solutions: A,C (size=4); A,B (size=4); B,C (size=4)

Page 4: Dynamic Covering for Recommendation Systems

Max k-cover Problem
• NP-hard (the decision version is NP-complete)
• Greedy Algorithm
  o pick the set that covers the most uncovered items
  o iterate
• approximation ratio 1 - ((k-1)/k)^k >= 1 - 1/e ≈ 0.63

[Figure: graph linking sets A, B, C to items 1-5]
k=1, Solution: A (size=3)
k=2, Solutions: A,C (size=4); A,B (size=4); B,C (size=4)
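The greedy rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The edges of the figure are not recoverable from the transcript, so the instance `example_sets` is an assumption chosen only to agree with the stated solutions (A alone covers 3 items, every pair covers 4). The names `greedy_max_k_cover` and `coverage` are hypothetical.

```python
from typing import Dict, List, Set


def greedy_max_k_cover(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k sets, each time taking the one that adds the most uncovered items."""
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(k):
        best = max(
            (name for name in sets if name not in chosen),
            key=lambda name: len(sets[name] - covered),
            default=None,
        )
        # Stop early when no set is left or nothing new can be covered.
        if best is None or not (sets[best] - covered):
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen


def coverage(sets: Dict[str, Set[int]], chosen: List[str]) -> int:
    """Number of distinct items covered by the chosen sets."""
    return len(set().union(*(sets[name] for name in chosen)))


# Assumed instance (not taken from the figure): consistent with the slide's numbers.
example_sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 5}}
k1 = greedy_max_k_cover(example_sets, 1)
k2 = greedy_max_k_cover(example_sets, 2)
```

On this assumed instance the greedy choice happens to be optimal for both k=1 and k=2; in general it is only guaranteed to reach the 1 - 1/e fraction of the optimum.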

Page 5: Dynamic Covering for Recommendation Systems

Max k-cover in Recommendations

• Alice views and rates movies
• Netflix would like to recommend new movies to Alice for watching
• Important problem:
  o Find users "similar" to Alice
  o Find users who cover a large set of Alice's likes and dislikes

Page 6: Dynamic Covering for Recommendation Systems

Netflix example
• Each user is identified by the subset of movies they like/viewed
• Alice likes {A, B, C}
• Fred likes {A, D}
• Bob likes {B, E}
• Ben likes {C, F}
• Jim likes {A, B, F}
• James likes {A, B, F}

Ben and Jim in conjunction cover all of Alice's likes
Fred, Bob and Ben in conjunction cover all of Alice's likes
Jim and James add the same value
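The three observations above can be checked directly. The sketch below uses only the like-sets listed on this slide; the helper names `covered_likes` and `smallest_full_cover` are hypothetical and exist only for illustration.

```python
from itertools import combinations

likes = {
    "Fred": {"A", "D"},
    "Bob": {"B", "E"},
    "Ben": {"C", "F"},
    "Jim": {"A", "B", "F"},
    "James": {"A", "B", "F"},
}
alice = {"A", "B", "C"}


def covered_likes(users, query=alice):
    """Movies in the query that at least one of the given users likes."""
    return set().union(*(likes[u] for u in users)) & query


def smallest_full_cover(query=alice):
    """Brute force: the fewest users whose likes jointly contain the whole query."""
    for size in range(1, len(likes) + 1):
        for group in combinations(likes, size):
            if covered_likes(group, query) == query:
                return list(group)
    return None


best_group = smallest_full_cover()
```

Brute force is fine for five users; it is exactly this search that becomes infeasible at scale and motivates the approximate, space-bounded approach in the following slides.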

Page 7: Dynamic Covering for Recommendation Systems

k-covering vs nearest neighbor

• for k=1, equivalent (dot product similarity)
• covering allows for diversifying recommendations
• want to cover all genres liked by a user
  o consider a user that likes 100 thriller movies and 10 comedies
  o want "similar" users to cover as many movies as possible
  o k-nearest neighbor attempts to find many similar users, not cover as many movies as possible
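The thriller/comedy scenario can be made concrete with a small sketch. The three candidate users below are invented for illustration (the slide gives only the 100 thrillers and 10 comedies); movies are integer ids, and the dot product of two 0/1 like-vectors is simply the size of the set intersection. Function names are hypothetical.

```python
# Alice likes thrillers 0..99 and comedies 100..109.
alice = set(range(110))

# Hypothetical candidates: two overlapping thriller fans and one comedy fan.
users = {
    "T1": set(range(0, 60)),
    "T2": set(range(5, 65)),
    "C1": set(range(100, 110)),
}


def top_k_by_dot_product(query, users, k):
    """k-nearest neighbours under dot-product similarity on 0/1 like-vectors."""
    return sorted(users, key=lambda u: len(users[u] & query), reverse=True)[:k]


def greedy_cover(query, users, k):
    """Greedy k-cover of the query: repeatedly take the user adding the most new movies."""
    uncovered = set(query)
    chosen = []
    for _ in range(k):
        best = max(users, key=lambda u: len(users[u] & uncovered))
        if not users[best] & uncovered:
            break
        chosen.append(best)
        uncovered -= users[best]
    return chosen


def covered(query, users, chosen):
    return len(set().union(*(users[u] for u in chosen)) & query)


knn = top_k_by_dot_product(alice, users, 2)
cover = greedy_cover(alice, users, 2)
```

With k=2, nearest-neighbour search returns the two thriller fans, whose likes mostly overlap, while the covering approach trades the second thriller fan for the comedy fan and covers more of Alice's movies.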

Page 8: Dynamic Covering for Recommendation Systems

oDesk example
• Online labor marketplace
• clients post jobs and/or invite contractors
• contractors apply to jobs
• Contractor recommendations for clients
  o Bob invites/interviews/hires contractors
  o find clients "similar" to Bob
• Job recommendations for contractors
  o Alice applies to jobs
  o find contractors "similar" to Alice

Page 9: Dynamic Covering for Recommendation Systems

Succinct Dynamic Covering (SDC)

• Input:
  o integer k
  o items: X = {1, 2, ..., n}
  o sets: I = {S1, ..., Sm}, each Si a subset of X
  o query Q, a subset of X
• Output: a subset of I of size at most k that maximizes the number of items of query Q that are covered
• However, we further constrain the problem:
  o space-constrained: statically preprocess (X, I) and store a small sketch, much smaller than O(mn)
  o dynamic: Q is not known a priori during the sketch creation

Page 10: Dynamic Covering for Recommendation Systems

Notice two twists
• dynamic
  o for each user the set of movies that need to be covered is different
  o covering is not static
• space-constrained
  o real time, interactive recommendations
  o the whole Netflix graph is huge: 10 million users, 100k movies, and popular movies have been viewed many times
  o cannot process the entire graph at query time

Page 11: Dynamic Covering for Recommendation Systems

Ad serving
• online advertisers
  o bid on webpages matching relevancy criteria
  o target certain user demographics

When a user visits a page
• Ad servers:
  o have some (not precise) idea about the demographic of the user (e.g. from click logs)
  o try to pick a set of ads that cover many user demographics
  o need to solve the SDC problem

Page 12: Dynamic Covering for Recommendation Systems

Ad serving
• space-constrained:
  o set system consists of users, webpages and clicks
• dynamic:
  o each user view of each page is associated with a different user demographic

[Figure: graph of ads A, B, C and webpages 1-5; labels: Ads, Webpages, User visited pages]

Page 13: Dynamic Covering for Recommendation Systems

Coverage Oracle
• Offline stage:
  o Input: integer k; items: X = {1, 2, ..., n}; sets: I = {S1, ..., Sm}, each Si a subset of X
  o Output: Data Structure D
• Dynamic stage:
  o Input: Query Q, a subset of X
  o Output: use D to find a subset of I of size at most k that maximizes the number of items of query Q that are covered

Page 14: Dynamic Covering for Recommendation Systems

Outline

• Covering & Recommendations
• Succinct Dynamic Covering
• Results:
  o Upper Bounds
  o Lower Bounds

Page 15: Dynamic Covering for Recommendation Systems

Results
• given space limitations
  o interested in approximate solutions for SDC
• space vs approximation ratio tradeoffs
• ε ∈ [0, 1/2]
• δ1, δ2: non-negative integers, not both zero

Page 16: Dynamic Covering for Recommendation Systems

Simple Deterministic Algorithm

• For every item, "remember" one set
• break ties arbitrarily
• m/k approximation, linear space

[Figure: two Sets/Items diagrams]
k=2: OPT = 16, APPROX = 8, ratio = 16/8 = 2
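A minimal sketch of such an oracle is shown below. The preprocessing step follows the slide (one remembered set per item, first set seen wins as the arbitrary tie-break). The slide does not say how a query is answered, so `query_oracle`, which returns the k sets remembered for the most query items, is an assumption, as are all function names and the toy instance.

```python
from collections import Counter
from typing import Dict, List, Set


def build_simple_oracle(sets: Dict[str, Set[int]]) -> Dict[int, str]:
    """Offline stage: remember one containing set per item (one entry per item)."""
    remembered: Dict[int, str] = {}
    for name, items in sets.items():
        for item in items:
            # First set seen wins; any other tie-break would do.
            remembered.setdefault(item, name)
    return remembered


def query_oracle(remembered: Dict[int, str], query: Set[int], k: int) -> List[str]:
    """Dynamic stage (assumed): return the k sets remembered for the most query items."""
    votes = Counter(remembered[item] for item in query if item in remembered)
    return [name for name, _ in votes.most_common(k)]


# Hypothetical toy instance.
toy_sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 5}}
oracle = build_simple_oracle(toy_sets)
answer = query_oracle(oracle, {1, 2, 4}, 2)
```

The data structure stores only the item-to-set map, not the sets themselves, which is what keeps the space linear.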

Page 17: Dynamic Covering for Recommendation Systems

Better Deterministic Algorithm
• Find unchosen set containing the most uncovered items. Iterate.
• similar to previous algorithm, order is fixed
• sqrt(n/k) approximation, linear space

[Figure: two Sets/Items diagrams]
k=2: OPT = 16, APPROX = 16, ratio = 16/16 = 1
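One way to read this slide is that the item-to-set map is built in greedy order: each item is remembered with the first set, in greedy order, that covers it. The sketch below follows that reading. The query step is not given on the slide and is assumed here (the k sets remembered for the most query items); the function names and the toy instance are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def build_greedy_oracle(sets: Dict[str, Set[int]]) -> Dict[int, str]:
    """Offline stage: assign each item to the first set, in greedy order, that covers it."""
    remembered: Dict[int, str] = {}
    covered: Set[int] = set()
    remaining = dict(sets)
    while remaining:
        # Unchosen set containing the most uncovered items.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        new_items = remaining.pop(best) - covered
        if not new_items:
            break
        for item in new_items:
            remembered[item] = best
        covered |= new_items
    return remembered


def query_oracle(remembered: Dict[int, str], query: Set[int], k: int) -> List[str]:
    """Dynamic stage (assumed): return the k sets remembered for the most query items."""
    votes = Counter(remembered[item] for item in query if item in remembered)
    return [name for name, _ in votes.most_common(k)]


# Hypothetical toy instance.
toy_sets = {"A": {1, 2}, "B": {2, 3, 4}, "C": {4, 5}}
oracle = build_greedy_oracle(toy_sets)
answer = query_oracle(oracle, {2, 3, 4}, 1)
```

Compared with an arbitrary tie-break, the greedy order credits shared items to the larger set, so B, not A, is remembered for item 2 in this toy instance.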

Page 18: Dynamic Covering for Recommendation Systems

Randomized Algorithm
• m^ε/sqrt(k) approximation
• n·m^(1-2ε) space
• Find an unchosen set containing at least n/(m^ε·sqrt(k)) uncovered items. Choose and iterate.
• For every remaining unchosen set, choose n/m^(2ε) items uniformly at random from its uncovered items
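The two preprocessing steps on this slide can be sketched as follows. This is a reading of the bullets, not the authors' implementation. The slide does not state what is stored for the chosen sets or how queries are answered, so this sketch only builds the structure. The function name, the fixed `seed`, the rounding up of the sample size, and the toy instance are all assumptions.

```python
import math
import random
from typing import Dict, List, Set, Tuple


def build_randomized_sketch(
    sets: Dict[str, Set[int]], n: int, k: int, eps: float, seed: int = 0
) -> Tuple[List[str], Dict[str, Set[int]]]:
    """Return (names of chosen 'heavy' sets, sampled uncovered items per remaining set)."""
    rng = random.Random(seed)
    m = len(sets)
    heavy_threshold = n / (m ** eps * math.sqrt(k))
    sample_size = math.ceil(n / m ** (2 * eps))  # rounding up is an assumption

    covered: Set[int] = set()
    heavy: List[str] = []
    remaining = dict(sets)
    # Step 1: repeatedly choose a set with at least heavy_threshold uncovered items.
    while remaining:
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if len(remaining[best] - covered) < heavy_threshold:
            break
        heavy.append(best)
        covered |= remaining.pop(best)

    # Step 2: for every remaining set, sample uniformly from its uncovered items.
    samples: Dict[str, Set[int]] = {}
    for name, items in remaining.items():
        uncovered = sorted(items - covered)
        samples[name] = set(rng.sample(uncovered, min(sample_size, len(uncovered))))
    return heavy, samples


# Hypothetical toy instance: n = 16 items, m = 4 sets.
toy_sets = {
    "big": set(range(10)),
    "s1": {10, 11, 12, 13, 14},
    "s2": {0, 1, 15},
    "s3": {12, 13},
}
heavy, samples = build_randomized_sketch(toy_sets, n=16, k=1, eps=0.5)
```

With at most m sets each keeping about n/m^(2ε) sampled items, the samples take roughly n·m^(1-2ε) space in total, matching the space bound on the slide.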

Page 20: Dynamic Covering for Recommendation Systems

Lower Bound
• holds for deterministic oracles only
• proof somewhat involved, uses the probabilistic method
• matches randomized upper bound
• Open problem: randomized lower bound

Page 21: Dynamic Covering for Recommendation Systems

Related work
• distance oracles in graphs, Thorup and Zwick
• set cover in streaming model (sets are streams or items are streams)
• nearest neighbor (NN) search:
  o for k=1, SDC and NN are equivalent using the dot product similarity
  o no locality sensitive hashing for dot product (Charikar). So, no hope for signature schemes for SDC.

Page 22: Dynamic Covering for Recommendation Systems

Summary
• Introduced Succinct Dynamic Covering problem

• Applications in many real-world recommendation systems

• approximation ratio and space tradeoffs

• Deterministic and Randomized upper bounds

• Deterministic lower bound

Page 23: Dynamic Covering for Recommendation Systems

Thank you!