Content Recommendations with Redis Torben Brodt plista GmbH 28. February 2013 Recommender Systems Stammtisch http://recommenders.de


Page 1: Content recommendations

Content Recommendations with Redis

Torben Brodt
plista GmbH

28. February 2013

Recommender Systems Stammtisch

http://recommenders.de

Page 2: Content recommendations

Introduction

● plista GmbH
  ○ recommendations & advertising
  ○ founded in 2008, Berlin [DE]
  ○ ~3k recommendations/second

● never batch = never Hadoop
● stream computing with an in-memory database

● we love

Page 3: Content recommendations
Page 4: Content recommendations

How to build recommendations?

welt.de/football/berlin_wins.html

We only have the URL?

to show recommendations we are integrated on the website

so "at least" we can count the hits

Page 5: Content recommendations

Most popular

welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de" 1 berlin_wins
● ZREVRANGEBYSCORE

p:welt.de

berlin_wins 689 +1

summer_is_coming 420

plista_company 135

Live Read + Live Write = Real Time Recommendations
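The counting idea on this slide can be sketched without a running Redis server. The snippet below is a minimal illustration, assuming a plain Python dictionary as a stand-in for a Redis sorted set; the helper names `zincrby` and `top_n` are hypothetical and only mirror what ZINCRBY and a descending range query would do.

```python
from collections import defaultdict

# In-memory stand-in for Redis sorted sets: key -> {member: score}.
# Assumption: a real deployment would issue ZINCRBY / ZREVRANGEBYSCORE instead.
sorted_sets = defaultdict(lambda: defaultdict(float))

def zincrby(key, increment, member):
    """Add `increment` to the member's score and return the new score."""
    sorted_sets[key][member] += increment
    return sorted_sets[key][member]

def top_n(key, n):
    """Return the n highest-scoring (member, score) pairs, best first."""
    items = sorted_sets[key].items()
    return sorted(items, key=lambda kv: kv[1], reverse=True)[:n]

# Seed with the counts shown on the slide, then record one more hit (+1).
seed = [("berlin_wins", 689), ("summer_is_coming", 420), ("plista_company", 135)]
for member, hits in seed:
    zincrby("p:welt.de", hits, member)
zincrby("p:welt.de", 1, "berlin_wins")

print(top_n("p:welt.de", 2))
```

Every page view is one write and every recommendation request is one read on the same structure, which is the "live read + live write" point of the slide.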

Page 6: Content recommendations

Most popular with timeseries

welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION
  ○ "p:welt.de:1360007000"
  ○ "p:welt.de:1360006000"
  ○ "p:welt.de:1360005000"
● ZREVRANGEBYSCORE

p:welt.de:1360005000

berlin_wins 420

summer_is_coming 135

plista_best_company 689

p:welt.de:1360006000

berlin_wins 420

summer_is_coming 135

plista_best_company 689

p:welt.de:1360007000

berlin_wins 689

summer_is_coming 420

plista_best_company 135
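As a rough sketch of the time-series variant: each hit goes into a bucket whose key ends in the bucket's start timestamp, and the most recent buckets are summed at read time. The bucket width of 1000 seconds is an assumption read off the key spacing on the slide, and `bucket_key` and `union` are hypothetical helpers standing in for key construction and an unweighted ZUNIONSTORE (or ZUNION in Redis 6.2+).

```python
from collections import Counter

BUCKET_SECONDS = 1000  # assumed bucket width, matching the key spacing on the slide

def bucket_key(publisher, timestamp):
    """Build a key like 'p:welt.de:1360007000' for the bucket containing timestamp."""
    bucket_start = timestamp - timestamp % BUCKET_SECONDS
    return f"p:{publisher}:{bucket_start}"

def union(buckets, keys):
    """Sum member scores across the given bucket keys (unweighted union)."""
    total = Counter()
    for key in keys:
        total.update(buckets.get(key, {}))
    return total

# Bucket contents as shown on the slide.
buckets = {
    "p:welt.de:1360005000": {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689},
    "p:welt.de:1360006000": {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689},
    "p:welt.de:1360007000": {"berlin_wins": 689, "summer_is_coming": 420, "plista_best_company": 135},
}

# The three most recent buckets, seen from an arbitrary "now" inside the newest one.
now = 1360007321
keys = [bucket_key("welt.de", now - i * BUCKET_SECONDS) for i in range(3)]

print(union(buckets, keys).most_common())
```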

Page 7: Content recommendations

Most popular with timeseries

welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION ... WEIGHTS
  ○ "p:welt.de:1360007000" .. 4
  ○ "p:welt.de:1360006000" .. 2
  ○ "p:welt.de:1360005000" .. 1
● ZREVRANGEBYSCORE

p:welt.de:1360005000

berlin_wins 420

summer_is_coming 135

plista_best_company 689

p:welt.de:1360006000

berlin_wins 420

summer_is_coming 135

plista_best_company 689

p:welt.de:1360007000

berlin_wins 689

summer_is_coming 420

plista_best_company 135
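The effect of the WEIGHTS option can be worked through with the numbers on this slide. This is a sketch in plain Python rather than a Redis call; `weighted_union` is a hypothetical helper that multiplies each bucket's scores by its weight and sums per member, which is what a weighted union with SUM aggregation computes.

```python
def weighted_union(buckets, weights):
    """Sum score * weight per member and return (member, score) pairs, best first."""
    total = {}
    for key, weight in weights.items():
        for member, score in buckets.get(key, {}).items():
            total[member] = total.get(member, 0) + score * weight
    return sorted(total.items(), key=lambda kv: kv[1], reverse=True)

# Bucket contents and weights as shown on the slide: newer buckets count more.
buckets = {
    "p:welt.de:1360005000": {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689},
    "p:welt.de:1360006000": {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689},
    "p:welt.de:1360007000": {"berlin_wins": 689, "summer_is_coming": 420, "plista_best_company": 135},
}
weights = {
    "p:welt.de:1360007000": 4,
    "p:welt.de:1360006000": 2,
    "p:welt.de:1360005000": 1,
}

ranking = weighted_union(buckets, weights)
print(ranking)
```

With these weights berlin_wins scores 689*4 + 420*2 + 420*1 = 4016, well ahead of plista_best_company at 2607, whereas the unweighted sums (1529 vs. 1513) are almost tied.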

Page 8: Content recommendations

Most popular with timeseries

[Figure: timeline of time-series buckets keyed by timestamp (e.g. :1360007000), spanning -1h to -8h]

Page 9: Content recommendations

Most popular to any context

● it's not only publisher, we use ~50 context attributes

context attributes:
● publisher
● weekday
● geolocation
● demographics
● ...

publisher = welt.de

berlin_wins 689 +1

summer_is_coming 420

plista_company 135

weekday = sunday

berlin_wins 400 +1

dortmund_wins 200

... 100

geolocation = dortmund

dortmund_wins 200

berlin_wins 10 +1

... 5
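One way to read the "+1" markers above: a single hit increments the same item in one counter per context attribute. The sketch below assumes that reading; the one-letter key prefixes follow the p:/w:/g: naming used in the deck's Redis keys, while the `PREFIXES` table and `record_hit` helper are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from context attribute to key prefix (p, w, g appear in the deck's keys).
PREFIXES = {"publisher": "p", "weekday": "w", "geolocation": "g"}

# In-memory stand-in for one sorted set per context value: key -> {item: hits}.
counters = defaultdict(lambda: defaultdict(int))

def record_hit(context, item):
    """Increment the item's counter once for every context attribute of the hit."""
    keys = [f"{PREFIXES[attr]}:{value}" for attr, value in context.items()]
    for key in keys:
        counters[key][item] += 1
    return keys

hit_context = {"publisher": "welt.de", "weekday": "sunday", "geolocation": "dortmund"}
touched = record_hit(hit_context, "berlin_wins")
print(touched)
```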

Page 10: Content recommendations

Most popular to any context

ZUNION ... WEIGHTS
p:welt.de:1360007 4
p:welt.de:1360006 2
p:welt.de:1360005 1

w:sunday:1360007 4
w:sunday:1360006 2
w:sunday:1360005 1

g:dortmund:1360007 4
g:dortmund:1360006 2
g:dortmund:1360005 1

● what it looks like in Redis

publisher = welt.de

berlin_wins 689 +1

summer_is_coming 420

plista_company 135

weekday = sunday

berlin_wins 400

dortmund_wins 200

... 100

geolocation = dortmund

dortmund_wins 200

berlin_wins 10

... 5

Page 11: Content recommendations

Most popular with Effect size

ZUNION ... WEIGHTS
p:welt.de:1360007 4 * 70%
p:welt.de:1360006 2 * 70%
p:welt.de:1360005 1 * 70%

w:sunday:1360007 4 * 10%
w:sunday:1360006 2 * 10%
w:sunday:1360005 1 * 10%

g:dortmund:1360007 4 * 30%
g:dortmund:1360006 2 * 30%
g:dortmund:1360005 1 * 30%

Effect Size

Examples:
small effect: weather
big effect: publisher

Data with a small effect should not be taken into account; otherwise we get average results

● which context has an influence?
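A possible way to combine the two factors is to multiply each bucket's time weight by the effect size of its context before handing the list to the union. The sketch below uses the percentages from the slide; the `min_effect` cut-off is an assumption illustrating the advice to leave out low-effect data, and `union_weights` is a hypothetical helper.

```python
# Time weights per bucket and effect sizes per context, as shown on the slide.
TIME_WEIGHTS = {1360007: 4, 1360006: 2, 1360005: 1}
EFFECT_SIZE = {"p:welt.de": 0.7, "w:sunday": 0.1, "g:dortmund": 0.3}

def union_weights(min_effect=0.0):
    """Return {key: weight} with weight = time weight * effect size.

    Contexts whose effect size is below min_effect are skipped entirely
    (assumed threshold; the slide only says small effects should be ignored).
    """
    weights = {}
    for context, effect in EFFECT_SIZE.items():
        if effect < min_effect:
            continue
        for bucket, time_weight in TIME_WEIGHTS.items():
            weights[f"{context}:{bucket}"] = time_weight * effect
    return weights

all_weights = union_weights()
strong_only = union_weights(min_effect=0.2)
print(len(all_weights), len(strong_only))
```

The resulting dictionary maps directly onto the key list and the WEIGHTS arguments of a weighted union.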

Page 12: Content recommendations

Most popular with Significance

● some data has more significance/trust
● so we add a significance matrix

● Significance might depend on a common limit, like 200 (in the example)

X

sig:publisher = welt.de

berlin_wins 1

summer_is_coming 1

plista_company 0.5

publisher = welt.de

berlin_wins 689

summer_is_coming 420

plista_company 135
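The slide does not spell out how the significance values are derived from the limit. The sketch below assumes a simple step function (full trust at or above 200 hits, half trust below), which happens to reproduce the 1 / 1 / 0.5 values shown; the function `significance` and its rule are hypothetical, not plista's actual formula.

```python
def significance(count, limit=200):
    """Hypothetical step function: full trust at or above the limit, half below."""
    return 1.0 if count >= limit else 0.5

# Hit counts for publisher = welt.de, as shown on the slide.
counts = {"berlin_wins": 689, "summer_is_coming": 420, "plista_company": 135}

# Significance per item, then the element-wise product (the "X" on the slide).
sig = {member: significance(count) for member, count in counts.items()}
weighted = {member: counts[member] * sig[member] for member in counts}

print(sig)
print(weighted)
```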

Page 13: Content recommendations

Most popular with Significance

● some data has more significance/trust
● so we add a significance matrix

X

Numerator

Denominator

sig:publisher = welt.de

berlin_wins 1

summer_is_coming 1

plista_company 0.5

sig:publisher = welt.de

berlin_wins 1

summer_is_coming 1

plista_company 0.5

publisher = welt.de

berlin_wins 689

summer_is_coming 420

plista_company 135

Σ

Σ

SUM over all context

SUM over all context

( )
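Read as a formula, the diagram appears to describe a significance-weighted average: the numerator sums significance times count over all contexts, and the denominator sums the significance values. The sketch below follows that reading. The publisher numbers come from the slide; the weekday and geolocation counts are taken from the earlier context slide, and their significance values are assumed for illustration. `combined_score` is a hypothetical helper.

```python
def combined_score(item, contexts):
    """Return sum(sig * count) / sum(sig) over contexts that know the item, else 0.0."""
    numerator = 0.0
    denominator = 0.0
    for counts, sig in contexts:
        if item in counts:
            numerator += sig[item] * counts[item]
            denominator += sig[item]
    return numerator / denominator if denominator else 0.0

# Each context is a (counts, significance) pair.
publisher = (
    {"berlin_wins": 689, "summer_is_coming": 420, "plista_company": 135},
    {"berlin_wins": 1.0, "summer_is_coming": 1.0, "plista_company": 0.5},
)
weekday = (
    {"berlin_wins": 400, "dortmund_wins": 200},
    {"berlin_wins": 1.0, "dortmund_wins": 1.0},  # assumed significance values
)
geolocation = (
    {"dortmund_wins": 200, "berlin_wins": 10},
    {"dortmund_wins": 1.0, "berlin_wins": 0.5},  # assumed significance values
)
contexts = [publisher, weekday, geolocation]

for item in ("berlin_wins", "dortmund_wins", "plista_company"):
    print(item, combined_score(item, contexts))
```

Dividing by the summed significance keeps items that appear in few contexts comparable with items that appear in many.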

Page 14: Content recommendations

SUM over..

● timeseries
● different context
● previous hits of the user
● similar publisher

knowledge

publisher = welt.de

berlin_wins 689

summer_is_coming 420

plista_company 135

Σ

ZUNION ... WEIGHTS
p:welt.de:1360007 4
p:welt.de:1360006 2
p:welt.de:1360005 1

w:sunday:1360007 4
w:sunday:1360006 2
w:sunday:1360005 1

g:dortmund:1360007 4
g:dortmund:1360006 2
g:dortmund:1360005 1

... redis can do it ;)

Page 15: Content recommendations

Even more Matrix Operations ;)

● Similarity Matrix

● Human Control Matrix

● Meta-learning Matrix
  ○ might be covered in next talk

○ cooperation with

○ aided from

∏Σ

Page 16: Content recommendations

Conclusions

● Redis is a perfect fit for simple operations
  ○ SUM + AGGREGATE + MIN + MAX

● In-Memory operations are pretty fast

● Real-time features feel better in a real-time database (e.g. time series)

● We don't need batch

Page 17: Content recommendations

What else?

In Redis
● Incremental Collaborative Filtering
● More recommenders
● Live Statistics

At plista
● Semantics with Lucene
● Cloud Technologies
  ○ Scalability
  ○ Enterprise Service Bus
● Contest for Recommenders