Browsemap: Collaborative Filtering at LinkedIn

DESCRIPTION

Many web properties make extensive use of item-based collaborative filtering, which showcases relationships between pairs of items based on the wisdom of the crowd. This paper presents LinkedIn’s horizontal collaborative filtering infrastructure, known as browsemaps. The platform enables rapid development, deployment, and computation of collaborative filtering recommendations for almost any use case on LinkedIn. In addition, it provides centralized management of scaling, monitoring, and other operational tasks for online serving. We also present case studies on how LinkedIn uses this platform in various recommendation products, as well as lessons learned in the field over the several years this system has been in production.

1

Browsemap: Collaborative Filtering At LinkedIn

Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, Christian Posse

RSWeb 2014 with RecSys

2

Agenda

•  Motivation
•  Architecture
•  Applications
•  Lessons Learned

3

Collaborative filtering for member profile

Profile Browsemap: People who viewed this profile also viewed… Count co-views

4

Collaborative filtering for job page

Job Browsemap: People who viewed this job also viewed… Count co-views
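The "count co-views" idea behind both the profile and job browsemaps can be illustrated with a minimal sketch. It assumes view events arrive as (member, item) pairs and that two items are co-viewed when the same member viewed both. The function names and the input shape are illustrative assumptions, not the platform's actual API.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_coviews(views):
    """Count, for every pair of items, how many members viewed both.

    views: iterable of (member_id, item_id) pairs (assumed input shape).
    Returns a dict mapping item_id -> Counter of co-viewed item_id -> count.
    """
    items_by_member = defaultdict(set)
    for member_id, item_id in views:
        items_by_member[member_id].add(item_id)

    coviews = defaultdict(Counter)
    for items in items_by_member.values():
        # Every unordered pair viewed by the same member is one co-view.
        for a, b in combinations(sorted(items), 2):
            coviews[a][b] += 1
            coviews[b][a] += 1
    return coviews


def browsemap(coviews, item_id, k=10):
    """Return up to k items most often co-viewed with item_id."""
    return [other for other, _ in coviews.get(item_id, Counter()).most_common(k)]
```

Calling `browsemap(count_coviews(views), "A", k=2)` would give the two items most often viewed by the same members who viewed "A". A production system would also need the filtering and cold-start steps mentioned later in the deck.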

5

… many CF-based recommenders: company, group, portfolio

6

Challenges

•  Many different entities
•  Similar problems with different requirements
•  Fast product development cycle
•  Hybrid recommender systems
•  Handle LinkedIn data volume and traffic

7

Challenges → Horizontal Platform

8

Browsemap

Collaborative Filtering Platform at LinkedIn

9

Browsemap Platform

•  Scalability
   ➢  Online/offline architecture
   ➢  Hundreds of millions of entities, billions of monthly page views
•  Browsemap Domain Specific Language (DSL)
   ➢  Code reuse through modular components
   ➢  Flexible computation workflow construction
•  Data are used by hybrid recommenders

10

Browsemap Architecture

[Figure: architecture diagram. Labeled components: User Activity Data, HDFS, Hadoop, Browsemap Engine, Browsemap DSL, Key-value storage (Voldemort), Online Query API, and Frontend Services, which send queries and receive results.]

11

Browsemap Architecture

[Figure: the same architecture diagram, with the callout "High Throughput".]

12

Browsemap Architecture

[Figure: the same architecture diagram, with the callout "Low Latency".]
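The online/offline split named in the architecture slides can be sketched roughly as follows. This is only an illustration: a plain Python dict stands in for the Voldemort key-value store, and `build_store` and `query` are hypothetical names rather than the platform's API. The reading that the batch side is tuned for throughput and the serving side for latency is an assumption drawn from the slide callouts.

```python
def build_store(coviews, k=3):
    """Offline step (assumed): precompute the top-k list for every item.

    coviews: dict mapping item_id -> dict of co-viewed item_id -> count.
    Returns a dict that stands in for the key-value store.
    """
    store = {}
    for item, counts in coviews.items():
        # Highest count first; ties broken by item id for determinism.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        store[item] = [other for other, _ in ranked[:k]]
    return store


def query(store, item_id, limit=10):
    """Online step (assumed): a single key lookup, no computation."""
    return store.get(item_id, [])[:limit]
```

Because all ranking happens in the batch step, the serving path reduces to a key lookup. That is one common reason to pair batch computation with a key-value store.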

13

Browsemap Domain Specific Language (DSL)

Module Collection: Co-view counting, Spam User Filtering, Expired Job Filtering, Cold-start techniques, …

Job browsemap: Spam User Filtering, Expired Job Filtering, Co-view counting, Cold-start techniques

Company browsemap: Spam User Filtering, Co-view counting, Cold-start techniques
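The deck does not show the DSL's syntax, so the following is only a sketch of the underlying idea: reusable modules composed into per-entity workflows. Every name here (`workflow`, `filter_spam_users`, `filter_expired_items`) is hypothetical, and plain Python functions stand in for DSL modules.

```python
from collections import Counter, defaultdict
from itertools import combinations


def filter_spam_users(spam_ids):
    """Module: drop view events from members flagged as spam."""
    def step(views):
        return [(m, i) for m, i in views if m not in spam_ids]
    return step


def filter_expired_items(expired_ids):
    """Module: drop view events for items that are no longer valid."""
    def step(views):
        return [(m, i) for m, i in views if i not in expired_ids]
    return step


def count_coviews(views):
    """Module: count how many members viewed each pair of items."""
    by_member = defaultdict(set)
    for m, i in views:
        by_member[m].add(i)
    counts = defaultdict(Counter)
    for items in by_member.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def workflow(*steps):
    """Compose modules into one callable, applied left to right."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


# Two workflows sharing modules; only the job workflow filters expired items.
job_browsemap = workflow(
    filter_spam_users({"bot"}), filter_expired_items({"job_old"}), count_coviews
)
company_browsemap = workflow(filter_spam_users({"bot"}), count_coviews)
```

The point of the sketch is that the company workflow reuses the same modules as the job workflow and simply omits the step it does not need.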

14

Recap

•  Support all entity types
•  Adjust to each product requirement
•  Scale

15

Agenda

✓  Motivation
✓  Architecture
•  Applications
•  Lessons Learned

16

Applications – CF-based recommenders

•  Profile Browsemap
•  Portfolio Browsemap
•  Job Browsemap
•  Group Browsemap
•  Hiring Browsemap
•  Company Browsemap
•  Influencer Browsemap

17

Applications – Hybrid Recommender Systems

Suggested Profile Update

Swee Lim

18

Applications – Hybrid Recommender Systems

Suggested Profile Update

Goal: for each member, find companies they may want to follow

19

Applications – Hybrid Recommender Systems

Member followed companies: Google, Cisco

Companies user may be interested in: LinkedIn, Facebook, Juniper, Arista

Member info:

•  Content-based features: title, industry, location, …
•  Collaborative filtering feature: co-follow Browsemaps (people who follow this company also follow these companies)

20

Applications – Hybrid Recommender Systems

Question: For a company C, will member M like it?

Approach: Logistic regression

Features:

•  Member location = company location? 1 if yes, 0 if no
•  Company is in the list of the co-follow browsemaps? 1 if yes, 0 if no
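A minimal sketch of how the two binary features on this slide could feed a logistic regression score. The weights and bias used in the usage note are made-up illustrative values, and the dictionary field names are assumptions. The deck does not describe the trained model or its full feature set.

```python
import math


def features(member, company, cofollow_list):
    """Build the two binary features named on the slide.

    member, company: dicts with a "location" key (assumed shape);
    company also has a "name" key.
    cofollow_list: companies suggested by the co-follow browsemaps of
    the companies the member already follows.
    """
    same_location = 1.0 if member["location"] == company["location"] else 0.0
    in_cofollow = 1.0 if company["name"] in cofollow_list else 0.0
    return [same_location, in_cofollow]


def predict(x, weights, bias):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, with hypothetical weights `[0.5, 2.0]` and bias `-1.0`, a company in the member's location that also appears in the co-follow list scores above 0.5, while a company matching neither feature scores below it.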

21

Applications – Hybrid Recommender Systems

Collaborative filtering is important:

•  Surfaces implicit connections between companies
•  Based on members’ preferences

22

Agenda

✓  Motivation
✓  Architecture
✓  Applications
•  Lessons Learned

Lesson 1: Tall oaks grow from little acorns

23

A generic horizontal platform is essential

Lesson 2: One hand washes the other

27

Job Browsemap

Similar Jobs

Collaborative filtering: “Follower audience”

Content based: “Leader audience”

Lesson 3: You can’t get blood out of a stone

28

Need to handle cold start problem

[Figure: Job 1, Job 2, and Job 3 (new) along a view-time axis, with a "merge" step.]

Leverage browsing history: personalized backfill
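The slide gives only labels, so the following is one possible reading of "personalized backfill", not a description of the production logic. The assumption is that when a newly posted item has no browsemap yet, the browsemaps of items the member viewed earlier are merged to fill the gap. The function name and data shapes are hypothetical.

```python
from collections import Counter


def personalized_backfill(history, browsemaps, k=5):
    """Merge the browsemaps of previously viewed items into one ranked list.

    history: item ids the member has viewed, including the new item.
    browsemaps: dict mapping item_id -> dict of related item_id -> count;
        a new item simply has no entry.
    Returns up to k items the member has not viewed yet.
    """
    seen = set(history)
    merged = Counter()
    for item in history:
        for other, count in browsemaps.get(item, {}).items():
            if other not in seen:
                merged[other] += count
    # Highest merged count first; ties broken by item id for determinism.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]
```

In this sketch a member who viewed Job 1 and Job 2 before landing on the new Job 3 would still see recommendations drawn from the first two jobs' browsemaps.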

Lesson 4: A chain is only as strong as its weakest link

29

CF relies solely on user activities: good data is crucial

•  Mistakes can be hard to detect / debug
•  Simple mistakes can have big impact, e.g. “jobid” → “id”
•  Need prevention mechanism
   ➢  Improve tracking
   ➢  Unit test
   ➢  Browsemap platform data-check: input volume, coverage/metrics analysis
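The data checks named on this slide (input volume and coverage) could look something like the sketch below. The thresholds, function names, and data shapes are all assumptions for illustration. The deck does not describe how the platform implements its checks.

```python
def check_input_volume(today_count, recent_counts, tolerance=0.5):
    """Return True if today's input volume is close to the recent average.

    today_count: number of input events in the current run.
    recent_counts: non-empty list of event counts from previous runs.
    tolerance: allowed relative deviation from the average (assumed value).
    """
    baseline = sum(recent_counts) / len(recent_counts)
    return abs(today_count - baseline) <= tolerance * baseline


def coverage(browsemaps, all_item_ids):
    """Fraction of items that have at least one recommendation."""
    if not all_item_ids:
        return 0.0
    covered = sum(1 for item in all_item_ids if browsemaps.get(item))
    return covered / len(all_item_ids)
```

A renamed tracking field such as the “jobid” to “id” example would plausibly show up as a sharp drop in matched input events, which a volume check of this kind is meant to catch before results are published.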

Lesson 5: User experience matters

50% CTR

30

500% more applications

•  Put recommendations in user’s flow

31

Conclusion

•  Collaborative filtering is important for LinkedIn
•  Browsemap has been in production for 3+ years
•  Horizontal platform is crucial

32

Questions?

Thank you!
