H2O World - Quora: Machine Learning Algorithms to Grow the World's Knowledge - Xavier Amatriain

Machine Learning to Grow the World's Knowledge

Xavier Amatriain (@xamat)

11/10/2015

Our Mission

“To share and grow the world’s knowledge”

• Millions of questions & answers

• Millions of users

• Thousands of topics

• ...

What we care about

• Demand

• Quality

• Relevance

Data@Quora

Lots of data relations

Complex network propagation effects

Importance of topics & semantics

Machine Learning@Quora

Ranking - Answer ranking

What is a good Quora answer?

• truthful

• reusable

• provides explanation

• well formatted

• ...

Ranking - Answer ranking

How are those dimensions translated into features?

• Features that relate to the text quality itself

• Interaction features (upvotes/downvotes, clicks, comments…)

• User features (e.g. expertise in topic)

Ranking - Feed

• Goal: Present most interesting stories for a user at a given time

• Interesting = topical relevance + social relevance + timeliness

• Stories = questions + answers

• ML: Personalized learning-to-rank approach

• Relevance-ordered vs time-ordered = big gains in engagement

• Challenges:

• potentially many candidate stories

• real-time ranking

• optimize for relevance
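The "Interesting = topical relevance + social relevance + timeliness" line can be sketched in a few lines of Python. This is an illustration only: the deck says the ranking is learned (personalized learning-to-rank), whereas the fixed, unweighted sum below, the `Story` fields, and the 24-hour half-life are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Story:
    story_id: str
    topical: float    # hypothetical topical-relevance score in [0, 1]
    social: float     # hypothetical social-relevance score in [0, 1]
    age_hours: float  # time since the story was created


def timeliness(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def interest(story: Story) -> float:
    """Unweighted sum of the three components named on the slide (an assumption)."""
    return story.topical + story.social + timeliness(story.age_hours)


def rank_by_relevance(stories):
    """Relevance-ordered feed: most interesting story first."""
    return sorted(stories, key=interest, reverse=True)


def rank_by_time(stories):
    """Time-ordered feed: newest story first."""
    return sorted(stories, key=lambda s: s.age_hours)


stories = [
    Story("a", topical=0.9, social=0.8, age_hours=48.0),
    Story("b", topical=0.1, social=0.0, age_hours=0.0),
    Story("c", topical=0.5, social=0.2, age_hours=24.0),
]
```

With these made-up scores the two orderings disagree: the time-ordered feed leads with the newest story even though it has little topical or social relevance for this user.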

Feed dataset: impression logs

[Figure: impression log with action labels: upvote, downvote, expand, answer, pass, follow]

What is relevance?

● Value of showing a story to a user, e.g. weighted sum of actions:

$v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$

● Goal: predict this value for new stories. 2 possible approaches:

○ predict value directly

$v_{pred} = f(x)$

■ pros: single regression model

■ cons: can be ambiguous, coupled

○ predict probabilities for each action, then compute expected value:

$v_{pred} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$

■ pros: better use of supervised signal, decouples action models from action values

■ cons: more costly, one classifier per action
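A minimal sketch of the two value formulas, assuming made-up action values (the deck gives no numbers for v_a) and assuming the per-action probabilities p(a | x) come from classifiers trained elsewhere:

```python
# Hypothetical action values v_a; the deck does not specify them.
ACTION_VALUES = {"upvote": 1.0, "downvote": -2.0, "expand": 0.25, "follow": 0.5}


def observed_value(actions):
    """v = sum_a v_a * 1{y_a = 1}: add up the values of the actions that occurred."""
    return sum(ACTION_VALUES[a] for a in set(actions))


def expected_value(action_probs):
    """v_pred = sum_a v_a * p(a | x), given one predicted probability per action."""
    return sum(ACTION_VALUES[a] * p for a, p in action_probs.items())


# A logged impression where the user expanded and upvoted the story.
logged = observed_value(["upvote", "expand"])

# Predicted probabilities for a new story (illustrative numbers only).
predicted = expected_value(
    {"upvote": 0.5, "downvote": 0.25, "expand": 0.0, "follow": 0.0}
)
```

Because the action values live in a separate table, they can be retuned without retraining the per-action models, which is the decoupling listed as a pro of the second approach.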

Feature engineering

● Essential for getting good rankings

● Better if updated in real-time (more reactive)

● Main sets of features:

○ user (e.g. age, country, recent activity)

○ story (e.g. popularity, trendiness, quality)

○ interactions between the two (e.g. topic or author affinity)
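The three feature groups (user, story, user-story interaction) might be assembled along these lines. The dictionary keys, the choice of features, and the affinity definitions are hypothetical and chosen only to make the grouping concrete:

```python
def build_features(user, story):
    """Combine user, story, and interaction features into one flat dict."""
    followed_topics = set(user["followed_topics"])
    story_topics = set(story["topics"])
    overlap = len(followed_topics & story_topics)
    return {
        # user features
        "user_country": user["country"],
        "user_recent_actions": user["recent_actions"],
        # story features
        "story_upvotes": story["upvotes"],
        "story_age_hours": story["age_hours"],
        # interaction features between the two
        "topic_affinity": overlap / len(story_topics) if story_topics else 0.0,
        "follows_author": float(story["author_id"] in user["followed_users"]),
    }


user = {
    "country": "ES",
    "recent_actions": 12,
    "followed_topics": ["ml", "python"],
    "followed_users": ["u42"],
}
story = {
    "topics": ["ml", "stats"],
    "upvotes": 37,
    "age_hours": 5.0,
    "author_id": "u42",
}
features = build_features(user, story)
```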

Models

● Linear

○ simple, fast to train

○ manual, non-linear transforms for richer representation (buckets, ngrams)

● Decision trees

○ learn non-linear representations

● Tree ensembles

○ Random forests

○ Gradient boosted decision trees

● In-house C++ training code, third-party libraries for prototyping new models
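To illustrate why linear models need manual transforms such as buckets, here is a small sketch: a raw count gets a single weight, while a bucketized count gets one weight per range, so the score no longer has to be a straight line in the count. The bucket boundaries and weights are invented for the example, not taken from the deck.

```python
from bisect import bisect_right


def bucketize(value, boundaries):
    """One-hot encode `value` by bucket; `boundaries` must be sorted ascending.

    len(boundaries) + 1 buckets: (-inf, b0), [b0, b1), ..., [b_last, +inf).
    """
    one_hot = [0.0] * (len(boundaries) + 1)
    one_hot[bisect_right(boundaries, value)] = 1.0
    return one_hot


def linear_score(features, weights, bias=0.0):
    """Plain linear model: bias plus the dot product of weights and features."""
    return bias + sum(w * x for w, x in zip(weights, features))


UPVOTE_BOUNDARIES = [1, 10, 100]        # hypothetical bucket edges
BUCKET_WEIGHTS = [-0.5, 0.0, 0.4, 0.6]  # hypothetical learned weights

score = linear_score(bucketize(10, UPVOTE_BOUNDARIES), BUCKET_WEIGHTS)
```

Decision trees and tree ensembles learn splits like these on their own, which is the "learn non-linear representations" point above.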

Scalability: feed backend system

[Diagram: requests from Web (python) reach an Aggregator; boxes labeled Aggregator 1-3 and Leaf 1-3; keys shown: user_id, object_id]
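The diagram names aggregators and leaves but does not say how they cooperate. One common reading of such a layout is scatter-gather: an aggregator asks every leaf for its best candidates and merges the partial results. The sketch below assumes that reading; the `Leaf` and `Aggregator` classes, their methods, and the in-memory data are hypothetical.

```python
import heapq


class Leaf:
    """Holds pre-scored candidates for a shard: {user_id: [(score, object_id), ...]}."""

    def __init__(self, scored_objects):
        self.scored_objects = scored_objects

    def top_k(self, user_id, k):
        """Return this shard's k best (score, object_id) pairs for the user."""
        return heapq.nlargest(k, self.scored_objects.get(user_id, []))


class Aggregator:
    """Fans a request out to all leaves and merges their partial top-k lists."""

    def __init__(self, leaves):
        self.leaves = leaves

    def top_k(self, user_id, k):
        partials = (c for leaf in self.leaves for c in leaf.top_k(user_id, k))
        return [object_id for _, object_id in heapq.nlargest(k, partials)]


leaves = [
    Leaf({"u1": [(0.9, "q1"), (0.2, "q2")]}),
    Leaf({"u1": [(0.7, "q3")]}),
    Leaf({}),
]
feed = Aggregator(leaves).top_k("u1", 2)
```

Asking each leaf for k items is enough to guarantee the merged top k is exact, since no item outside a leaf's own top k can be in the global top k.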

Recommendations - Topics

Goal: Recommend new topics for the

user to follow

• Based on:

• Other topics followed

• Users followed

• User interactions

• Topic-related features

• ...
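As one concrete instance of the "other topics followed" signal, topics can be scored by how often they are followed by users who share topics with the target user. This co-follow heuristic is a stand-in chosen for illustration; the deck does not say which model is used, and the function and data below are hypothetical.

```python
from collections import Counter


def recommend_topics(user, follows, k=3):
    """Recommend up to k unfollowed topics; `follows` maps user_id -> set of topics.

    Each candidate topic is weighted by the number of topics its follower
    shares with `user`; ties are broken alphabetically.
    """
    mine = follows[user]
    counts = Counter()
    for other, topics in follows.items():
        if other == user:
            continue
        shared = len(mine & topics)
        if shared == 0:
            continue  # users with no topics in common carry no signal here
        for topic in topics - mine:
            counts[topic] += shared
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:k]]


follows = {
    "alice": {"ml", "python"},
    "bob": {"ml", "python", "stats"},
    "carol": {"ml", "cooking"},
    "dave": {"gardening"},
}
suggestions = recommend_topics("alice", follows)
```

The other signals on the slide (users followed, user interactions, topic-related features) would enter a real system as additional features rather than through this single count.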

Recommendations - Users

Goal: Recommend new users to follow

• Based on:

• Other users followed

• Topics followed

• User interactions

• User-related features

• ...
