129
Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Embed Size (px)

Citation preview

Page 1: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Graphical Models for the Internet

Amr Ahmed and Alexander SmolaYahoo Research, Santa Clara, CA

Page 2: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Thus far ...

• Motivation• Basic tools– Clustering– Topic Models

• Distributed batch inference– Local and global states– Star synchronization

Page 3: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Up next

• Inference – Online Distributed Sampling– Single machine multi-threaded inference– Online EM and Submodular Selection

• Applications– User tracking for behavioral Targeting– Content understanding– User modeling for content recommendation

Page 4: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4. Online Model

Page 5: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Scenarios• Batch Large-Scale

– Covered in part 1

• Mini-batches– We already have a model– Data arrives in batches– We would like to keep model up-to-data

• Time-sensitive– Data arrives one item at a time– Model should be up-to-data

Time

Time

Page 6: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.1 Dynamic Clustering

Page 7: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

The Chinese Restaurant Process• Allows the number of mixtures to grow with the data• They are called non-parametric models– Means the number of effective parameters grow with data– Still have hyper-parameters that control the rate of growth

• a: how fast a new cluster/mixture is born?• G0: Prior over mixture component parameters

Page 8: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

The Chinese Restaurant Process

-For data point xi - Choose table j mj and Sample xi ~ f(fj)- Choose a new table K+1 a

- Sample fK+1 ~ G0 and Sample xi ~ f(fK+1)

Generative Process

f2f1 f3

62

63 6

1

6

The rich gets richer effectCANNOT handle sequential data

Page 9: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Recurrent CRP (RCRP) [Ahmed and Xing 2008]

• Adapts the number of mixture components over time– Mixture components can die out– New mixture components are born at any time– Retained mixture components parameters evolve according

to a Markovian dynamics

Page 10: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

The Recurrent Chinese Restaurant Process

f2,1f1,1 f3,1 T=1

Dish eaten at table 3 at time epoch 1OR the parameters of cluster 3 at time epoch 1

-Customers at time T=1 are seated as before:- Choose table j mj,1 and Sample xi ~ f(fj,1)- Choose a new table K+1 a

- Sample fK+1,1 ~ G0 and Sample xi ~ f(fK+1,1)

Generative Process

Page 11: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

The Recurrent Chinese Restaurant Process

f2,1f1,1 f3,1 T=1

T=2

f2,1f1,1 f3,1

m'1,1=2 m'2,1=3 m'3,1=1

Page 12: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,1f1,1 f3,1

m'1,1=2 m'2,1=3 m'3,1=1

Page 13: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,1f1,1 f3,1

62 6

3 61

6

m'1,1=2 m'2,1=3 m'3,1=1

Page 14: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,1f1,1 f3,1

62

m'1,1=2 m'2,1=3 m'3,1=1

Page 15: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,1f1,2 f3,1

Sample f1,2 ~ P(.| f1,1)

m'1,1=2 m'2,1=3 m'3,1=1

Page 16: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,1f1,2 f3,1

16

21 16

3 16

1

16

And so on ……

m'1,1=2 m'2,1=3 m'3,1=1

Page 17: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,2f1,2 f3,1

At the end of epoch 2

f4,2

Newly born clusterDied out cluster

m'1,1=2 m'2,1=3 m'3,1=1

Page 18: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1 T=1

T=2f2,2f1,2 f3,1

N1,1=2 N2,1=3 m'3,1=1

f4,2

T=3

f2,2f1,2 f4,2

m'4,2=1m'2,2=2m'1,2=2

Page 19: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Recurrent Chinese Restaurant Process• Can be extended to model higher-order

dependencies• Can decay dependencies over time– Pseudo-counts for table k at time t is

History size

Decay factory Number of customers sitting at table K at time epoch t-h

å=

-

-

÷ø

öçè

æH

hhtk

h

me1

,r

Page 20: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

f2,1f1,1 f3,1

T=1

T=2f2,2f1,2 f3,1

m'1,1=2 m'2,1=3 m'3,1=1f4,2

T=3f2,2f1,2 f4,2

å=

-

-

÷ø

öçè

æH

hhtk

h

me1

,r

m'2,3

m'2,3 =

Page 21: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.2 Online Distributed Inference

Tracking Users Interest

Page 22: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Characterizing User Interests• Short term vs long-term

Jan April July Oct

Music

Housing

Furniture

Buying a car

Travel plans

Page 23: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Characterizing User Interests• Short term vs long-term• Latent

Jan April July Oct

millage

fast

mortgageGaga

seafood

Barcelona

used

Page 24: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulationInput

• Queries issued by the user or tags of watched content• Snippet of page examined by user• Time stamp of each action (day resolution)

Output

• Users’ daily distribution over interests• Dynamic interest representation• Online and scalable inference• Language independent

FlightLondonHotelweather

SchoolSuppliesLoansemester

classesregistrationhousingrent

Page 25: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

Input

• Queries issued by the user or tags of watched content• Snippet of page examined by user• Time stamp of each action (day resolution)

Output

• Users’ daily distribution over interests• Dynamic interest representation• Online and scalable inference• Language independent

Page 26: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

When to show a financing ad?

Page 27: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

When to show a financing ad?

Page 28: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

When to show a financing ad?

Page 29: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

When to show a hotel ad?

Page 30: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

When to show a hotel ad?

Page 31: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Problem formulation

FlightLondonHotelweather

Travel

BackTo school

financeSchoolSuppliesLoansemester

classesregistrationhousingrent

Input

• Queries issued by the user or tags of watched content• Snippet of page examined by user• Time stamp of each action (day resolution)

Output

• Users’ daily distribution over interests• Dynamic interest representation• Online and scalable inference• Language independent

Page 32: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Mixed-Membership Formulation

job Career

BusinessAssistant

HiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Diet Cars Job Finance

Objects

Mixtures

Degree ofmembership

Job Hiring speed price

part-time CamryCareer opening bonus package

card diet caloriesloan recipe milk

Weight lb kg

Page 33: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

In Graphical Notation

Page 34: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

In Polya-Urn Representation

• Collapse multinomial variables:• Fixed-dimensional Hierarchal Polya-Urn

representation– Chinese restaurant franchise

xx

Page 35: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chicken

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Car speed offercamry accord career

Global topics trends

Topic word-distributions

User-specific topics trends(mixing-vector)

User interactions: queries, keyword from pages viewed

Page 36: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chicken

Car speed offercamry accord career

• For each user interaction• Choose an intent from local distribution

• Sample word from the topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample word from the new topic word-

distribution

Generative Process

………

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 37: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza

Car speed offercamry accord career

• For each user interaction• Choose an intent from local distribution

• Sample word from the topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample word from the new topic word-

distribution

Generative Process

………

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 38: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza

Car speed offercamry accord career

• For each user interaction• Choose an intent from local distribution

• Sample word from the topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample word from the new topic word-

distribution

Generative Process

………

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 39: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza hiring

Car speed offercamry accord career

• For each user interaction• Choose an intent from local distribution

• Sample word from the topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample word from the new topic word-

distribution

Generative Process

………

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 40: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza millage

Car speed offercamry accord career

• For each user interaction• Choose an intent from local distribution

• Sample word from topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample from word the new topic word-

distribution

Generative Process

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 41: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza millage

Car speed offercamry accord career

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Problems- Static Model- Does not evolve user’s interests- Does not evolve the global trend of interests- Does not evolve interest’s distribution over terms

Page 42: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza millage

Car speed offercamry accord career

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

At time t At time t+1

Build a dynamic model

Connect each level using a RCRP

Page 43: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

At time t At time t+1 At time t+2 At time t+3

User 1process

User 2process

User 3process

Globalprocess

mm'

nn'

Which time kernel to use at each level?

Page 44: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Pseudo counts = *

Decay factor

Food Chickenpizza millage

Car speed offercamry accord career

At time t

Observation 1-Popular topics at time t are likely to be popular at time t+1- fk,t+1 is likely to smoothly evolve from fk,t

At time t+1job

CareerBusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 45: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza millage

Car speed offercamry accord career

At time t At time t+1job

CareerBusinessAssistant

HiringPart-timeReceptio

nist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Observation 1-Popular topics at time t are likely to be popular at time t+1- fk,t+1 is likely to smoothly evolve from fk,t

CarAltimaAccordBookKelleyPricesSmallSpeed

fk,t+1 ~ Dir(bk,t+1)fk,t

CarBlueBookKelleyPricesSmallSpeedlarge

Intuition

Captures current trend of the car industry

(new release for e.g.)

~

Page 46: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food Chickenpizza millage

Car speed offercamry accord career

At time t At time t+1

Observation 2- User prior at time t+1 is a mixture of the user short and long term interest

How do we get a prior that captures both long

and short term interest?

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarAltimaAccord

BlueBookKelleyPricesSmallSpeed

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 47: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

All

week

job Career

BusinessAssistant

HiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

month

Time t t+1

FoodChickenpizza

recipejobhiring

Part-timeOpeningsalary

foodchickenPizzamillage

Kellyrecipecuisine

Diet Cars Job Finance

Prior for user actions at time t

μ

μ2

μ3

Long-termshort-term

Page 48: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Food ChickenPizza millage

Car speed offercamry accord career

At time t At time t+1

• For each user interaction• Choose an intent from local distribution

• Sample word from the topic’s word-distribution

• Choose a new intent l • Sample a new intent from the global distribution• Sample word from the new topic word-

distribution

Generative Process

short-termpriors

job Career

BusinessAssistant

HiringPart-timeReceptio

nist

CarAltimaAccord

BlueBookKelleyPricesSmallSpeed

BankOnlineCreditCarddebt

portfolioFinanceChase

RecipeChocolate

PizzaFood

ChickenMilk

ButterPowder

Page 49: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Polya-UrnRCRF Process

?

Page 50: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Simplified Graphical Model

~

~

~

~

~

~

At time t At time t+1

Page 51: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Simplified Graphical Model

~

~

~

~

~

~

CarAltimaAccordBookKelleyPricesSmallSpeed

CarBlueBookKelleyPricesSmallSpeedlarge

Page 52: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Simplified Graphical Model

~

~

~

~

~

~

Food ChickenPizza millage

Page 53: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Simplified Graphical Model

~

~

~

~

~

~

Page 54: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

~

~

~

~

~

~

Simplified Graphical Model

Topics evolve over time?

User’s intent evolve over time?

Capture long and term interests of users?

Page 55: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

User 1process

User 2process

User 3process

Globalprocess

At time t At time t+1 At time t+2 At time t+3

mm'

nn'

Page 56: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.2 Online Distributed Inference

Work Flow

Page 57: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Work Flowtoday

CurrentUsers’ models

Systemstate

newUsers’ models

User interactions

User interactions

User interactions

User interactions

User interactions

Daily Update

(inference)

Hundred of millions

Page 58: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

~

~

~

~

~

~

Online Scalable Inference• Online algorithm– Greedy 1-particle filtering algorithm– Works well in practice – Collapse all multinomials except Ωt

• This makes distributed inference easier

– At each time t:

• Distributed scalable implementation– Used first part architecture as a subroutine– Added synchronous sampling capabilities

Page 59: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Inference (at time t)

W

f

q

z

w

Page 60: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

clientclient

Distributed Inference (at time t)

q

z

w

q

z

w

W

f

Collapse all multinomial

Page 61: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

After collapsing

Food ChickenPizza millage

job CareerBusines

sAssistan

tHiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolio

Finance

Chase

RecipeChocol

atePizzaFood

Chicken

MilkButterPowde

r

client client

Car speed offercamry accord career

Use Star-Synchronization

Page 62: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Fully Collapsed

Food ChickenPizza millage

job CareerBusines

sAssistan

tHiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolio

Finance

Chase

RecipeChocol

atePizzaFood

Chicken

MilkButterPowde

r

Shared memory

client client

Car speed offercamry accord career

Page 63: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Inference (at time t)

Food ChickenPizza millage

client client

Car speed offercamry accord career

Local trendGlobal trend Topic factor

Page 64: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Semi-Collapsed

Food ChickenPizza millage

job CareerBusines

sAssistan

tHiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolio

Finance

Chase

RecipeChocol

atePizzaFood

Chicken

MilkButterPowde

r

Shared memory

client client

Car speed offercamry accord career

Page 65: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Semi-Collapsed

Food ChickenPizza millage

job CareerBusines

sAssistan

tHiringPart-time

Receptionist

CarBlueBookKelleyPricesSmallSpeedlarge

BankOnlineCreditCarddebt

portfolio

Finance

Chase

RecipeChocol

atePizzaFood

Chicken

MilkButterPowde

r

Shared memory

client client

ΩtΩtΩt

Car speed offercamry accord career

Page 66: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Semi-Collapsed

Food ChickenPizza millage

client client

ΩtΩt

Car speed offercamry accord career

Page 67: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Sampling Cycle

Sample ZFor users

Sample ZFor users

Sample ZFor users

Sample ZFor users

Barrier

Write counts to

memcached

Write counts to

memcached

Write counts to

memcached

Write counts to

memcached

Collect counts and sample W

Do nothing Do nothing Do nothing

Barrier

Read W from memcached

Read Wfrom

memcached

Read Wfrom

memcached

Read Wfrom

memcached

Sample Ωt

Requires a reduction step

Page 68: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Sampling Cycle

Sample ZFor users

Sample ZFor users

Sample ZFor users

Sample ZFor users

Barrier

Write counts Write counts Write counts Write counts

Collect counts and sample W

Do nothing Do nothing Do nothing

Barrier

Read W from Read Wfrom

Read Wfrom

Read Wfrom

Page 69: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Sampling W

• Introduce auxiliary variable mkt

– How many times the global distribution was visited – ~ AnotniaK

W

m

Page 70: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Sampling Cycle

Sample ZFor users

Sample ZFor users

Sample ZFor users

Sample ZFor users

Barrier

Write counts Write counts Write counts Write counts

Collect counts and sample W

Do nothing Do nothing Do nothing

Barrier

Read W from Read Wfrom

Read Wfrom

Read Wfrom

Page 71: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.2 Online Distributed Inference

Behavioral Targeting

Page 72: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

• Tasks is predicting convergence in display advertising• Use two datasets– 6 weeks of user history– Last week responses to Ads are used for testing

• Baseline:– User raw data as features– Static topic model

Experimental Results

Page 73: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

InterpretabilityUser-1 User-2

Page 74: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Performance in Display Advertising

ROC

Number of conversions

Page 75: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Performance in Display Advertising

Weighted ROC measure

StaticBatch modelsEffect of number of topics

Page 76: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

How Does It Scale?

2 Billion instances with 5M vocabularyusing 1000 machines

one iteration took ~ 3.8 minutes

Page 77: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Distributed Inference Revisited

Page 78: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

To collapse or not to collapse?• Not collapsing– Keeps conditional independence

• Good for parallelization• Requires synchronous sampling

– Might mix slowly

• Collapsing – Mixes faster– Hinder parallelism– Use star-synchronization

• Works well if sibling depends on each others via aggregates

• Requires asynchronous communication

Page 79: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Inference Primitive

• Collapse a variable– Star synchronization for the sufficient statistics

• Sampling a variable– Local • Sample it locally (possibly using the synchronized statistics)

– Shared• Synchronous sampling using a barrier

• Optimizing a variable– Same as in the shared variable case– Ex. Conditional topic models

Page 80: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Online Models• Batch Large-Scale

– Covered in part 1

• Mini-batches– We already have a model– Data arrives in batches– We would like to keep model up-to-data

• Time-sensitive– Data arrives one item at a time– Model should be up-to-data

Time

Time

Page 81: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

What Is Coming?

• Inference – Online Distributed Sampling– Single machine multi-threaded inference– Online EM and Submodular Selection

• Applications– User tracking for behavioral Targeting– Content understanding– User modeling for content recommendation

Page 82: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.2 Scalable SMC Inference

Storylines

Page 83: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

News Stream

Page 84: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

News Stream

• Realtime news stream – Multiple sources (Reuters, AP, CNN, ...)– Same story from multiple sources– Stories are related

• Goals– Aggregate articles into a storyline– Analyze the storyline (topics, entities)

• How does the story develop over time?• Who are the main entities?• What topics are addressed?

Page 85: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

A Unified Model• Jointly solves the three main tasks– Clustering, – Classification – Analysis

• Building blocks– A Topic model

• High-level concepts (unsupervised classification)

– Dynamic clustering (RCRP)• Discover tightly-focused concepts

– Named entities– Story developments

Page 86: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Infinite Dynamic Cluster-Topic HybridSports

gamesWonTeamFinal

SeasonLeague

held

Politics

GovernmentMinister

AuthoritiesOpposition

OfficialsLeadersgroup

Unrest

PoliceAttack

runman

grouparrested

move

UEFA-soccer

ChampionsGoalCoachStrikerMidfieldpenalty

Juventus AC Milan Lazio RonaldoLyon

g

Tax-Bill

TaxBillionCutPlanBudgetEconomy

BushSenateFleischerWhite HouseRepublican

Page 87: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Infinite Dynamic Cluster-Topic HybridSports

gamesWonTeamFinal

SeasonLeague

held

Politics

GovernmentMinister

AuthoritiesOpposition

OfficialsLeadersgroup

Unrest

PoliceAttack

runman

grouparrested

move

UEFA-soccer

ChampionsGoalCoachStrikerMidfieldpenalty

Juventus AC Milan Lazio RonaldoLyon

g

Tax-Bill

TaxBillionCutPlanBudgetEconomy

BushSenateFleischerWhite HouseRepublican

Border-Tension

NuclearBorderDialogueDiplomaticmilitantInsurgencymissile

PakistanIndiaKashmirNew DelhiIslamabadMusharrafVajpayee

Page 88: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

The Graphical Model

• Topic model• Topics per cluster• RCRP for cluster• Hierarchical DP for

article• Separate model for

named entities• Story specific

correction

Page 89: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.2 Fast SMC Inference

Inference via SMC

Page 90: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Online Inference Algorithm

• A Particle filtering algorithm• Each particle maintains a hypothesis

– What are the stories– Document-story associations– Topic-word distributions

• Collapsed sampling– Sample (zd,sd) only for each document

Page 91: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Particle Filter Representation

Page 92: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Fold the document into the structure of each filter

- s and z are tightly coupled- Alternatives

- Sample s then sample z (high variance)

w w w w w wentitiesDocument td

z z z z z zs

Page 93: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Fold the document into the structure of each filter

- s and z are tightly coupled- Alternatives

- Sample s then sample z (high variance)- Sample z then sample s (doesn’t make sense)

w w w w w wentitiesDocument td

z z z z z z s

Page 94: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Fold the document into the structure of each filter

- s and z are tightly coupled- Alternatives

- Sample s then sample z (high variance)- Sample z then sample s (doesn’t make sense)

- Idea - Run a few iterations of MCMC over s and z- Take last sample as the proposed value

Page 95: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

How good each filter look now?

Page 96: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Get rid of bad filterReplicate good one

Page 97: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Get rid of bad filterReplicate good one

Page 98: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated

1

State P1

1 2

State P2State P1

1 23

State P2State P3 State P1

Page 99: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated– Use thread-safe Inheritance tree

1

State P1

1 2

State P2State P1

1 23

State P2State P3 State P1

Page 100: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated– Use thread-safe Inheritance tree [extends Canini et. Al 2009]

1

state

1

Page 101: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated– Use thread-safe Inheritance tree [extends Canini et. Al 2009]

Root

state

2

state

1

State1

1 2

Page 102: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated– Use thread-safe Inheritance tree [extends Canini et. Al 2009]

Root

3

state

1

2

State State

state

State1

1 2

1 23

Page 103: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage• Particles get replicated– Use thread-safe Inheritance tree [extends Canini et. Al 2009]

– Inverted representation for fast lookup

Root

3

India: story 5Pakistan: story 1

1

2

(empty) Congress: story 2

Bush: story 2India: story 3

India: story 1, US: story 2, story 3

1

1 2

1 23

Page 104: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Efficient Computation and Storage

• Why this is useful?

• Only focus on stories that mention at least one entity– Otherwise pre-compute and reuse

• We can use fast samplers for z as well [Yao et. Al. KDD09]

Page 105: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Experiments

• Yahoo! News datasets over two months– Three sub-sampled sets with different characteristics

• Editorially-labeled documents– Cannot-like and must-link pairs

• Performance measures using clustering accuracy• Baseline– A strong offline Correlation clustering algorithm [WSDM 11]

• Scaled with LSH to compute neighborhood graph (similar to Petrovic 2010)

Page 106: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Structured BrowsingSports

gamesWonTeamFinal

SeasonLeague

held

Politics

GovernmentMinister

AuthoritiesOpposition

OfficialsLeadersgroup

Unrest

PoliceAttach

runman

grouparrested

move

Border-Tension

NuclearBorderDialogueDiplomaticmilitantInsurgencymissile

PakistanIndiaKashmirNew DelhiIslamabadMusharrafVajpayee

UEFA-soccer

ChampionsGoalLegCoachStrikerMidfieldpenalty

Juventus AC Milan Real Madrid Milan Lazio RonaldoLyon

Tax-bills

TaxBillionCutPlanBudgetEconomylawmakers

BushSenateUSCongressFleischerWhite HouseRepublican

Page 107: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Structured Browsing

More Like India-Pakistan storyMore Like India-Pakistan story

Middle-east-conflict

PeaceRoadmapSuicideViolenceSettlementsbombing

Israel PalestinianWest bankSharonHamasArafat

Based on topics

Nuclear programs

Nuclearsummitwarningpolicymissileprogram

North KoreaSouth KoreaU.SBushPyongyang

Nuclear+ topics [politics]

Border-Tension

NuclearBorderDialogueDiplomaticmilitantInsurgencymissile

PakistanIndiaKashmirNew DelhiIslamabadMusharrafVajpayee

Page 108: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Structured BrowsingSports

gamesWonTeamFinal

SeasonLeague

held

Politics

GovernmentMinister

AuthoritiesOpposition

OfficialsLeadersgroup

Unrest

PoliceAttach

runman

grouparrested

move

Border-Tension

NuclearBorderDialogueDiplomaticmilitantInsurgencymissile

PakistanIndiaKashmirNew DelhiIslamabadMusharrafVajpayee

UEFA-soccer

ChampionsGoalLegCoachStrikerMidfieldpenalty

Juventus AC Milan Real Madrid Milan Lazio RonaldoLyon

Tax-bills

TaxBillionCutPlanBudgetEconomylawmakers

BushSenateUSCongressFleischerWhite HouseRepublican

More on Personalization later on the talk

Page 109: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Quantitative Evaluation

Number of topics = 100

Effect of number of topics

Page 110: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Scalability

Page 111: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Model Contribution

• Named entities are very important• Removing time increase processing up to 2

seconds per document

Page 112: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Putting Things Together

Page 113: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Time vs. Machines

• Data arrives dynamically• How to keep models up to date?

Batch Mini-batches Truly online

Single-Machine GibbsVariational

Online-LDA SMC

Multi-Machine Star-Synch. Star-Synch + Synchronous step

?

Page 114: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

4.3 User Preference

Online EM and Submodularity

Page 115: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Storyline Summarization

• How to summarize a storyline with few articles?

Earthquake & Tsunami Nuclear PowerRescue & Relief Economy

Time

Page 116: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Storyline SummarizationEarthquake & Tsunami Nuclear PowerRescue & Relief Economy

Time

• How to summarize a storyline with few articles?• How to personalize the summary?

Page 117: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Storyline SummarizationEarthquake & Tsunami Nuclear PowerRescue & Relief Economy

Time

• How to summarize a storyline with few articles?• How to personalize the summary?

Page 118: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

User Interaction

• Passive– We observe the user generated contents– Model user based on those content using unsupervised techniques

• Explicit– We present users with content– User give explicit feedback

• Like/dislike

– Learn user preference using supervised techniques• Implicit

– Mixture between the two– Present the user with items– Observe which items the user interact with– Learning user preference using semi-supervised models

Page 119: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

User Satisfaction

• Modular– Present users with items she prefers

• Regardless of the context

– Targets relevance– Ex: vector space models

• Submodular– More of the same thing is not always better

• Dimensioning return

– Targets diversity – Ex: TDN [ElArini et. Al. KDD 09]

Page 120: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Sequential Click-View Model

Modeling Views based on position

Page 121: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Sequential Click-View Model

Threshold Information gain#clicks

Modeling clicks using position and information gain

Coverage function’s weights are learnt

Page 122: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Sequential Click-View Model

Selected Summary

Story

Features

Modular

Submodular

Page 123: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Online Inference

• Treat missing views as hidden variables– Realistic interaction model

• Use the online EM algorithm– Infer the value of hidden variables

• Optimize parameters using SGD– Use additive weights• Background + story + category + user

Page 124: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Online Inference

Page 125: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

How Does it Work?

Page 126: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

How Does It Work?

Page 127: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

5. Summary Future Directions

Page 128: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Summary• Tools

– Load distribution, balancing and synchronization– Clustering, Topic Models

• Models– Dynamic non-parametric models– Sequential latent variable models

• Inference Algorithms– Distributed batch– Sequential Monte Carlo

• Applications– User profiling– News content analysis & recommendation

Page 129: Graphical Models for the Internet Amr Ahmed and Alexander Smola Yahoo Research, Santa Clara, CA

Future Directions

• Theoretical bounds and guarantees• Network data– Graph partitioning

• Non-parametric models– Learning structure from data

• Working under communication constraints• Data distribution for particle filters