Estimating the causal impact of recommender systems

AMIT SHARMA, JAKE HOFMAN, DUNCAN WATTS

MICROSOFT RESEARCH, NEW YORK

2nd International Conference on Computational Social Science

How much do they change user behavior?

Naively, up to 30% of traffic comes from recommendations

“Burton Snowboard, a sports retailer, reported that personalized product recommendations have driven nearly 25% of total sales since it began offering them in 2008. Prior to this, Burton’s customer recommendations consisted of items from its list of top-selling products.”

Almost surely an over-estimate of the actual effect, because of correlated demand between a product and its recommendations.

Example: product browsing on Amazon.com

Counterfactual browsing: no recommendations

Problem: Correlated demand may drive page visits, even without recommendations

The problem of correlated demand

Demand for winter accessories

Visits to winter hat

Rec. visits to winter gloves

Goal: Estimate the extra activity caused by a recommender system that would not have happened otherwise

[Figure: observed click-throughs are split into causal and convenience click-throughs; without the recommender, only the convenience click-throughs remain.]

Ideal experiment: A/B Test

Treatment (A) vs. Control (B)

But experiments:
• may be costly
• may hamper user experience
• require full access to the system

Experiments may be costly or infeasible.

Can we derive an observational method to identify the causal effect of recommendations?

Using natural variations to simulate an experiment

Studying sudden spikes, “shocks” to demand for a book

[Carmi et al. 2012]

The same author’s recommended book may also have a shock

Past work: Controlling for correlated demand

Uses statistical models to control for confounds.

Carmi et al. [2012], Oestreicher-Singer and Sundararajan [2012] and Lin [2013] construct “complementary sets” of similar, non-recommended products.

Garfinkel et al. [2006] and Broder et al. [2015] compare to model-predicted clicks without recommendations.

But:
1. These assumptions are hard to verify.
2. Finding examples of valid shocks requires ingenuity and restricts researchers to very specific categories.

Shock-IV: A simpler, more robust method for estimating causal impact.

Distinguishing between recommendation and direct traffic

All visits to a product:
• Recommender visits
• Direct visits (search visits, direct browsing): proxy for unobserved demand

The Shock-IV strategy: Searching for valid shocks

The Shock-IV strategy: Filtering out invalid shocks

Search for products that receive a sudden shock in their traffic but direct traffic for their recommendations remains constant.

Why does it work? Shock as an instrumental variable

Demand

Focal visits (X)

Rec. visits (Y)

Sudden Shock

Direct visits (Y)

Computing the causal estimate

Increase in recommendation clicks (Δr)

Causal CTR (ρ) = Δr/Δv

*Same as Wald estimator for instrumental variables

Increase in visits to focal product (Δv)
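The ratio above can be sketched in a few lines. This is a minimal illustration, assuming each increase is measured as the shock-day count minus a pre-shock baseline count; the slides do not say how the baseline is taken, and the function name, arguments, and counts are hypothetical.

```python
def causal_ctr(visits_before, visits_shock, rec_clicks_before, rec_clicks_shock):
    """Wald-style ratio: extra recommendation clicks per extra focal-product visit."""
    delta_v = visits_shock - visits_before          # increase in focal visits
    delta_r = rec_clicks_shock - rec_clicks_before  # increase in recommendation clicks
    if delta_v <= 0:
        raise ValueError("a shock requires an increase in focal-product visits")
    return delta_r / delta_v


# Hypothetical counts, for illustration only.
rho = causal_ctr(visits_before=100, visits_shock=600,
                 rec_clicks_before=10, rec_clicks_shock=25)
print(rho)  # 15 extra clicks over 500 extra visits
```

The ratio is only meaningful when the shock changes recommendation clicks solely through focal-product visits, which is what the exclusion restriction later in the talk is meant to check.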

The shock-IV strategy: In equations

At any time t:

When product i experiences a shock in page visits:

(Because constant dj implies constant convenience visits)
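The equations on this slide did not survive extraction. The block below is one sketch that is consistent with the estimator ρ = Δr/Δv and with the note about constant d_j. The notation is assumed, not taken from the slide: v_i^t is visits to focal product i, r_ij^t is click-throughs from i to its recommendation j, c_ij^t is convenience click-throughs, and d_j^t is direct visits to j. It should not be read as the authors' exact formulation.

```latex
% At any time t: observed click-throughs = causal part + convenience part
r_{ij}^{t} = \rho_{ij}\, v_{i}^{t} + c_{ij}^{t}

% When product i experiences a shock in page visits:
\Delta r_{ij} = \rho_{ij}\, \Delta v_{i} + \Delta c_{ij}

% If d_j stays constant, convenience visits stay constant, so \Delta c_{ij} = 0 and
\rho_{ij} = \frac{\Delta r_{ij}}{\Delta v_{i}}
```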

Application to Amazon.com, using Bing toolbar logs

Anonymized browsing logs:

• 23 million pageviews

• 1.3 million Amazon products

• 2 million Bing Toolbar users

Sept 2013-May 2014

Recreating sequence of page visits by a user

Search page → Focal product page → Recommended product page

Timestamp            URL
2014-01-20 09:04:10  http://www.amazon.com/s/ref=nb_sb_noss_1?field-keywords=George%20saunders
2014-01-20 09:04:15  http://www.amazon.com/dp/0812984250/ref=sr_1_1
2014-01-20 09:05:01  http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2

User searches for George Saunders

User clicks on the first search result

User clicks on the second recommendation
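The three steps above can be sketched as a small URL classifier. This is an illustration only: the rules are inferred from the example URLs on the slide (a path starting with `/s/` is a search page, `/dp/<ASIN>` is a product page, and a `ref` tag starting with `pd_sim` marks a click on a recommendation), and they are assumptions rather than documented Amazon behavior. The function name `classify` and the label strings are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

# /dp/<10-character product id>, optionally followed by /ref=<tag>
PRODUCT_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:/ref=([A-Za-z0-9_]+))?")


def classify(url):
    """Return (page_type, detail) for a logged URL, using the assumed rules above."""
    parsed = urlparse(url)
    if parsed.path.startswith("/s/"):
        # Search page: report the decoded query string.
        keywords = parse_qs(parsed.query).get("field-keywords", [""])[0]
        return ("search", keywords)
    match = PRODUCT_RE.search(parsed.path)
    if match:
        product_id, ref = match.group(1), match.group(2) or ""
        if ref.startswith("pd_sim"):
            # Assumed marker for arriving via a recommendation.
            return ("recommended_product", product_id)
        return ("product", product_id)
    return ("other", "")


session = [
    "http://www.amazon.com/s/ref=nb_sb_noss_1?field-keywords=George%20saunders",
    "http://www.amazon.com/dp/0812984250/ref=sr_1_1",
    "http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2",
]
for url in session:
    print(classify(url))
```

In a sequence like this one, the product page visited immediately before a `recommended_product` page would be treated as the focal product.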

I. Weekly and seasonal patterns in traffic, nearly tripling in holidays

II. 30% of all pageviews come through recommendations

III. Books and eBooks are the most popular categories by far

IV. Apparel and shoes see a substantially higher fraction of visits through recommendations

Shock-IV: Finding shocks in user visit data

We look for focal products with large and sudden increases in views relative to typical traffic.

Size of shock:
◦ Exceeds 5 times median traffic
◦ Exceeds 5 times the previous day's traffic and 5 times the mean of the last 7 days

Shocked product has:
◦ Visits from at least 10 unique users during the shock
◦ Non-zero visits for at least five out of seven days before and after the shock
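The criteria above can be sketched as a filter over one product's daily counts. This is a minimal illustration under several assumptions that the slide leaves open: the median is taken over the whole series, the 5-of-7 activity rule is applied separately to the week before and the week after, and a shock is a single day. The function name, parameters, and data are hypothetical.

```python
from statistics import mean, median


def find_shocks(visits, unique_users, factor=5, min_users=10, window=7, min_active=5):
    """Return the day indices that satisfy the shock criteria.

    visits[d] and unique_users[d] are daily counts for one focal product.
    """
    overall_median = median(visits)  # assumption: median over the full series
    shocks = []
    for day in range(window, len(visits) - window):
        before = visits[day - window:day]
        after = visits[day + 1:day + 1 + window]
        today = visits[day]
        big_enough = (
            today > factor * overall_median
            and today > factor * visits[day - 1]
            and today > factor * mean(before)
        )
        enough_users = unique_users[day] >= min_users
        active = (
            sum(v > 0 for v in before) >= min_active
            and sum(v > 0 for v in after) >= min_active
        )
        if big_enough and enough_users and active:
            shocks.append(day)
    return shocks


# Hypothetical series: a quiet product with one spike on day 7.
visits = [2, 3, 2, 0, 3, 2, 3, 40, 5, 3, 2, 4, 0, 3, 2]
unique_users = [2, 3, 2, 0, 3, 2, 3, 30, 5, 3, 2, 4, 0, 3, 2]
print(find_shocks(visits, unique_users))  # [7]
```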

Shock-IV: Ensuring exclusion restriction

Recommended product (Y) should have constant direct visits during the time of the shock.

(1-β): Ratio of maximum 14-day variation in visits to a recommended product to the size of the shock for the focal product.

β = 1: Direct traffic to Y is stable relative to the shock to the focal product.

β = 0: Direct traffic to Y varies no less than the shock to the focal product.
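A minimal sketch of the β filter follows. It assumes that "maximum 14-day variation" means the maximum minus the minimum of direct visits to the recommended product over a 14-day window around the shock, and that the size of the shock is the increase in focal-product visits. Both readings are assumptions, as are the function names and data; the 0.7 default is the β value used later in the talk.

```python
def beta(rec_direct_visits, shock_size):
    """1 minus the ratio of direct-visit variation to shock size, clipped to [0, 1]."""
    if shock_size <= 0:
        raise ValueError("shock_size must be positive")
    variation = max(rec_direct_visits) - min(rec_direct_visits)  # assumed definition
    ratio = min(variation / shock_size, 1.0)
    return 1.0 - ratio


def accept_shock(rec_direct_visits, shock_size, threshold=0.7):
    """Keep a shock only if direct traffic to the recommended product is stable enough."""
    return beta(rec_direct_visits, shock_size) >= threshold


# Hypothetical 14-day windows of direct visits to the recommended product.
stable = [10, 9, 11, 10, 12, 8, 10, 11, 9, 10, 10, 11, 9, 10]
volatile = [10, 9, 11, 10, 12, 8, 45, 30, 9, 10, 10, 11, 9, 10]
print(accept_shock(stable, shock_size=40))    # True
print(accept_shock(volatile, shock_size=40))  # False
```

A higher threshold enforces the exclusion restriction more strictly but leaves fewer shocks to estimate from.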

How to choose β

[Figure: focal product visits vs. rec. product direct visits, with example shocks marked Accept or Reject]

Using this method, we obtain more than 4,000 natural experiments!

These cover about 20% of all products that had at least 10 visits on any single day.

Estimating the causal click-through rate (ρ)

ρ = Δr_{xy,t*} / Δv_{x,t*}

At β = 0.7, causal CTR = 3%.

Causal click-through rate by product category

What fraction of the observed click-throughs are causal?

Estimating fraction of observed click-throughs that are causal

Compare the number of estimated causal clicks to all observed recommendation clicks (non-shock period).

λ = ρ_{xy} · v_{x,t} / r_{xy,t}

Only a quarter of the observed click-throughs are causal

At β = 0.7, only 25% of recommendation traffic is caused by the recommender.
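The arithmetic behind λ can be illustrated as follows. The function name and the counts are hypothetical and chosen only to show the calculation; they are not figures from the study.

```python
def causal_fraction(rho, focal_visits, observed_rec_clicks):
    """lambda = (estimated causal clicks) / (observed recommendation clicks)."""
    if observed_rec_clicks <= 0:
        raise ValueError("need at least one observed recommendation click")
    estimated_causal_clicks = rho * focal_visits
    return estimated_causal_clicks / observed_rec_clicks


# Hypothetical non-shock period: 1000 focal visits, 120 observed recommendation clicks.
lam = causal_fraction(rho=0.03, focal_visits=1000, observed_rec_clicks=120)
print(round(lam, 2))  # about 0.25
```

With these illustrative numbers, a causal CTR of 3% against a naive CTR of 12% means that roughly a quarter of the observed clicks are attributed to the recommender.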

Generalization? Shocks may be due to discounts or sales

Lower CTR may be due to the holiday season

Local average treatment effect (LATE), not fully generalizable

Shocked products are not a representative sample of all products, nor are the users who participate in them.

• Shock-IV method covers roughly one-fifth of all products with at least 10 visits on any single day.

• Our results are robust to sale or holiday effects.

• Causal estimates are consistent with experimental findings (e.g., Belluf et al. [2012]).

More generally…

A robust, scalable method for causal inference.

◦ Causal CTR for Amazon’s recommender system much less than the naïve observational CTR.

◦ Can be applied to other domains, such as online ads.

Data mining for instruments

I. Allows us to study a much larger sample of natural experiments, while being able to test for the exclusion restriction directly.

II. Can be used for finding potential instruments.

Thank you!

AMIT SHARMA

MICROSOFT RESEARCH

http://www.amitsharma.in

Sharma, A., Hofman, J. M., & Watts, D. J. (2015). Estimating the causal impact of recommendation systems from observational data. In Proceedings of the Sixteenth ACM Conference on Economics and Computation.

Shock-IV: A robust, scalable method for estimating causal impact from observational data, with testable assumptions.

Naïve observational estimates of CTR for recommendation systems may be big overestimates.
