Consumer Behavior Prediction using Parametric and Nonparametric Methods


Elena Eneva, CALD Masters Presentation

19 August 2002

Advisors: Alan Montgomery, Rich Caruana, Christos Faloutsos

Outline

Introduction
Data
Economics Overview
Baseline Models
New Hybrid Models
Results
Conclusions and Future Work

Background

Retail chains are aiming to customize prices in individual stores

Pricing strategies should adapt to the neighborhood demand

Stores can increase operating profit margins by 33% to 83%

Price Elasticity

the consumer’s response to a price change:

    E = (percent change in Q) / (percent change in P)

where Q is the quantity purchased and P is the price of the product. Demand is inelastic when |E| < 1 and elastic when |E| > 1.
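As a quick numeric sketch of the elasticity definition above (the function name and the numbers are illustrative, not from the talk):

```python
def price_elasticity(q0, q1, p0, p1):
    """Arc price elasticity: percent change in quantity
    divided by percent change in price."""
    return ((q1 - q0) / q0) / ((p1 - p0) / p0)

# Price rises 10% (2.00 -> 2.20) and quantity falls 15% (100 -> 85):
e = price_elasticity(100, 85, 2.00, 2.20)
# e is about -1.5, so |e| > 1 and demand is elastic.
```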

Data Example

[Scatter plot: quantity (0 to 100,000) against price (0.02 to 0.06).]

Data Example – Log Space

[Scatter plot: ln(quant) (2.75 to 5.25) against ln(price) (−1.58 to −1.28).]

Assumptions

Independence – substitutes (fresh fruit, other juices), other stores
Stationarity – change over time, holidays

“The” Model

Inputs: the category’s prices – Price of Product 1, Price of Product 2, Price of Product 3, . . ., Price of Product N

Predictor (“I know your customers”)

Outputs: Quantity bought of Product 1, Quantity bought of Product 2, Quantity bought of Product 3, . . ., Quantity bought of Product N

Need to multiply this across many stores and many categories.

Converting to Original Space

The model is fit in log space (convert to ln space, predict, then convert back to the original space):

    ln(q) = f(ln(p)) + ε,   ε ~ N(0, σ²)

so the log-space prediction and its distribution are

    hat(ln q) = f(ln(p))
    ln(q) | f(ln(p)) ~ N(f(ln(p)), σ²)

Because q is lognormally distributed, the mean prediction in the original space picks up a variance correction:

    hat(q) = E[q] = exp(hat(ln q)) · exp(hat(σ)² / 2)
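The lognormal mean correction can be sketched in a few lines of Python (function name and the numbers are illustrative):

```python
import numpy as np

def to_original_space(ln_q_hat, sigma2_hat):
    """Mean of a lognormal variable: E[q] = exp(mu + sigma^2 / 2).
    Naively exponentiating the log-space prediction underestimates demand."""
    return np.exp(ln_q_hat + sigma2_hat / 2.0)

# With a log-space prediction ln(q) = 10 and residual variance 0.25,
# the corrected estimate exceeds the naive exp(10) by a factor exp(0.125).
naive = np.exp(10.0)
corrected = to_original_space(10.0, 0.25)
```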

Existing Methods

Traditionally – using parametric models (linear regression)

Recently – using non-parametric models (neural networks)

Our Goal

Advantage of LR: known functional form (linear in log space), extrapolation ability

Advantage of NN: flexibility, accuracy

[Diagram: models placed on robustness vs. accuracy axes – LR is robust, NN is accurate, and the new hybrids aim for both.]

Take Advantage: use the known functional form to bias the NN

Build hybrid models from the baseline models

Datasets

weekly store-level cash register data at the product level

Chilled Orange Juice category

2 years, 12 products, 10 randomly selected stores

Evaluation Measure

Root Mean Squared Error (RMS): the average deviation between the predicted quantity and the true quantity:

    RMS error = sqrt( (1/N) Σ_{i=1..N} (hat(q)_i − q_i)² )
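A minimal implementation of this error measure, assuming NumPy (the function name is mine, not from the talk):

```python
import numpy as np

def rms_error(q_pred, q_true):
    """Root mean squared error between predicted and true quantities."""
    q_pred = np.asarray(q_pred, dtype=float)
    q_true = np.asarray(q_true, dtype=float)
    return np.sqrt(np.mean((q_pred - q_true) ** 2))

# rms_error([3, 4], [0, 0]) is sqrt((9 + 16) / 2) = sqrt(12.5)
```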

Models

Hybrids – Smart Prior, MultiTask Learning, Jumping Connections, Frozen Jumping Connections

Baselines – Linear Regression, Neural Networks

Baselines

Linear Regression

Neural Networks

q is the quantity demanded, p_i is the price of the i-th product, and there are K products overall. The coefficients a and b_i are determined by minimizing the sum of squared residuals.

Linear Regression

    ln(q) = a + Σ_{i=1..K} b_i ln(p_i) + ε,   ε ~ N(0, σ²)
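The log-log regression can be sketched with ordinary least squares in NumPy; the coefficients, noise level, and price range below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 3, 500                        # K products, n weekly observations
a_true = 8.0
b_true = np.array([-2.0, 0.5, 0.3])  # own-price coefficient is negative

ln_p = rng.uniform(-1.6, -1.2, size=(n, K))          # log prices
ln_q = a_true + ln_p @ b_true + rng.normal(0, 0.1, size=n)

# Ordinary least squares in log space: ln(q) = a + sum_i b_i ln(p_i)
X = np.column_stack([np.ones(n), ln_p])
coef, *_ = np.linalg.lstsq(X, ln_q, rcond=None)
a_hat, b_hat = coef[0], coef[1:]
```

With enough data the fitted coefficients recover the generating ones up to noise.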

Linear Regression

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Neural Networks

generic nonlinear function approximators

a collection of basic units (neurons), computing a (non)linear function of their input

backpropagation

Neural Networks

1 hidden layer, 100 units, sigmoid activation function
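A forward pass through this baseline architecture might look as follows in NumPy (the weights here are random placeholders; the real network is trained with backpropagation):

```python
import numpy as np

def mlp_forward(ln_p, W1, b1, W2, b2):
    """One hidden layer with sigmoid activation and a linear output:
    the baseline architecture (1 hidden layer, 100 units)."""
    h = 1.0 / (1.0 + np.exp(-(ln_p @ W1 + b1)))   # sigmoid hidden layer
    return h @ W2 + b2                            # ln(q) prediction

rng = np.random.default_rng(0)
K, H = 12, 100                                    # 12 products, 100 hidden units
W1 = rng.normal(0, 0.1, (K, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, 1)); b2 = np.zeros(1)

ln_q_hat = mlp_forward(rng.normal(size=(5, K)), W1, b1, W2, b2)  # 5 observations
```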

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Hybrids

Smart Prior MultiTask Learning Jumping Connections Frozen Jumping Connections

Smart Prior

Idea: start the NN at a “good” set of weights – help it start from a “smart” prior.

Take this prior from the known “linearity”: the NN is first trained on synthetic data generated by the LR model, then trained on the real data.
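The synthetic-data step might be sketched like this (the LR coefficients are placeholders, and the actual NN pretraining step is only indicated in comments):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3
a_hat = 8.0
b_hat = np.array([-2.0, 0.5, 0.3])   # coefficients of a fitted LR model (illustrative)

# Step 1: generate synthetic training data by labelling random price
# vectors with the LR model's (noiseless) predictions.
ln_p_syn = rng.uniform(-1.6, -1.2, size=(2000, K))
ln_q_syn = a_hat + ln_p_syn @ b_hat

# Step 2 (not shown): train the NN on (ln_p_syn, ln_q_syn) until it mimics
# the linear model, then continue training on the real data.
```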

Smart Prior

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Multitask Learning

Idea: learn an additional related task in parallel, using a shared representation.

Add the output of the LR model (built over the same inputs) as an extra output of the NN, and make the net share its hidden nodes between both tasks. This requires a custom halting function and a custom RMS function.
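The shared-representation idea can be sketched as a two-headed forward pass (NumPy, placeholder weights; the auxiliary head is trained against the LR model's output):

```python
import numpy as np

def mtl_forward(x, W1, b1, W2, b2, W_aux, b_aux):
    """Two output heads share one hidden layer: the main head predicts the
    real ln(q), the auxiliary head predicts the LR model's output."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))      # shared sigmoid hidden layer
    return h @ W2 + b2, h @ W_aux + b_aux

rng = np.random.default_rng(0)
K, H = 12, 100
x = rng.normal(size=(4, K))
y_main, y_aux = mtl_forward(x,
                            rng.normal(0, 0.1, (K, H)), np.zeros(H),
                            rng.normal(0, 0.1, (H, 1)), np.zeros(1),
                            rng.normal(0, 0.1, (H, 1)), np.zeros(1))
```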

MultiTask Learning

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Jumping Connections

Idea: fuse LR and NN.

Change the architecture: add connections which “jump” over the hidden layer. This gives the effect of simulating an LR and an NN together.
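A sketch of the jumping-connection forward pass (NumPy, placeholder weights). Note that if the hidden-to-output weights are zero, the network reduces exactly to a linear model:

```python
import numpy as np

def jc_forward(x, W_skip, W1, b1, W2, b2):
    """Jumping connections: a direct linear path from inputs to output is
    added alongside the sigmoid hidden layer, so the net can represent LR
    exactly and only has to learn the nonlinear residual."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))
    return x @ W_skip + h @ W2 + b2

rng = np.random.default_rng(0)
K, H = 12, 100
W_skip = rng.normal(0, 0.1, (K, 1))
W1 = rng.normal(0, 0.1, (K, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, 1)); b2 = np.zeros(1)

x = rng.normal(size=(4, K))
out = jc_forward(x, W_skip, W1, b1, W2, b2)
```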

Jumping Connections

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Frozen Jumping Connections

Idea: you have the linearity – now use it!

Same architecture as Jumping Connections, but with the linearity really emphasized: freeze the weights of the jumping layer, so the network can’t “forget” the linearity.

Frozen Jumping Connections


Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Models

Hybrids – Smart Prior, MultiTask Learning, Jumping Connections, Frozen Jumping Connections

Baselines – Linear Regression, Neural Networks

Combinations – Voting, Weighted Average

Combining Models

Idea: Ensemble Learning

Committee Voting – equal weights for each model’s prediction

Weighted Average – optimal weights determined by a linear regression model

2 baseline and 3 hybrid models (Smart Prior, MultiTask Learning, Frozen Jumping Connections)
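Both combination schemes can be sketched on synthetic predictions (all numbers below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
q_true = rng.uniform(50.0, 150.0, size=200)        # "true" quantities (synthetic)

# Five component models: each predicts the truth plus its own noise level.
noise = np.array([5.0, 8.0, 4.0, 6.0, 7.0])
P = q_true[:, None] + rng.normal(size=(200, 5)) * noise

# Committee voting: equal weights for each model's prediction.
vote = P.mean(axis=1)

# Weighted average: regress the true quantity on the models' predictions.
X = np.column_stack([np.ones(len(q_true)), P])
w, *_ = np.linalg.lstsq(X, q_true, rcond=None)
wav = X @ w
```

In-sample, the regression-weighted combination can do no worse than equal-weight voting, since voting is one of the linear combinations it optimizes over.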

Committee Voting

Average the predictions of the models

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Weighted Average – Model Regression

Linear regression on the baseline and hybrid models’ predictions to determine the vote weights

Results (RMS): [bar chart of RMS error, scale 0–12,000, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]

Normalized RMS Error

Compare model performance across stores. Stores differ in size, age, location, etc., so the errors must be normalized against the baselines: take the error of the LR benchmark as unit error.

Normalized RMS Error: [bar chart, scale 0.75–1.10, with LR = 1.00, for LR, NN, SmPr, MTL, JC, FJC, Vote, WAV]
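The normalization itself is a one-liner; the numbers below are illustrative:

```python
def normalized_rms(rms_model, rms_lr):
    """Express a model's RMS error in units of the LR benchmark's error,
    so stores of different sizes can be compared (LR = 1.0)."""
    return rms_model / rms_lr

# e.g. a hybrid with RMS 7600 in a store where LR scores 9500 -> 0.8,
# i.e. a 20% error reduction relative to the LR benchmark.
```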

Conclusions

Clearly improved models for customer choice prediction

Will allow stores to price the products more strategically and optimize profits

Maintain better inventories. Understand product interactions.

Future Work Ideas

analyze the Weighted Average model
compare the extrapolation ability of the new models
use other domain knowledge – e.g. a shrinkage model: a “super” store model with data pooled across all stores

Acknowledgements

I would like to thank my advisors

and

my CALDling friends and colleagues

The Most Important Slide

for this presentation and the paper:

www.cs.cmu.edu/~eneva/research.htm

eneva@cs.cmu.edu

References

Montgomery, A. (1997). Creating Micro-Marketing Pricing Strategies Using Supermarket Scanner Data.

West, P., Brockett, P. and Golden, L. (1997). A Comparative Analysis of Neural Networks and Statistical Methods for Predicting Consumer Choice.

Guadagni, P. and Little, J. (1983). A Logit Model of Brand Choice Calibrated on Scanner Data.

Rossi, P. and Allenby, G. (1993). A Bayesian Approach to Estimating Household Parameters.
