Promotion Analytics in Consumer Electronics - Module 1: Data

Promotion Analytics in Consumer Electronics and High-tech Industries: Module 1 - Data

Page 1: Promotion Analytics in Consumer Electronics - Module 1: Data

Workshop Overview

• Module 1: Data

• Module 2: Model and Estimation

• Module 3: Sample Output and Empirical Generalization

Page 2: Promotion Analytics in Consumer Electronics - Module 1: Data

Outline

• Ideal Data for Promotion/Pricing Analytics: Scanner Data (in CPG)

• Data and Inference: What Can Go Wrong?

• Challenges and Common Mistakes in Consumer Electronics

• Data Requirement and Potential Data Source

Page 3: Promotion Analytics in Consumer Electronics - Module 1: Data

Scanner Data (Store Audit Data): How Is the Data Collected?

• Syndicated data providers: IRI and A.C. Nielsen

• Sample of stores (Grocery, Drug, Convenience, Mass Merchandiser, Warehouse stores)

• Scanner data

– UPC info (product features), (Retail) price, Quantity (Volume) all recorded

• Features

– Centrally collected and coded (daily)

• Displays

– Collected by store auditors (1X/week)

Page 4: Promotion Analytics in Consumer Electronics - Module 1: Data

4 Data Dimensions

• The Data Cube

– Geography (Market) x Product x Time x Variable (Measure)

– G x P x T x V > 1,000,000 even for one category

• Aggregation (chain/regions, SKU groups, temporal)

Page 5: Promotion Analytics in Consumer Electronics - Module 1: Data

Scanner Data: Advantages

• Completeness

– Linking aggregate sales movements to marketing instruments (price, feature, display, etc.)

– Obtaining a richer set of performance measures beyond market share and factory shipments

• Timeliness

– Getting the data within a window that allows for meaningful managerial action (i.e. less than the old lag time of 8 weeks or more)

• Accuracy

Page 6: Promotion Analytics in Consumer Electronics - Module 1: Data

Scanner Data: Limitations

• Not a complete sampling frame: excluded stores

– Small shops, Walmart!

• Hard to make causal statements without careful modeling: non-random assignment

• No information on consumer behaviors before purchases (e.g. search, consideration) and consumption after purchases

• No information on psychographics

Page 7: Promotion Analytics in Consumer Electronics - Module 1: Data

Promotion Analytics from Scanner Data

• A simplistic picture

[Figure: market share by week, weeks 1 to 8. Share sits at a 5% baseline, peaks at 8% in the promotion week, and dips to 4.8% and 4.5% in the weeks labeled Purchase Deceleration and Purchase Acceleration.]

Net Effect = (8 - 5) - 0.2 - 0.5 = 2.3%
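The net-effect arithmetic in the simplistic picture can be written out as a short sketch. The function name and argument layout are illustrative assumptions; the shares (5% baseline, 8% promotion week, 4.8% and 4.5% dips) are the ones shown on the slide.

```python
def net_promotion_effect(baseline, promo_share, dip_shares):
    """Gross lift in the promotion week minus share lost in the dip weeks.

    All values are market shares in percentage points.
    """
    gross_lift = promo_share - baseline
    lost_in_dips = sum(baseline - share for share in dip_shares)
    return gross_lift - lost_in_dips


# Slide values: 5% baseline, 8% promotion week, dips to 4.8% and 4.5%.
net = net_promotion_effect(baseline=5.0, promo_share=8.0, dip_shares=[4.8, 4.5])
print(round(net, 1))  # share points gained after netting out the dips
```

The point of netting out the dips is that part of the promotion-week spike is only shifted in time, not newly created.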

Page 8: Promotion Analytics in Consumer Electronics - Module 1: Data

Promotions: Actual data

[Figure: weekly market share and price, weeks 5 to 30, with promotion weeks labeled FDC or FD.]

F = Feature, D = Display, C = Store Coupon

Page 9: Promotion Analytics in Consumer Electronics - Module 1: Data

Promotion Types

• (End of Aisle) Display

• Feature

• Price-cut (BOGO)

• Coupon

Page 10: Promotion Analytics in Consumer Electronics - Module 1: Data

1. Size of Data ≠ Information in Data

• Consider the following two options:

(1) Wal-Mart with 4,000 stores, 52 weeks of data, 500 SKUs (104 million observations!)

(2) Best Buy with 1,500 stores, 52 weeks of data, 500 SKUs (39 million observations)

• Which dataset would be more useful to measure price responses?

[Figure: two panels of weekly series for products P1, P2, and P3 over weeks 1 to 52 (vertical axis 0 to 25): Wal-Mart (EDLP) and Best Buy (Hi-Lo).]
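One way to see why the smaller dataset can be more informative is to compare how precisely a price slope could be estimated under each pricing policy. The sketch below is only an illustration: it assumes a simple one-regressor OLS with independent, equal-variance errors, and the price paths (a nearly flat EDLP price, a Hi-Lo price that drops every fourth week) and the noise level are made-up numbers, not data from the slide.

```python
from math import sqrt


def slope_standard_error(prices, noise_sd):
    """Standard error of an OLS price slope: noise_sd / sqrt(sum of squared price deviations)."""
    mean_price = sum(prices) / len(prices)
    squared_deviations = sum((p - mean_price) ** 2 for p in prices)
    return noise_sd / sqrt(squared_deviations)


WEEKS = 52
# Hypothetical price paths for one SKU, repeated across stores.
edlp_prices = [20.0 + (0.05 if w % 2 else -0.05) for w in range(WEEKS)] * 4000
hilo_prices = [20.0 if w % 4 else 15.0 for w in range(WEEKS)] * 1500

se_edlp = slope_standard_error(edlp_prices, noise_sd=10.0)
se_hilo = slope_standard_error(hilo_prices, noise_sd=10.0)

print(len(edlp_prices), len(hilo_prices))  # observations per SKU
print(se_edlp > se_hilo)                   # more rows, yet a noisier slope
```

Under these assumptions the Hi-Lo data gives a much tighter slope estimate even with fewer observations, because precision depends on price variation and not on row count alone.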

Page 11: Promotion Analytics in Consumer Electronics - Module 1: Data

2. Pay Attention to Signal-to-Noise Ratio

• Consider the following measurement. Is there a significant impact from the marketing event?

Revenue    Before Event    After Event    % Change
Average    10              13             30

• Well, it depends on the signal-to-noise ratio!

[Figure: two panels, each titled "Revenue before/after Event", over observations 0 to 120: one with a vertical axis of 0 to 16, the other with a vertical axis of 0 to 120.]
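A rough sketch of why the same 30% change can be convincing or meaningless: the two-sample t statistic divides the mean difference by a standard error that grows with the noise. The means (10 and 13) come from the table; the standard deviations and the 60 observations per period are assumptions chosen for illustration.

```python
from math import sqrt


def t_statistic(mean_before, mean_after, sd, n_per_period):
    """Two-sample t statistic, assuming equal SD and equal sample size in both periods."""
    standard_error = sd * sqrt(2 / n_per_period)
    return (mean_after - mean_before) / standard_error


# Same averages as the table (10 before, 13 after), two hypothetical noise levels.
t_low_noise = t_statistic(10, 13, sd=1, n_per_period=60)
t_high_noise = t_statistic(10, 13, sd=30, n_per_period=60)

print(t_low_noise > 2)   # True: the lift stands out from the noise
print(t_high_noise > 2)  # False: the same lift is lost in the noise
```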

Page 12: Promotion Analytics in Consumer Electronics - Module 1: Data

3. Be Careful about Reverse Causality

• Imagine the following data generating process.

• If you run the analysis ignoring the reverse causality, you may conclude the following.

[Figure: scatter plot "Sales (m,t) vs. Adv (m,t)", with Adv (m,t) from 0 to 6 on the horizontal axis and Sales (m,t) from 0 to 60 on the vertical axis.]

                   Coefficients     Standard Error    P-value
Intercept          -3.357941562     1.121241507       0.004875
Advertising (t)     9.716546286     0.354635984       3.58E-26

R square: 0.95 -> Good fit!

Significant impact of advertising?
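The following simulation is a hypothetical illustration of how reverse causality can produce a result like this one. It is not the data generating process from the slide: it assumes that sales are driven by demand alone and that each market sets its advertising budget at roughly 10% of sales, so advertising has no causal effect on sales by construction.

```python
import random


def ols(x, y):
    """Simple one-regressor OLS: returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot


random.seed(0)
# Assumption: sales depend only on demand; advertising plays no role.
sales = [random.uniform(5, 55) for _ in range(60)]
# Assumption: the budget rule sets advertising at about 10% of sales.
adv = [0.10 * s + random.gauss(0, 0.3) for s in sales]

intercept, slope, r_squared = ols(adv, sales)
print(slope, r_squared)  # large slope and high fit despite zero causal effect
```

A regression of sales on advertising in this setup recovers the inverse of the budget rule, not the effect of advertising.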

Page 13: Promotion Analytics in Consumer Electronics - Module 1: Data

4. Omitted Variables Can Be Dangerous

• Oftentimes, we don’t have data on some important variables that can affect sales, revenue, or profits.

– Doing analytics while ignoring these “omitted variables” can lead to “biased” estimates of marketing mix effects.

• Think about the graph from the NYT. Is family income really responsible for better academic achievement? What would be the potential omitted variable bias here?
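In a marketing setting, the same bias can be sketched with a small constructed dataset. Everything here is an assumption made for illustration: weekly sales equal 100 plus 20 for a promotion plus 50 for a holiday week, and promotions are scheduled mostly in holiday weeks. Holiday is the omitted variable.

```python
from statistics import mean

# (holiday, promo, sales) with sales = 100 + 20 * promo + 50 * holiday.
weeks = (
    [(1, 1, 170)] * 8 + [(1, 0, 150)] * 2   # holiday weeks: mostly promoted
    + [(0, 1, 120)] * 2 + [(0, 0, 100)] * 8  # regular weeks: mostly not promoted
)


def promo_lift(rows):
    """Mean sales in promoted weeks minus mean sales in non-promoted weeks."""
    promoted = [sales for _, promo, sales in rows if promo == 1]
    regular = [sales for _, promo, sales in rows if promo == 0]
    return mean(promoted) - mean(regular)


# Naive estimate ignores the holiday variable.
naive_lift = promo_lift(weeks)

# Controlled estimate compares promoted and non-promoted weeks within each holiday group.
controlled_lift = mean(
    promo_lift([row for row in weeks if row[0] == holiday]) for holiday in (0, 1)
)

print(naive_lift, controlled_lift)
```

The naive comparison credits the promotion with the holiday lift as well, so it overstates the true effect of 20 built into the data.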

Page 14: Promotion Analytics in Consumer Electronics - Module 1: Data

5. Selection by Outcome: Bad Idea!

• Problem: Often, two groups that are defined by an outcome variable are compared to infer the causal impact of the marketing mix.

• Example

– To calculate the ROI of a paid search campaign, advertisers compare the “conversion rates” of each “search” keyword. Usually, branded keywords are shown to have high conversion rates (> 6%) compared to generic keywords (~ 1%).

• How to fix the problem?

– Use a proper “control” condition!

– In the paid search example, all the sales and profit from consumers who click on branded keywords are attributed to paid search. An implicit assumption here is that all of those sales and profits would be lost without paid search. Really?

– It’s possible that consumers who use branded keywords are already quite committed to purchase, and they may simply switch to unpaid (organic) search links if paid search is turned off.

– A proper control in this case is “halting selected search engine marketing keywords”
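The difference between attributed and incremental conversions under such a control can be sketched as follows. All counts are hypothetical; the function name is illustrative. The comparison is between total conversions (paid plus organic) while the branded keyword runs and total conversions while it is halted.

```python
def incremental_conversions(total_with_paid, total_without_paid):
    """Conversions that would actually be lost if paid search were turned off."""
    return total_with_paid - total_without_paid


# Hypothetical weekly conversions for one branded keyword.
paid_conversions_on = 600     # conversions via paid links while the keyword runs
organic_conversions_on = 400  # conversions via organic links in the same weeks
organic_conversions_off = 950 # organic conversions in weeks with the keyword halted

naive_attribution = paid_conversions_on
incremental = incremental_conversions(
    paid_conversions_on + organic_conversions_on, organic_conversions_off
)

print(naive_attribution, incremental)
```

With these made-up numbers most paid conversions move to organic links when the keyword is halted, so the incremental figure is a small fraction of the naive attribution.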

Page 15: Promotion Analytics in Consumer Electronics - Module 1: Data

Key Challenges

• There are no syndicated data providers such as IRI and Nielsen in Consumer Electronics

• Slightly better situation in North America or Europe

– NPD (U.S.) and GfK (Europe) provide market (or retail channel) level unit sales and price data by SKU

– However, they do not provide promotion details

– Even with promotion data, the use of market (or channel) level data can cause aggregation bias (i.e. overestimation of promotion effects)

• You have to assemble multiple datasets on your own

– At least 2 to 3 datasets need to be merged

– SKU-level unit sales data from ERP + external tracking service data (on price and promotion): half-blind (no sales info for competitors)

– Better data access if you are a category captain

– Most painful and time-consuming step: organizational silos
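A minimal sketch of the merge step, assuming both sources can be keyed by SKU and week. The SKU codes, week labels, field names, and values are all made up for illustration. Competitor rows come through with price and promotion but no unit sales, which is the half-blind situation described above.

```python
# Hypothetical ERP extract: own-SKU unit sales keyed by (sku, week).
erp_sales = {
    ("TV-55A", "2023-W01"): 120,
    ("TV-55A", "2023-W02"): 310,
}

# Hypothetical tracking-service extract: price and promotion for own and competitor SKUs.
tracking = {
    ("TV-55A", "2023-W01"): {"price": 499.0, "promo": False},
    ("TV-55A", "2023-W02"): {"price": 449.0, "promo": True},
    ("TV-55X", "2023-W01"): {"price": 479.0, "promo": False},  # competitor SKU
}


def merge_panels(sales, price_promo):
    """Left-join unit sales onto the price/promotion rows; units is None when unknown."""
    rows = []
    for key in sorted(price_promo):
        sku, week = key
        rows.append({"sku": sku, "week": week, "units": sales.get(key), **price_promo[key]})
    return rows


merged = merge_panels(erp_sales, tracking)
rows_without_sales = [row for row in merged if row["units"] is None]
print(len(merged), [row["sku"] for row in rows_without_sales])
```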

Page 16: Promotion Analytics in Consumer Electronics - Module 1: Data

Common Mistakes: For Discussion

• Use factory shipment data instead of retail sales data

– Biased promotion effect estimates due to forward buying from retailers

• Use cross-sectional data to measure price/promotion effects

– Biased price or promotion effect estimates due to omitted variable bias

– Better to use panel data and identify effects from within-store (or within-chain) variation

• Use market (or channel) level data

– Promotion effects are not homogeneous within a market (or channel)

– Due to aggregation bias, promotion effects will be overstated

– Better to use store, account, or chain-level data where promotion activities do not vary across units

• Use data with a short history (1 year or less)

– At least 2 to 3 years of data are required to properly measure seasonality

• Ignore price changes and promotions from competitors

– Biased estimates of baseline sales and price/promotion effects
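The cross-sectional mistake can be illustrated with a tiny constructed panel. The two stores and their numbers are assumptions: within each store, sales fall by 2 units per unit of price, but the store with higher baseline demand also charges higher prices. Comparing across stores then gives the wrong sign, while within-store variation recovers the built-in slope.

```python
from statistics import mean


def slope(x, y):
    """One-regressor OLS slope."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical panel: store -> list of (price, unit sales); sales = base - 2 * price.
panel = {
    "A": [(50, 100), (45, 110)],  # high-demand store, base = 200
    "B": [(30, 60), (25, 70)],    # low-demand store, base = 120
}

# Pooled estimate: ignores which store each observation comes from.
all_prices = [p for obs in panel.values() for p, _ in obs]
all_sales = [s for obs in panel.values() for _, s in obs]
pooled_slope = slope(all_prices, all_sales)

# Within-store estimate: subtract each store's own means before regressing.
demeaned_prices, demeaned_sales = [], []
for obs in panel.values():
    store_price, store_sales = mean(p for p, _ in obs), mean(s for _, s in obs)
    demeaned_prices += [p - store_price for p, _ in obs]
    demeaned_sales += [s - store_sales for _, s in obs]
within_slope = slope(demeaned_prices, demeaned_sales)

print(pooled_slope, within_slope)
```

Store-level baseline demand acts as the omitted variable here, and demeaning by store removes it.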

Page 17: Promotion Analytics in Consumer Electronics - Module 1: Data

Consumer Sales vs. Factory Shipments

[Figure: shipments and retail sales, 1978 to 1982 (vertical axis 20,000 to 100,000), with promotion periods marked.]

Page 18: Promotion Analytics in Consumer Electronics - Module 1: Data

Data Requirement

• Key elements of data

– Unit sales by SKU (outcome): ideally for the entire category (including competitors), but feasible with data for only the focal company’s own SKUs

– Price measures by SKU (causal): focal company + competitors

– Promotion measures by SKU/product line/brand (causal): focal company + competitors

• Duration

– Ideally 3 years (of weekly data); at least 2 years of data

– To properly control for seasonality

• Level of aggregation

– Ideally store-level data; chain or account (chain-market combination) data can be used as long as promotion/price policies are uniform (within chain or account)

– Using market or channel-level data can overstate promotion effects due to aggregation bias

• Type of response data: retail sales data (do not use factory shipment data)

– Due to forward buying from retailers
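These requirements can be turned into a quick screening check before any modeling starts. The sketch below is one possible encoding; the function name, argument names, and category labels are assumptions. The thresholds follow the slide: at least 2 years (104 weeks) of weekly data, store, chain, or account level, and retail sales as the response.

```python
def check_data_requirements(weeks_of_history, aggregation_level, response_type):
    """Return a list of issues; an empty list means the dataset passes this screen."""
    issues = []
    if weeks_of_history < 104:
        issues.append("fewer than 2 years of weekly data: seasonality cannot be controlled")
    if aggregation_level not in {"store", "chain", "account"}:
        issues.append("market or channel-level data: risk of aggregation bias")
    if response_type != "retail_sales":
        issues.append("not retail sales: forward buying can bias promotion effects")
    return issues


# Hypothetical datasets.
weak = check_data_requirements(52, "market", "factory_shipments")
strong = check_data_requirements(156, "store", "retail_sales")
print(len(weak), len(strong))
```

Chain or account data passes this screen, but it still requires that promotion and price policies be uniform within the chain or account, which this check does not verify.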

Page 19: Promotion Analytics in Consumer Electronics - Module 1: Data

Potential Data Source: For Discussion

• Key elements of data

– Unit sales by SKUs (outcome)

– Price measures by SKU (causal)

– Promotion measures by SKU/product line/brand (causal)