Econ 219B Psychology and Economics: Applications
(Lecture 12)
Stefano DellaVigna
April 19, 2017
Outline
1. Methodology: Structural Behavioral Economics
2. Market Reaction to Biases: Behavioral IO
3. Market Reaction to Biases: Behavioral Firms
4. Methodology: Markets and Non-Standard Behavior
5. Market Reaction to Biases: Behavioral Finance
6. Market Reaction to Biases: Corporate Decisions
7. Market Reaction to Biases: Political Economy I
1 Methodology: Structural Behavioral Economics
• Structural estimation in behavioral economics
— Use model for estimation
— Estimate key model parameters
• What are plusses and minuses?
• In preparation for first Handbook of Behavioral Economics
• Comments most welcome (email me!) [Attach Slides]
Overview
What do we mean by structural?
“Estimation of a model on data that recovers parameter estimates (and c.i.s) for some key model parameters”
Bad and good reasons to do structural BE
Two bad reasons to do structural BE:
1. It sells well on the job market and in top journals -> Possibly true, but don’t do it for that reason. There are much better reasons
2. It makes up for poor identification and/or lack of power -> Definitely a bad idea. Need identification AND adequate power. Cautionary tale of NIT experiments (e.g., Card, DellaVigna, and Malmendier JEP 2011)
Overview
Six good reasons to do Structural BE:
1. (Calibration) It builds on, and expands, great behavioral tradition of calibrating models: Are magnitudes right?
2. (Stability) Are key behavioral parameters stable?
3. (Model and Design) It helps better understand models and can lead to better experimental design
4. (Welfare and Policy) It allows for welfare evaluation and policy counterfactuals
5. (Assumptions) It clarifies assumptions made
6. (Not so complex) It can be pretty straightforward
Overview
Two pitfalls to Structural BE:
1. (Overdoing it) Taking the estimated model too seriously, including precision of estimates
2. (Time Cost) It will, generally, take long
Illustrate using small number of papers as examples
Also, some examples on methods:
• Minimum distance
• Non-linear least squares
• Maximum likelihood
• … OLS!
1. Calibration
Importance of calibrating models is lesson ONE from behavioral economics
Example 1: Inertia in retirement savings. 401(k) enrollment goes from 45% under opt-in to 90% under opt-out, i.e., automatic enrollment (Madrian-Shea 2001; Choi, Laibson, Madrian, Metrick 2002)
Standard model can explain qualitative pattern given switching costs k
But magnitudes? Costs would need to be ridiculous (O’Donoghue and Rabin, 1998)
Instead, procrastination plausible for naïve β-δ model even with β very close to 1 (O’Donoghue and Rabin, 1999; 2001)
Example 2: Rabin (EMA 2000) calibration theorem on risk
1. Calibration
OK, Calibration is great. Why do we need estimation?
• (Estimation ≠ Calibration because it provides standard errors)
• Hard to provide back-of-the-envelope calibration for realistic (and complex) models
Return to example 1: Inertia in retirement savings
• O’Donoghue and Rabin (1998-2001) calibrations based on deterministic switching cost k
• But more realistic that switching cost k varies day-to-day (e.g., Carroll et al., 2009; DellaVigna and Malmendier, 2006)
• Need to solve dynamic programming problem
• Changes result on beta calibration for procrastination
1. Calibration
Great example of estimate of inertia: Handel (AER 2013)
• Administrative data on health insurance choice within a company
• Analyze choice only among PPO plans, all by same insurer
• Only difference is premia and co-pay
• Year t: firm introduces new plans and requires active choice
• Year t+1: some plans change, choice by default
1. Calibration
Structural estimation
• Compute value for insurance based on previous risk at t-1
• Models switching cost as cost k to pay when switching (no cost in year t when active choice)
• Maximum likelihood estimation: $2,000!
• Clearly unlikely to capture administrative costs
• More likely captures procrastination or inattention
• (Precise) estimate of $2,000 drives home the point also to non-behavioral economists
1. Calibration
Ref.-dep. job search (DellaVigna, Lindner, Reizer, Schmieder, 2016)
• Fit of ref.-dep. model similar with β model and δ model
• BUT δ model has 15-day δ=.9 (implausible impatience)
• β model instead has β =.6 (in range of other estimates)
• This “calibration” could only be done with full estimation
2. Stability
Behavioral economics has convergence on some models:
• Beta-delta model of time preference (Laibson, 1997; O’Donoghue and Rabin, 1999)
• Reference-dependence model of risk preferences (Kahneman and Tversky, 1979; Koszegi and Rabin, 2006)
• k levels of thinking (Camerer, Ho, and Chong, 2004; Costa-Gomes, Crawford, and Broseta, 2001)
Is there reasonable agreement in parameters across settings?
• Structural estimation of models in different settings
• We start to have that for beta-delta model
• One of earliest examples of Structural BE: Laibson, Repetto, and Tobacman, 2007, now Laibson, Maxted, Repetto, Tobacman, 2016
• Match consumption-savings moments to beta-delta model
2. Stability
Compare to some other estimates
• Paserman (EJ 2008): Estimate beta-delta model of job search decisions for unemployed workers from DellaVigna and Paserman (JOLE) [maximum likelihood]
• Augenblick, Niederle and Sprenger (QJE 2015): Estimate beta-delta from real-effort decision over time
• Augenblick and Rabin (2015): Estimates beta and beta hat from real effort over time [maximum likelihood]
• Augenblick (2016): Estimate beta at different time distances using real effort over time [maximum likelihood]
3. Model and Design
Structural estimation forces one to take the model more seriously
• Sketch of model will not suffice
• Full specification in order to do estimation
• Forces to work out details [Cite O’D. on reference points]
For experiments, important to set up estimation as much as possible before running experiment
• Benefit of model-based experiment (Card, DellaVigna, and Malmendier JEP 2011)
• Can lead to improved design
• Remarkably still rare
3. Model and Design
Field experiments published in top-5 journals from 1975
[Figure: Count of field experiments by theoretical content. Number of experiments by period (75-84, 85-89, 90-94, 95-99, 00-04, 05-10), split into Descriptive, Single Model, Competing Models, and Parameter estim.]
3. Model and Design
Very different from what happens in lab experiments
[Figure: Count of lab experiments by theoretical content. Number of experiments by period (75-84, 85-89, 90-94, 95-99, 00-04, 05-10), split into Descriptive, Single Model, Competing Models, and Parameter estim.]
3. Model and Design
DellaVigna, List, and Malmendier (QJE 2012)
Assume a fund-raiser asks for money. Why did I give?
• Because utility-increasing (altruism/warm glow)?
• Or because felt bad saying no (social pressure)?
Step 1. Idea for field experiment design:
• Do door-to-door field experiment
• Run treatment group with flyer
• Run control group with no flyer
• Estimate effect on % answering and % giving
• Both should go up with altruism, down with social pressure
3. Model and Design
Step 2. Write simple model
• Altruism a for charity
• Social pressure cost S if say no to giving in person
• No cost if not at home
• Individuals can sort in/out at a convex cost c
• (Simplistic model by Doug’s standards)
Insights from writing model:
• New outcome variable: Different effects for small and large donations
• New treatment: If only could estimate sorting cost c, identify altruism and social pressure parameters
Thought experiment: If flyers lower share at home by 10 percent, is that a 3-cent or a $5 gain from sorting?
3. Model and Design
• Observational data: You’d be stuck!
• Field experiments: You can, in fact should, design extra treatments when needed for identification
• Still in design phase, we added survey treatments to estimate elasticity with respect to $ and time
3. Model and Design
Survey experiments that give elasticity
[Figure 6b. Survey (2009 Experiment): share answering the door, completing the survey, and opting out (axis 0% to 60%), by treatment: No Flyer ($0/5min), N = 1421; No Flyer ($5/5min), N = 910; Flyer ($0/10min), N = 769; Flyer ($0/5min), N = 1704; Flyer ($5/5min), N = 1856; Flyer ($10/5min), N = 675; Opt-out ($0/5min), N = 1374; Opt-out ($5/5min), N = 1330]
4. Welfare and Policy
Advantage of estimating model is… you can use it!
• Compute welfare of setting versus counterfactuals
• Estimate effect of potential policies
Return to previous papers:
• DellaVigna, List, and Malmendier (QJE 2012) on charity
• Handel (AER 2013) on health insurance
4. Welfare and Policy: Charity
Welfare effect in DLM: Do fund-raisers raise welfare?
• With only altruism: Yes, of course!
• With social pressure: No, welfare effect can be negative
  - All non-donors pay social pressure cost
  - Only few donors get warm glow
Does that mean that fund-raisers should be limited? No
• Can introduce opt-out option as a win-win solution
Welfare in Standard (No-Flyer) Fund-Raiser
Panel C. Welfare                              La Rabida Charity    ECU Charity
Welfare per Household Contacted (in $)        -1.077 (0.160)       -0.439 (0.286)
Money Raised per Household Contacted           0.722 (0.036)        0.332 (0.046)
Money Raised per Household, Net of Salary      0.247 (0.036)       -0.143 (0.046)
4. Welfare and Policy: Health Insurance
Handel (AER 2013) considers welfare effect of policy that reduces switching costs from k to .25k
• Partial equilibrium: average gain of $100
• General equilibrium
  - Need to take into account effects on pricing
  - Lowering inertia will worsen adverse selection
  - Health insurance firms need to raise price
5. Structural and Complexity
Structural work does require longer time generally:
• Set up a (full) model
• Organize and analyze data
• Estimate model on data, often with lengthy computer runs
BUT structural model can be simple
• Especially if rich data provides necessary variation (sufficient statistic approach)
• OR if data collection / experiment is set up to make estimation simple
5. Structural and Complexity
Lacetera, Pope, and Sydnor (AER 2012): Inattention (left-digit bias) for odometer readings
• Estimate how value of a used car is affected by mileage, and inattention to second digit (e.g., 19,900 miles vs. 20,010 miles)
• Model of impact of odometer reading on value
• Using 22 million (!) used car auction transactions
Consider real-effort experiment (e.g., Gill et al., 2016)
What assumptions are behind such a specification?
• Effort is outcome of maximization decision
• Utility maximization of effort \(e\), piece rate \(p_W\)
• Assume exponential cost function: \(c(e) = \exp(s \cdot e)/s\)
Solution:
\[ \text{FOC:}\quad p_W + A = \exp(s \cdot e^*_{ij}) \cdot \exp(k_i + \text{treated}_i) \cdot \exp(-s \cdot \epsilon_{ij}) \]
\[ \Rightarrow\quad e^*_{ij} = \frac{1}{s}\left[ \log(p_W + A) - k_i - \text{treated}_i \right] + \epsilon_{ij} \]
Can estimate with Non-linear Least Squares, almost like OLS (nl in Stata): Easy!
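A minimal sketch of the exponential-cost solution, to check that the closed-form effort does satisfy the first-order condition. Everything numeric here is hypothetical, the treatment shifter is folded into `k` for brevity, and `A` is read as a non-monetary motivation term, which is an assumption since the slide does not define it.

```python
import math

def marginal_cost(effort, s, k, noise=0.0):
    """Right-hand side of the FOC: exp(s*e) * exp(k) * exp(-s*noise)."""
    return math.exp(s * effort) * math.exp(k) * math.exp(-s * noise)

def optimal_effort(piece_rate, motivation, s, k, noise=0.0):
    """Closed-form effort: e* = (1/s) * [log(p_W + A) - k] + noise."""
    return (math.log(piece_rate + motivation) - k) / s + noise

# Hypothetical parameter values, chosen only for illustration.
S, K, A = 0.02, -5.0, 0.05
effort_low = optimal_effort(piece_rate=0.01, motivation=A, s=S, k=K)
effort_high = optimal_effort(piece_rate=0.10, motivation=A, s=S, k=K)

# At the optimum, marginal cost equals the marginal return p_W + A.
foc_gap = marginal_cost(effort_high, S, K) - (0.10 + A)
```

Effort in levels is linear in `log(p_W + A)` with slope `1/s`, which is why the specification can be estimated with non-linear least squares once `A` is a free parameter.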
6. Assumptions
What about using log effort instead as dependent variable?
1. Power cost function: \(c(e) = \dfrac{e^{1+s}}{1+s}\)
\[ \text{FOC:}\quad p_W + A = (e^*_{ij})^{s} \cdot \exp(k_i + \text{treated}_i) \cdot \exp(-s \cdot \epsilon_{ij}) \]
\[ \Rightarrow\quad \log e^*_{ij} = \frac{1}{s}\left[ \log(p_W + A) - k_i - \text{treated}_i \right] + \epsilon_{ij} \]
(DellaVigna, List, Malmendier, and Rao, 2016)
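Under the power cost function, log effort is linear in `log(p_W + A)` with slope `1/s`, so `1/s` is the elasticity of effort with respect to the marginal return. The sketch below illustrates this with hypothetical numbers; the treatment shifter is again folded into `k`, and the function names are illustrative only.

```python
import math

def optimal_effort_power(piece_rate, motivation, s, k, noise=0.0):
    """Effort under power cost: log e* = (1/s) * [log(p_W + A) - k] + noise."""
    log_effort = (math.log(piece_rate + motivation) - k) / s + noise
    return math.exp(log_effort)

def implied_curvature(effort_a, effort_b, value_a, value_b):
    """Back out s from two (p_W + A, effort) points via the log-log slope 1/s."""
    slope = (math.log(effort_b) - math.log(effort_a)) / (
        math.log(value_b) - math.log(value_a)
    )
    return 1.0 / slope

# Hypothetical parameter values, chosen only for illustration.
S, K, A = 4.0, -20.0, 0.05
e_a = optimal_effort_power(0.05, A, S, K)
e_b = optimal_effort_power(0.15, A, S, K)
s_recovered = implied_curvature(e_a, e_b, 0.05 + A, 0.15 + A)
```

The choice between effort in levels and log effort as the dependent variable is therefore a choice between the exponential and the power cost function.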
Pitfall: Overdoing It
What about pitfalls?
• Assume you fully specify model and estimate it
• You spend years of life doing it
• Model implies welfare and policy implications
• Now you sell the implications for policy and welfare
Warning 1. Estimation and implications are only as good as assumptions going into it
• Test much robustness
Warning 2. Standard errors do not acknowledge model mis-specification
• Point estimates are likely too precise
Pitfall: Overdoing It
How to avoid pitfall: Allcott and Taubinsky (AER 2016)
• Consumers make choices between incandescent and CFL
• One group receives information on energy savings, other not
• Shift in demand curve due to information
• Optimal subsidy: $3
• Ban not optimal
Pitfall: Overdoing It
What if not sure about some key assumptions?
Conclusion Summary: Behavioral economics is normal science
As such, we should see a variety of approaches used in empirical work, including structural estimates
2 Market Reaction to Biases: Behavioral IO
• Start from case of
— Consumers purchasing products have biases
— Firms, unbiased, maximize profits
— Do consumer biases affect profit-maximizing contract design?
— How is consumer welfare affected by firm response?
• Discuss early paper: DellaVigna and Malmendier (QJE 2004). Consumers with \((\beta, \hat\beta, \delta)\) preferences
2.1 Self-Control I
MARKET (I). INVESTMENT GOODS
• Monopoly
• Two-part tariff: \(L\) (lump-sum fee), \(p\) (per-unit price)
• Cost: set-up cost \(K\), per-unit cost \(a\)
Consumption of investment good
Payoffs relative to best alternative activity:
• Cost \(c\) at \(t = 1\) stochastic
— non-monetary cost
— experience good, distribution \(F(c)\)
• Benefit \(b > 0\) at \(t = 2\) deterministic
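FIRM BEHAVIOR. Profit-maximization
\[ \max_{L,\,p}\; \delta\left\{ L - K + F(\beta\delta b - p)\,(p - a) \right\} \]
s.t.
\[ \beta\delta\left\{ -L + \int_{-\infty}^{\hat\beta\delta b - p} (\delta b - p - c)\, dF(c) \right\} \ge \beta\delta\,\bar u \]
• Notice the difference between \(\beta\) and \(\hat\beta\)
• Substitute for \(L\) to maximize
\[ \max_{p}\; \delta\left\{ \int_{-\infty}^{\hat\beta\delta b - p} (\delta b - p - c)\, dF(c) + F(\beta\delta b - p)\,(p - a) - K - \bar u \right\} \]
Solution for the per-unit price \(p^*\):
\[ p^* = \underbrace{a}_{\text{[exponentials]}} \;-\; \underbrace{(1-\hat\beta)\,\delta b\,\frac{f(\hat\beta\delta b - p^*)}{f(\beta\delta b - p^*)}}_{\text{[sophisticates]}} \;-\; \underbrace{\frac{F(\hat\beta\delta b - p^*) - F(\beta\delta b - p^*)}{f(\beta\delta b - p^*)}}_{\text{[naives]}} \]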
Features of the equilibrium
1. Exponential agents (\(\beta = \hat\beta = 1\)). Align incentives of consumers with cost of firm =⇒ marginal cost pricing: \(p^* = a\).
2. Hyperbolic agents. Time inconsistency =⇒ below-marginal cost pricing: \(p^* < a\).
(a) Sophisticates (\(\beta = \hat\beta < 1\)): commitment.
(b) Naives (\(\beta < \hat\beta = 1\)): overestimation of consumption.
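The three cases can be illustrated numerically. The sketch below evaluates the per-unit price formula from DellaVigna and Malmendier (2004), \(p^* = a - (1-\hat\beta)\delta b\, f(\hat\beta\delta b - p^*)/f(\beta\delta b - p^*) - [F(\hat\beta\delta b - p^*) - F(\beta\delta b - p^*)]/f(\beta\delta b - p^*)\), by simple fixed-point iteration. The uniform cost distribution, the parameter values, and the function names are all assumptions made for illustration; the iteration need not converge for other distributions.

```python
def uniform_cdf(x, low=0.0, high=20.0):
    """CDF of a cost c distributed uniformly on [low, high] (assumed)."""
    return min(1.0, max(0.0, (x - low) / (high - low)))

def uniform_pdf(x, low=0.0, high=20.0):
    """Density of the same uniform distribution."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def per_unit_price(a, b, beta, beta_hat, delta, cdf, pdf, iterations=200):
    """Iterate p <- a - sophisticate term - naive term, starting from p = a."""
    p = a
    for _ in range(iterations):
        perceived = beta_hat * delta * b - p   # cutoff the agent expects to use
        actual = beta * delta * b - p          # cutoff the agent really uses
        density = pdf(actual)                  # must be positive (interior cutoff)
        commitment = (1 - beta_hat) * delta * b * pdf(perceived) / density
        overestimation = (cdf(perceived) - cdf(actual)) / density
        p = a - commitment - overestimation
    return p

# Hypothetical values: marginal cost a = 4, delayed benefit b = 10, delta = 1.
p_exponential = per_unit_price(4.0, 10.0, 1.0, 1.0, 1.0, uniform_cdf, uniform_pdf)
p_sophisticate = per_unit_price(4.0, 10.0, 0.7, 0.7, 1.0, uniform_cdf, uniform_pdf)
p_naive = per_unit_price(4.0, 10.0, 0.7, 1.0, 1.0, uniform_cdf, uniform_pdf)
```

With a uniform density the sophisticate and naive prices coincide at \(a - (1-\beta)\delta b\); this is an artifact of the flat density, and the two generally differ for other distributions.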
MARKET (II). LEISURE GOODS
Payoffs of consumption at \(t = 1\):
• Benefit at \(t = 1\) stochastic
• Cost at \(t = 2\) deterministic
=⇒ Use the previous setting: \(-c\) is “current benefit”, \(b < 0\) is “future cost.”
Results:
1. Exponential agents. Marginal cost pricing: \(p^* = a\), \(L^* = K\) (PC).
2. Hyperbolic agents tend to overconsume. =⇒ Above-marginal cost pricing: \(p^* > a\). Initial bonus \(L^* < K\) (PC).
EXTENSIONS
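• Perfect Competition. Can write maximization problem as
\[ \max_{L,\,p}\; -L + \int_{-\infty}^{\hat\beta\delta b - p} (\delta b - p - c)\, dF(c) \]
\[ \text{s.t.}\quad \delta\left\{ L - K + F(\beta\delta b - p)\,(p - a) \right\} = 0 \]
— Implies the same solution for \(p^*\)
• Heterogeneity. Simple case of heterogeneity:
— Share \(\mu\) of fully naive consumers (\(\beta < \hat\beta = 1\))
— Share \(1-\mu\) of exponential consumers (\(\beta = \hat\beta = 1\))
— At \(t = 0\) these consumers pool on same contract, given no immediate payoffs
• Maximization (with Monopoly):
\[ \max_{L,\,p}\; \delta\left\{ L - K + \left[ \mu F(\beta\delta b - p) + (1-\mu) F(\delta b - p) \right](p - a) \right\} \]
\[ \text{s.t.}\quad -L + \int_{-\infty}^{\delta b - p} (\delta b - p - c)\, dF(c) \ge \bar u \]
• Solution:
\[ p^* = a - \mu\,\frac{F(\delta b - p^*) - F(\beta\delta b - p^*)}{\mu f(\beta\delta b - p^*) + (1-\mu) f(\delta b - p^*)} \]
• The higher the fraction of naives \(\mu\), the higher the underpricing of \(p\)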
EMPIRICAL PREDICTIONS
Two predictions for time-inconsistent consumers:
1. Investment goods (Proposition 1):
(a) Below-marginal cost pricing
(b) Initial fee (Perfect Competition)
2. Leisure goods (Corollary 1)
(a) Above-marginal cost pricing
(b) Initial bonus or low initial fee (Perfect Competition)
FIELD EVIDENCE ON CONTRACTS
• US Health club industry ($11.6bn revenue in 2000)
— monthly and annual contracts
— Estimated marginal cost: $3-$6 + congestion cost
— Below-marginal cost pricing despite small transaction costs and price discrimination
• Vacation time-sharing industry ($7.5bn sales in 2000)
— high initial fee: $11,000 (RCI)
— minimal fee per week of holiday: $140 (RCI)
• Credit card industry ($500bn outstanding debt in 1998)
— Resale value of credit card debt: 20% premium (Ausubel, 1991)
— No initial fee, bonus (car / luggage insurance)
— Above-marginal-cost pricing of borrowing
• Gambling industry: Las Vegas hotels and restaurants:
— Price rooms and meals below cost, at bonus
— High price on gambling
WELFARE EFFECTS
Result 1. Self-control problems + Sophistication ⇒ First best
• Consumption if \(c \le \beta\delta b - p^*\)
• Exponential agent:
— \(p^* = a\)
— consume if \(c \le \delta b - p^* = \delta b - a\)
• Sophisticated time-inconsistent agent:
— \(p^* = a - (1-\beta)\delta b\)
— consume if \(c \le \beta\delta b - p^* = \delta b - a\)
• Perfect commitment device
• Market interaction maximizes joint surplus of consumer and firm
Result 2. Self-control + Partial naiveté ⇒ Real effect of time inconsistency
• \(p^* = a - \dfrac{F(\delta b - p^*) - F(\beta\delta b - p^*)}{f(\beta\delta b - p^*)}\)
• Firm sets \(p^*\) so as to accentuate overconfidence
• Two welfare effects:
— Inefficiency: Surplus(naive) ≤ Surplus(soph.)
— Transfer (under monopoly) from consumer to firm
• Profits are increasing in naiveté \(\hat\beta\) (monopoly)
• Welfare(naive) ≤ Welfare(soph.)
• Large welfare effects of non-rational expectations
2.2 Self-Control II
• Eliaz and Spiegler (RES 2006), Contracting with Diversely Naive Agents.
• Extend DellaVigna and Malmendier (2004):
— incorporate heterogeneity in naiveté
— allow more flexible functional form in time inconsistency
— different formulation of naiveté
• Setup:
1. Actions:
— Action \(a \in [0, 1]\) taken at time 2
— At time 1 utility function is \(u(a)\)
— At time 2 utility function is \(v(a)\)
2. Beliefs: At time 1 believe:
— Utility is \(u(a)\) with probability \(\theta\)
— Utility is \(v(a)\) with probability \(1-\theta\)
— Heterogeneity: Distribution of types \(\theta\)
3. Transfers:
— Consumer pays firm \(t(a)\)
— Restrictive assumption: no cost to firm of providing \(a\)
• Therefore:
— Time inconsistency (\(\beta < 1\)): difference between \(u\) and \(v\)
— Naiveté (\(\hat\beta > \beta\)): \(\theta > 0\)
— Partial naiveté here modelled as stochastic rather than deterministic
— Flexibility in capturing time inconsistency (self-control, reference dependence, emotions)
• Proposition 1. There are two types of contracts:
1. Perfect commitment device for sufficiently sophisticated agents (\(\theta < \underline{\theta}\))
2. Exploitative contracts for sufficiently naive agents (\(\theta > \underline{\theta}\))
• Commitment device contract: Implement \(a_\theta = \arg\max_a u(a)\)
— Transfer:
∗ \(t(a_\theta) = \max_a u(a)\)
∗ \(t(a) = \infty\) for other actions
— Result here is like in DM: Implement first best
• Exploitative contract:
— Agent has negative utility: \(u(a^v_\theta) - t(a^v_\theta) < 0\)
— Maximize overestimation of agents: \(a^u_\theta = \arg\max_a \left( u(a) - v(a) \right)\)
2.3 Bounded Rationality
• Gabaix and Laibson (2003), Competition and Consumer Confusion
• Non-standard feature of consumers:
— Limited ability to deal with complex products
— imperfect knowledge of utility from consuming complex goods
• Firms are aware of bounded rationality of consumers -> design products & prices to take advantage of bounded rationality of consumers
Example: Checking account. Value depends on
• interest rates
• fees for dozens of financial services (overdrafts, more than a set number of checks per month, low average balance, etc.)
• bank locations
• bank hours
• ATM locations
• web-based banking services
• linked products (e.g. investment services)
Given such complexity, consumers do not know the exact value of products they buy.
Model
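• Consumers receive noisy, unbiased signals about product value.
— Agent chooses from \(n\) goods.
— True utility from good \(i\): \(Q_i - p_i\)
— Utility signal \(U_i = Q_i - p_i + \sigma_i \varepsilon_i\)
\(\sigma_i\) is complexity of product \(i\); \(\varepsilon_i\) is zero mean, iid across consumers and goods, with density \(f\) and cumulative distribution \(F\). (Suppress the consumer-specific subscript on \(U_i\) and \(\varepsilon_i\).)
• Consumer decision rule: Picks the one good with highest signal \(U_i\) from \((U_i)_{i=1}^{n}\).
Market equilibrium with exogenous complexity. Bertrand competition with
• \(Q_i\): quality of a good, \(\sigma_i\): complexity of a good, \(c_i\): production cost, \(p_i\): price
• Simplification: \(Q_i\), \(\sigma_i\), \(c_i\) identical across firms. (Problem: How should consumers choose if all goods are known to be identical?)
• Firms maximize profit \(\pi_i = (p_i - c_i)\, D_i\)
• Symmetry reduces demand to (all rivals charge \(p\))
\[ D_i = \int f(\varepsilon_i)\, F\!\left( \frac{p - p_i + \sigma \varepsilon_i}{\sigma} \right)^{n-1} d\varepsilon_i \]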
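Example of demand curves
Gaussian noise \(\varepsilon \sim N(0,1)\), 2 firms
Demand curve faced by firm 1:
\[ D_1 = P(-p_1 + \sigma\varepsilon_1 > -p_2 + \sigma\varepsilon_2) \]
\[ = P\!\left( p_2 - p_1 > \sigma\sqrt{2}\,\eta \right) \quad\text{with } \eta = (\varepsilon_2 - \varepsilon_1)/\sqrt{2} \sim N(0,1) \]
\[ = \Phi\!\left( \frac{p_2 - p_1}{\sigma\sqrt{2}} \right) \]
Usual Bertrand case (\(\sigma = 0\)): infinitely elastic demand at \(p_1 = p_2\)
\[ D_1 \in \begin{cases} 1 & \text{if } p_1 < p_2 \\ [0, 1] & \text{if } p_1 = p_2 \\ 0 & \text{if } p_1 > p_2 \end{cases} \]
Complexity case (\(\sigma > 0\)): Smooth demand curve, no infinite drop at \(p_1 = p_2\). At \(p_1 = p_2 = p\) demand is \(1/2\)
\[ \max_{p_1}\; \Phi\!\left( \frac{p_2 - p_1}{\sigma\sqrt{2}} \right) [p_1 - c_1] \]
\[ \text{f.o.c.:}\quad -\frac{1}{\sigma\sqrt{2}}\, \phi\!\left( \frac{p_2 - p_1}{\sigma\sqrt{2}} \right) [p_1 - c_1] + \Phi\!\left( \frac{p_2 - p_1}{\sigma\sqrt{2}} \right) = 0 \]
Intuition for non-zero mark-ups: Lower elasticity increases firm mark-ups and profits. Mark-up proportional to complexity \(\sigma\).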
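Evaluating the first-order condition at a symmetric equilibrium \(p_1 = p_2 = p\) gives \(p - c = \sigma\sqrt{2}\,\Phi(0)/\phi(0) = \sigma\sqrt{\pi}\), which is one way to see that the mark-up is proportional to complexity. The sketch below checks this numerically by iterating best responses in the two-firm Gaussian case; the search bounds, iteration counts, and parameter values are arbitrary choices for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def profit_firm1(p1, p2, cost, sigma):
    """Demand Phi((p2 - p1) / (sigma * sqrt(2))) times the mark-up."""
    return normal_cdf((p2 - p1) / (sigma * math.sqrt(2.0))) * (p1 - cost)

def best_response(p2, cost, sigma):
    """Ternary search for firm 1's profit-maximizing price given p2."""
    lo, hi = cost, cost + 10.0 * sigma
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profit_firm1(m1, p2, cost, sigma) < profit_firm1(m2, p2, cost, sigma):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def symmetric_price(cost, sigma, rounds=100):
    """Iterate best responses until prices settle at the symmetric equilibrium."""
    p = cost + sigma
    for _ in range(rounds):
        p = best_response(p, cost, sigma)
    return p

# Hypothetical cost and complexity levels.
markup_low = symmetric_price(cost=1.0, sigma=0.5) - 1.0
markup_high = symmetric_price(cost=1.0, sigma=1.0) - 1.0
```

Doubling `sigma` doubles the equilibrium mark-up in this example, in line with the closed form \(\sigma\sqrt{\pi}\).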
Endogenous complexity
• Consider Normal case. For \(\sigma \to \infty\)
\[ \max_{p_1}\; \Phi\!\left( \frac{p_2 - p_1}{\sigma\sqrt{2}} \right) [p_1 - c_1] \;\to\; \max_{p_1}\; \frac{1}{2}\,[p_1 - c_1] \]
Set \(\sigma \to \infty\) and obtain infinite profits by letting \(p_1 \to \infty\) (Choices are random, Charge as much as possible)
• Gabaix and Laibson: Concave returns of complexity \(Q(\sigma)\)
Firms increase complexity, unless “clearly superior” products in model with heterogeneous products.
In a nutshell: market does not help to overcome bounded rationality. Competition may not help either
• More work on Behavioral IO:
• Heidhues-Koszegi (2006, 2007)
— Incorporate reference dependence into firm pricing
— Assume reference point rational exp. equilibrium (Koszegi-Rabin)
— Results on
∗ Price compression (consumers hate to pay price higher than reference point)
∗ But also: Stochastic sales
• Gabaix-Laibson (QJE 2006)
— Consumers pay attention to certain attributes, but not others (Shrouded attributes)
— Form of limited attention
— Firms charge higher prices on shrouded attributes (add-ons)
— Similar to result in DellaVigna-Malmendier (2004): Charge more on items consumers do not expect to purchase
• Ellison (2006): Early, concise literature overview
• Future work: Empirical Behavioral IO
— Document non-standard behavior
— Estimate structurally
— Document firm response to non-standard feature
3 Behavioral Firms
• Reasonable to assume that firms respond to consumers’ self-control, naiveté, reference dependence
• But are firms behavioral in maximizing profits?
• ‘Behavioral firms’ is likely key area of future research
— Firms may be very good at maximizing within a particular dimension
— Yet, they may miss another dimension altogether
• Levitt (2004) bagelman story:
— Retired economist delivers bagels to offices in NYC
— Bagelman has to set two variables:
∗ Quantity delivered to each office: do not want excess bagels (stale), nor too few (lost profits)
∗ Price of bagels
— Quantity: bagelman is perfect on average
— Price: bagelman is way off, sets too low price. Price increased twice, both times profits are up
• Is it lack of experimentation?
• Hanna, Mullainathan, Schwartzstein (QJE 2014)
— Examines seaweed farmers in Indonesia
— What do they pay attention to?
— Researchers do experiments varying
∗ Pod size
∗ Pod spacing
— Farmers pay a lot of attention to pod spacing
— Experiments -> Farmers get about optimal choice in pod spacing
— Farmers were instead not paying attention to pod size
— Experiments -> Farmers far from optimum on pod size
— When given feedback farmers change the pod size
— Consistent with Schwartzstein (JEEA) limited attention model
∗ Optimize when pay attention
∗ But completely miss some variables, do not realize they are relevant
• Some other examples:
• Bloom and van Reenen series of papers
— Measure managerial skill with survey of top managers
— Plenty of variation
— Correlates with firm productivity
• Ou (2017): Considers sellers on eBay (small firms)
— Do they choose the right menu of 3-part tariffs?
— Do they sell items in the dynamically optimal way within a month?
— A sizable subset of sellers does not optimize in either dimension
• DellaVigna and Gentzkow (2017): [Slides]
Question
How do chains vary prices in response to local demand?
Relevant for: IO: Firm behavior
Macro: Sources of price rigidity
Behavioral econ: Testing for “behavioral firms”
Data
• Kilts Center Nielsen RMS retail scanner data
• Revenue and units sold for UPC u in week t, for store i
• 53 grocery retailers with at least 10 stores each
• 8,375 stores that meet our criteria
Data
Data extraction criteria
• Select 12 high-revenue modules (product category):
  - Canned Soup
  - Soft Drinks (carbonated)
  - Cookies
  - Toilet Paper
  - Cat Food (wet type)
  - Bleach
  - Candy (chocolate)
  - Paper Towels
  - Coffee (ground/whole bean)
  - Toothpaste
  - Orange Juice (refrigerated)
  - Yogurt
• Within each (module, year), select UPC with top coverage across all chains
  - Reese’s Peanut Butter Cups for “Candy (chocolate)”
  - Campbell’s Cream of Mushroom Soup for “Canned Soup”
Challenge: Show pricing across stores, over time, for different products
This paper: Color-coded plots of price
• Plot ln(price in store i, week t) – ln(average yearly price across chains)
• Example: 0.1 indicates price 10% higher than in avg. store
• Darker colors are higher price
• Blank if no price
• Each row is a store i, stores sorted by measure of store-level income, 250 stores in a chain
• Each column is a week t
Motivating Facts: Visualizing Prices
Motivating Facts
• Majority of chains display largely uniform pricing (e.g., chain 79)
  - Plenty of price variation over time (sales)
  - Almost no variation across stores
• Same chain, multiple UPCs, same 50 stores for each UPC
• Another example of largely uniform pricing
• Small number of other chains: Separate by geography. Rigid pricing within a region: Chain 9
• Separate by geography. Links indicate similar pricing
How similar are prices across stores in same chain?
Two main measures of price similarity b/w stores 1 and 2:
1. (Low-frequency) Absolute difference in log quarterly prices (quarterly prices = unweighted average weekly logP in each quarter)
2. (High-frequency) Weekly correlation in demeaned log prices (demeaning at store-year-UPC level)
Additional measure: Share of identical prices (up to 1 percent difference in prices)
u denotes a UPC, t denotes a non-missing week
Measures computed by UPC, then averaged across 12 UPCs (except for correlation, computed with all modules together)
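The three similarity measures can be sketched for one UPC, one year, and one pair of stores as follows. The function names, the use of `None` for missing weeks, the 13-week quarter, and the toy prices are assumptions for illustration, not the paper's actual code.

```python
import math

def paired_log_prices(prices_a, prices_b):
    """(week, log price a, log price b) for weeks observed in both stores."""
    return [(t, math.log(a), math.log(b))
            for t, (a, b) in enumerate(zip(prices_a, prices_b))
            if a is not None and b is not None]

def quarterly_abs_log_diff(prices_a, prices_b, weeks_per_quarter=13):
    """Low-frequency measure: mean absolute gap in quarterly average log price."""
    by_quarter = {}
    for t, la, lb in paired_log_prices(prices_a, prices_b):
        by_quarter.setdefault(t // weeks_per_quarter, []).append((la, lb))
    gaps = []
    for rows in by_quarter.values():
        mean_a = sum(r[0] for r in rows) / len(rows)
        mean_b = sum(r[1] for r in rows) / len(rows)
        gaps.append(abs(mean_a - mean_b))
    return sum(gaps) / len(gaps)

def weekly_correlation(prices_a, prices_b):
    """High-frequency measure: correlation of demeaned weekly log prices."""
    rows = paired_log_prices(prices_a, prices_b)
    mean_a = sum(r[1] for r in rows) / len(rows)
    mean_b = sum(r[2] for r in rows) / len(rows)
    cov = sum((r[1] - mean_a) * (r[2] - mean_b) for r in rows)
    var_a = sum((r[1] - mean_a) ** 2 for r in rows)
    var_b = sum((r[2] - mean_b) ** 2 for r in rows)
    return cov / math.sqrt(var_a * var_b)

def share_identical(prices_a, prices_b, tolerance=0.01):
    """Share of weeks with log prices within 1 percent of each other."""
    rows = paired_log_prices(prices_a, prices_b)
    return sum(1 for _, la, lb in rows if abs(la - lb) <= tolerance) / len(rows)

# Toy example: store 2 is always 10% more expensive and follows the same sales.
store_1 = [2.0, 2.0, 1.5, 2.0, None, 2.0]
store_2 = [2.2, 2.2, 1.65, 2.2, 2.2, 2.2]
gap = quarterly_abs_log_diff(store_1, store_2, weeks_per_quarter=3)
corr = weekly_correlation(store_1, store_2)
identical = share_identical(store_1, store_2)
```

In the toy example the two stores move together perfectly over time, so the weekly correlation is high even though the price levels differ. This is why the low-frequency and high-frequency measures capture different aspects of similarity.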
Motivating Facts
• 346k within-chain pairs
• High similarity of prices along each of 3 measures
• BUT maybe prices are similar also across chains
Motivating Facts
• 1.5m between-chain pairs
• Much larger difference in pricing along each of 3 measures
Motivating Facts: Within vs. Between Chains
Stronger test of rigid pricing:
• Compare stores across DMAs (not in same advertising area)
• Compare stores with different income (top-third-of-income store & bottom-third-of-income store)
Similar pattern of rigidity
Motivating Facts: Price difference decomposition
• How does it differ across chains?
• Plot two main measures, average at chain level
Model
Model
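Model: Elasticity
• Est. elast. \(\eta_i\) by store \(i\): \(\log Q_{i,u,t} = \alpha_i + \eta_i \log P_{i,u,t} + \gamma_i X_{i,u,t} + \varepsilon_{i,u,t}\)
• \(X\) is a vector of UPC × year and UPC × week-of-year dummies
• Estimated elasticities \(\hat\eta_i\) measured with error
• Empirical shrinkage procedure:
  - Divide sample into 1st half and 2nd half of each year
  - Estimate for each store \(i\): \(\hat\eta_i^{(1)}\) and \(\hat\eta_i^{(2)}\)
  - Choose \(\lambda\) to \(\min \sum_i \left( (1-\lambda)\,\hat\eta_i^{(1)} + \lambda\,\bar\eta - \hat\eta_i^{(2)} \right)^2\): \(\lambda = .1239\)
  - Apply to elasticity on entire sample: \(\tilde\eta_i = (1-\lambda)\,\hat\eta_i + \lambda\,\bar\eta\)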
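A minimal sketch of a split-sample shrinkage of this kind, assuming the weight \(\lambda\) is chosen so that first-half estimates shrunk toward the mean best predict second-half estimates, i.e., it minimizes \(\sum_i ((1-\lambda)\hat\eta_i^{(1)} + \lambda\bar\eta - \hat\eta_i^{(2)})^2\). Taking \(\bar\eta\) as the mean of the first-half estimates, the closed-form least-squares weight, and the toy numbers are all assumptions made for illustration.

```python
def shrinkage_weight(first_half, second_half):
    """Least-squares weight: sum((e1 - e2) * (e1 - mean)) / sum((e1 - mean)^2)."""
    mean = sum(first_half) / len(first_half)
    numerator = sum((e1 - e2) * (e1 - mean)
                    for e1, e2 in zip(first_half, second_half))
    denominator = sum((e1 - mean) ** 2 for e1 in first_half)
    return numerator / denominator

def shrink(estimates, weight):
    """Pull each full-sample estimate toward the cross-store mean."""
    mean = sum(estimates) / len(estimates)
    return [(1.0 - weight) * e + weight * mean for e in estimates]

# Toy store-level elasticities (hypothetical): noisy first-half estimates
# overstate the dispersion relative to second-half estimates.
eta_first = [-2.0, -3.0, -4.0]
eta_second = [-2.5, -3.0, -3.5]
weight = shrinkage_weight(eta_first, eta_second)

eta_full = [-2.2, -3.0, -3.8]
eta_shrunk = shrink(eta_full, weight)
```

The weight is larger when the two half-sample estimates disagree more, so noisier elasticities are pulled more strongly toward the mean.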
Model: Elasticity
Validation of elasticities I: Estimated with precision
• Distribution of standard error of elasticity \(\hat\eta_i\)
Model: Elasticity
Validation of elasticities II: linearity of logQ on logP
• Partial out X from logQ and logP, binscatter of residuals
Model: Elasticity
Validation of elasticities III: relate to store-level income measure
Evidence on Model: Price v. Elasticity
Within chain relationship (demeaning by chain) of:
• Average log price for store i (averaged across weeks & UPCs)
• Elasticity for store i
1. Clear statistical relationship
2. Small coefficient economically
Does it differ depending on price centralization?
Evidence on Model: Price v. Elasticity
• Split chains by price correlation measure
• Slope mostly due to chains with less rigid pricing
Evidence on Model: Price v. Elasticity
Return to model: Log price vs. price elasticity
• Within-chain, compare to level predicted by model (assume constant marginal cost across stores within chain)
• Slope much flatter than in model
Evidence on Model: Price v. Elasticity
What about between chain? (No attention rigidity)
• Compute average price and average elasticity across all stores in a chain (assume equal marginal cost in all chains)
• Slope noisy, but comes closer to slope in model
Evidence on Model: Price v. Elasticity
• Empirical response to elasticity much lower than model predicted response
• Especially true for within-chain response
                (1)              (2)               (3)
VARIABLES       OLS Within       OLS Between       Model
                Chain Price      Chain Price       Median Price
Elasticity      0.0126***        0.0266*           0.1916
                (0.00361)        (0.0140)
Constant                         0.0715*
                                 (0.0420)
Observations    7,824            53
R-squared       0.061            0.052
Chain FE        Y
*** p<0.01, ** p<0.05, * p<0.1
Evidence on Model: Possible Explanations
• Wrong model / estimates
  - Elasticity measure not correct (e.g., substitution or short-run vs. long-run)
  - Instrument for elasticity
  - Richer competitive interactions (not today)
• Managerial Costs
  - Fixed costs of price setting
  - Managerial inertia or attention
• Other
  - Fairness constraints
Evidence on Model: Price v. Elasticity
Instrument elasticity with local demographics: income
• Match to Nielsen Homescan dataset of consumers
• Compute income per capita of (5-digit) ZIP code of residence for all consumers shopping in store i
• Weighted average of income measure for consumers shopping at store i, weighted by trips taken to store i
• Instrument likely biases price-elasticity slope upward if also (pos.) correlated with marginal cost (e.g., land cost)
Evidence on Model: Price v. Elasticity
First stage: within and between chain
We pool the within and between variation in the first stage:
                      (1)
VARIABLES             Shrunk Elasticity
Income ($10,000s)     0.160***
                      (0.0271)
Constant              -3.278***
                      (0.126)
Observations          7,824
R-squared             0.124
Standard errors clustered by parent_code
*** p<0.01, ** p<0.05, * p<0.1
Evidence on Model: Price v. Elasticity
Reduced form: price on income
• Very flat relationship within chain
• Steeper relationship between chains
Evidence on Model: Price v. Elasticity
IV results:
• Within-chain slope is still one order of magnitude flatter
• Between-chain slope is as predicted by model
                (1)            (2)            (3)            (4)            (5)
VARIABLES       OLS Within     OLS Between    Model          IV Within      IV Between
                Chain Price    Chain Price    Median Price   Chain Price    Chain Price
Elasticity      0.0126***      0.0266*        0.1916         0.0318**       0.257***
                (0.00361)      (0.0140)                      (0.0126)       (0.0854)
Constant                       0.0715*                                      0.716***
                               (0.0420)                                     (0.233)
Observations    7,824          53                            7,824          7,824
R-squared       0.061          0.052                         0.099          0.274
Chain FE        Y                                            Y
*** p<0.01, ** p<0.05, * p<0.1
Evidence on Model: Managerial Costs
Managerial costs (“behavioral firms”, e.g., Bloom and Van Reenen):
• Managers find it too costly/too hard to solve pricing problem across stores in a chain
• Almost uniform pricing within a chain
• Approximate correct pricing between chains
Puzzle: Why positive (if flat) slope also within chain?
• Not predicted with fixed cost of inattention
How large would the managerial costs be?
Evidence on Model: Managerial Costs
• Using elasticities, compute profit losses from centralized pricing
• Loss in profit is function of dispersion of elasticities
• About 1% assuming no fixed costs (understates loss)
• Do chains with higher losses respond more?
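A rough sketch of how a profit loss from uniform pricing can be computed from store elasticities. It assumes constant-elasticity demand \(Q = P^{\eta}\) with \(\eta < -1\) and a common marginal cost, so each store's optimal price is \(c\,\eta/(1+\eta)\). These functional forms, the grid search, and the two elasticities are assumptions for illustration and are not taken from the slides.

```python
def optimal_price(cost, elasticity):
    """Monopoly price under constant-elasticity demand (elasticity < -1)."""
    return cost * elasticity / (1.0 + elasticity)

def profit(price, cost, elasticity):
    """Per-store profit with demand Q = price ** elasticity."""
    return (price - cost) * price ** elasticity

def uniform_pricing_loss(cost, elasticities, grid_points=10001):
    """Share of profit lost when one price must be charged in every store."""
    store_prices = [optimal_price(cost, e) for e in elasticities]
    flexible = sum(profit(p, cost, e) for p, e in zip(store_prices, elasticities))
    lo, hi = min(store_prices), max(store_prices)
    # The best uniform price lies between the lowest and highest store optimum.
    best_uniform = max(
        sum(profit(lo + (hi - lo) * i / (grid_points - 1), cost, e)
            for e in elasticities)
        for i in range(grid_points)
    )
    return 1.0 - best_uniform / flexible

# Hypothetical chain with two stores facing different elasticities.
loss_dispersed = uniform_pricing_loss(cost=1.0, elasticities=[-2.0, -3.0])
loss_identical = uniform_pricing_loss(cost=1.0, elasticities=[-2.5, -2.5])
```

The loss is zero when all stores share the same elasticity and grows with the dispersion of elasticities, which is the sense in which the loss is a function of dispersion.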
Conclusion
Retail firms respond little to local demand
• Robust to product choice, chain, and sector
• Some response to price elasticity, but small magnitude
• Thus, they appear to forgo profit opportunities
Explanations?
• Could be managerial attention cost
  - BUT need to explain positive slope
• Could be fairness constraint
  - BUT is it plausible in US grocery sector?
In any case, important fact to contend with
4 Methodology: Markets and Non-Standard Behavior
• Why don’t market forces eliminate non-standard behavior?
• Common Chicago-type objection
• Argument 1. Experience reduces non-standard behavior.
— Experience appears to mitigate the endowment effect (List, 2003 and 2004).
— Experience improves ability to perform backward induction (Palacios-Huerta and Volij, 2007 and 2008)
— BUT: Maybe experience does not really help (Levitt, List, and Reiley, 2008)
— What does experience imply in general?
∗ Feedback is often infrequent (such as in house purchases) or noisy (such as in financial investments), so there is not enough room for experience
∗ Experience can exacerbate a bias if individuals are not Bayesian learners (Haigh and List 2004)
∗ Not all non-standard features should be mitigated by experience. Example: social preferences
∗ Debiasing by experienced agents can be a substitute for direct experience. However, as Gabaix and Laibson (2006) show, experienced agents such as firms typically have little or no incentive to debias individuals
• Curse of Debiasing (Gabaix-Laibson 2006)— Credit Card A teaser fees on $1000 balance:
∗ $0 for six months∗ $100 fee for next six months
— Cost of borrowing to company $100 — Firm makes 0 profit in PerfectlyCompetitive market
— Naive consumer:
∗ Believes no borrowing after 6 months
∗ Instead keeps borrowing
∗ Expects cost of card to be $0, instead pays $100
• Can Credit Card B debias consumers and profit from it?
— Advertisement to consumers: ‘You will borrow after 6 months!’
— Offer rate of
∗ $50 for six months
∗ $50 for next six months
• What do consumers (now sophisticated) do?
— Stay with Card A
∗ Borrow for 6 months at $0
∗ Then switch to another company
• No debiasing in equilibrium
• System of transfers:
— Firms take advantage of naive consumers
— Sophisticated consumers benefit from naive consumers
• Related: Suppose Credit Card B can identify naive consumer
— What should it do?
— If debias, then lose consumer
— Rather, take advantage of consumer
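The arithmetic of the two cards can be laid out as a short sketch. The fees are the ones in the example above. The assumption that a consumer who switches after six months pays a $0 teaser fee on the new card is added here for illustration, and the function names are hypothetical.

```python
# Fees per six-month period on a $1000 balance, from the example above.
CARDS = {"A": (0, 100), "B": (50, 50)}

def cost(card, borrows_late, switches_after_teaser=False):
    """Total fees over the year for a given card and borrowing plan.

    switches_after_teaser: the consumer moves to a new teaser card after six
    months and is assumed (illustratively) to pay a $0 fee there.
    """
    early, late = CARDS[card]
    if not borrows_late:
        return early
    return early + (0 if switches_after_teaser else late)

def naive_choice():
    """Naive consumer picks the card that looks cheapest if borrowing stops."""
    perceived = {c: cost(c, borrows_late=False) for c in CARDS}
    chosen = min(perceived, key=perceived.get)
    actual = cost(chosen, borrows_late=True)  # in fact keeps borrowing
    return chosen, perceived[chosen], actual

def sophisticated_choice():
    """Debiased consumer knows borrowing continues and may switch cards."""
    plans = {(c, s): cost(c, True, s) for c in CARDS for s in (False, True)}
    best = min(plans, key=plans.get)
    return best, plans[best]
```

Under these assumptions the naive consumer takes Card A expecting $0 and pays $100, while the debiased consumer also takes Card A and then switches, so Card B gains no customers from its advertisement.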
• Argument 2. Even if experience or debiasing do not eliminate the biases, the biases will not affect aggregate market outcomes
— Arbitrage: rational investors set prices
— However, there are limits to arbitrage (DeLong et al., 1991): individuals with non-standard features affect stock prices
— In addition, in most settings, there is no arbitrage!
∗ Example: Procrastination of savings for retirement
∗ (Keep in mind SMarT plan though)
— Behavioral IO: Non-standard features can have a disproportionate impact on market outcomes
∗ Firms focus pricing on the biases
∗ Lee and Malmendier (AER 2011) on overbidding in eBay auctions
• Bidders with bias have disproportionate impact
• Opposite of Chicago intuition
5 Market Reaction to Biases: Behavioral Finance
• How do ‘smart’ investors respond to investors with biases?
• First, brief overview of anomalies in Asset Pricing (from Barberis and Thaler, 2004)
1. Underdiversification.
(a) Too few companies.
— Investors hold an average of 4-6 stocks in portfolio.
— Improvement with mutual funds
(b) Too few countries.
— Investors heavily invested in own country.
— Own country equity: 94% (US), 98% (Japan), 82% (UK)
— Own area: own local Bells (Huberman, 2001)
(c) Own company
— In companies offering own stock in 401(k) plan, substantial investment in employer stock
2. Naive diversification.
— Investors tend to distribute wealth ‘equally’ among alternatives in 401(k) plan (Benartzi and Thaler, 2001; Huberman and Jiang, 2005)
3. Excessive Trading.
— Trade too much given transaction costs (Odean, 2001)
4. Disposition Effect in selling
— Investors more likely to sell winners than losers
5. Attention Effects in buying
— Stocks with extreme price or volume movements attract attention (Odean, 2003)
6. Inattention to Fees
• Should market forces and arbitrage eliminate these phenomena?
• Arbitrage:
— Individuals attempt to maximize individual wealth
— They take advantage of opportunities for free lunches
• Implications of arbitrage: ‘Strange’ preferences do not affect pricing
• Implication: For prices of assets, no need to worry about behavioral stories
• Is it true?
• Fictitious example:
— Asset A returns $1 tomorrow with probability \(p = .5\)
— Asset B returns $1 tomorrow with probability \(p = .5\)
— Arbitrage implies that the price of A has to equal the price of B
— If \(p_A > p_B\),
∗ sell A and buy B
∗ keep selling and buying until \(p_A = p_B\)
— Vice versa if \(p_A < p_B\)
• Problem: Arbitrage is limited (de Long et al., 1991; Shleifer, 2001)
• In Example: can buy/sell A or B and tomorrow get fundamental value
• In Real world: prices can diverge from fundamental value
• Real world example. Royal Dutch and Shell
— Companies merged financially in 1907
— Royal Dutch shares: claim to 60% of total cash flow
— Shell shares: claim to 40% of total cash flow
— Shares are nothing but claims to cash flow
— Price of Royal Dutch should be 60/40 = 3/2 times the price of Shell
• The price ratio \(p_{RD}/p_S\) differs substantially from 1.5 (Fig. 1)
• Plenty of other examples (Palm/3Com)
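As a minimal sketch of how a deviation from the 60/40 parity could be measured, the function below compares an observed price ratio with the theoretical one. The function name and the sample prices in the usage note are hypothetical, not data from the figure.

```python
def parity_deviation(price_royal_dutch, price_shell, cash_flow_split=(0.6, 0.4)):
    """Relative deviation of the observed price ratio from the cash-flow parity.

    Returns 0.0 when Royal Dutch trades at exactly 60/40 = 1.5 times Shell,
    and e.g. 0.10 when Royal Dutch is 10 percent too expensive relative to Shell.
    """
    theoretical_ratio = cash_flow_split[0] / cash_flow_split[1]
    observed_ratio = price_royal_dutch / price_shell
    return observed_ratio / theoretical_ratio - 1.0
```

For example, with made-up prices of 165 and 100 the function reports a deviation of about 10 percent, while 150 and 100 are at parity.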
• What is the problem?
— Noise trader risk: investors with correlated valuations that diverge from fundamental value
— (Example: Naive investors keep persistently bidding down the price of Shell)
— In the long run, convergence to cash-flow value
— In the short run, divergence can even increase
— (Example: Price of Shell may be bid down even more)
• Noise Traders
• DeLong, Shleifer, Summers, Waldmann (JPE 1990)
• Shleifer, Inefficient Markets, 2000
• Fundamental question: What happens to prices if:
— (Limited) arbitrage
— Some irrational investors with correlated (wrong) beliefs
• First paper on Market Reaction to Biases
• The key paper in Behavioral Finance
The model assumptions
A1: arbitrageurs are risk averse and have a short horizon
→ Justification?
* Short-selling constraints (per-period fee if borrowing cash/securities)
* Evaluation of fund managers
* Principal-agent problem for fund managers
A2: noise traders (Kyle 1985; Black 1986) misperceive the future expected price at \(t\) by \(\rho_t \sim N(\rho^*, \sigma_\rho^2)\)
misperception correlated across noise traders (\(\rho^* \neq 0\))
→ Justification?
* fads and bubbles (Internet stocks, biotechs)
* pseudo-signals (advice of broker, financial guru)
* behavioral biases / misperception of riskiness
What else?
• \(\mu\) noise traders, \((1-\mu)\) arbitrageurs
• OLG model
— Period 1: initial endowment, trade
— Period 2: consumption
• Two assets with identical dividend \(r\)
— safe asset: perfectly elastic supply ⟹ price = 1 (numeraire)
— unsafe asset: inelastic supply (1 unit) ⟹ price?
• Demand for unsafe asset: \(\lambda_t^a\) and \(\lambda_t^n\) with \(\mu\lambda_t^n + (1-\mu)\lambda_t^a = 1\)
• CARA: \(U(w) = -e^{-2\gamma w}\) (\(w\) wealth when old)
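With \(w \sim N(\bar w, \sigma_w^2)\):
\[
\begin{aligned}
E[U(w)] &= \int_{-\infty}^{\infty} -e^{-2\gamma w}\,\frac{1}{\sqrt{2\pi\sigma_w^2}}\,e^{-\frac{(w-\bar w)^2}{2\sigma_w^2}}\,dw \\
&= -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma_w^2}}\,e^{-\frac{4\gamma\sigma_w^2 w + w^2 + \bar w^2 - 2w\bar w}{2\sigma_w^2}}\,dw \\
&= -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma_w^2}}\,e^{-\frac{(w-[\bar w - 2\gamma\sigma_w^2])^2 + 4\gamma\sigma_w^2\bar w - 4\gamma^2\sigma_w^4}{2\sigma_w^2}}\,dw \\
&= -e^{\frac{4\gamma^2\sigma_w^4 - 4\gamma\sigma_w^2\bar w}{2\sigma_w^2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma_w^2}}\,e^{-\frac{(w-[\bar w - 2\gamma\sigma_w^2])^2}{2\sigma_w^2}}\,dw \\
&= -e^{2\gamma^2\sigma_w^2 - 2\gamma\bar w} = -e^{-2\gamma(\bar w - \gamma\sigma_w^2)}
\end{aligned}
\]
Hence \(\max E[U(w)]\) is equivalent (by a positive monotonic transformation) to \(\max\ \bar w - \gamma\sigma_w^2\)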
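The closed form above can be checked numerically. The sketch below integrates the CARA utility against a normal density on a fine grid and compares it with \(-e^{-2\gamma(\bar w - \gamma\sigma_w^2)}\). The parameter values in the usage note are arbitrary illustrations and the function names are hypothetical.

```python
import numpy as np

def expected_cara_closed_form(mean_w, sd_w, gamma):
    """E[-exp(-2*gamma*w)] for w ~ N(mean_w, sd_w**2), using the formula above."""
    return -np.exp(-2.0 * gamma * (mean_w - gamma * sd_w ** 2))

def expected_cara_numeric(mean_w, sd_w, gamma, n=200001, width=12.0):
    """Same expectation by a Riemann sum over mean_w +/- width standard deviations."""
    w = np.linspace(mean_w - width * sd_w, mean_w + width * sd_w, n)
    density = np.exp(-(w - mean_w) ** 2 / (2 * sd_w ** 2)) / np.sqrt(2 * np.pi * sd_w ** 2)
    integrand = -np.exp(-2.0 * gamma * w) * density
    step = w[1] - w[0]
    return float(np.sum(integrand) * step)
```

For instance, with `mean_w=1.0`, `sd_w=0.5`, `gamma=0.8` the two functions agree to many decimal places, which supports ranking portfolios by \(\bar w - \gamma\sigma_w^2\).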
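Arbitrageurs:
\[ \max_{\lambda_t^a}\ (w_t - \lambda_t^a p_t)(1+r) + \lambda_t^a\left(E_t[p_{t+1}] + r\right) - \gamma(\lambda_t^a)^2\,Var_t(p_{t+1}) \]
Noise traders:
\[ \max_{\lambda_t^n}\ (w_t - \lambda_t^n p_t)(1+r) + \lambda_t^n\left(E_t[p_{t+1}] + \rho_t + r\right) - \gamma(\lambda_t^n)^2\,Var_t(p_{t+1}) \]
(Note: Noise traders know how to factor the effect of future price volatility into their calculations of values.)
f.o.c.
Arbitrageurs: \(\partial E[U]/\partial\lambda_t^a = 0\)
\[ \lambda_t^a = \frac{r + E_t[p_{t+1}] - (1+r)p_t}{2\gamma\,Var_t(p_{t+1})} \]
Noise traders: \(\partial E[U]/\partial\lambda_t^n = 0\)
\[ \lambda_t^n = \frac{r + E_t[p_{t+1}] - (1+r)p_t}{2\gamma\,Var_t(p_{t+1})} + \frac{\rho_t}{2\gamma\,Var_t(p_{t+1})} \]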
Interpretation
• Demand for unsafe asset function of:
— (+) expected return (\(r + E_t[p_{t+1}] - (1+r)p_t\))
— (-) risk aversion (\(\gamma\))
— (-) variance of return (\(Var_t(p_{t+1})\))
— (+) overestimation of return \(\rho_t\) (noise traders)
• Notice: noise traders hold more of the risky asset than arbitrageurs if \(\rho_t > 0\) (and vice versa)
• Notice: Variance of prices comes from noise trader risk. “Price when old” depends on the uncertain belief of next period’s noise traders.
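• Impose general equilibrium: \(\mu\lambda_t^n + (1-\mu)\lambda_t^a = 1\) to obtain
\[ 1 = \frac{r + E_t[p_{t+1}] - (1+r)p_t}{2\gamma\,Var_t(p_{t+1})} + \frac{\mu\rho_t}{2\gamma\,Var_t(p_{t+1})} \]
or
\[ p_t = \frac{1}{1+r}\left[ r + E_t[p_{t+1}] - 2\gamma\,Var_t(p_{t+1}) + \mu\rho_t \right] \]
• To solve for \(p_t\) we need to solve for \(E_t[p_{t+1}] = E[p]\) and \(Var_t(p_{t+1})\)
\[ E[p] = \frac{1}{1+r}\left[ r + E[p] - 2\gamma\,Var_t(p_{t+1}) + \mu E[\rho_t] \right] \]
\[ E[p] = 1 + \frac{-2\gamma\,Var_t(p_{t+1}) + \mu\rho^*}{r} \]
— Rewrite plugging in
\[ p_t = 1 - \frac{2\gamma\,Var_t(p_{t+1})}{r} + \frac{\mu\rho^*}{r(1+r)} + \frac{\mu\rho_t}{1+r} \]
\[ Var[p_t] = Var\left[\frac{\mu\rho_t}{1+r}\right] = \frac{\mu^2}{(1+r)^2}\,Var(\rho_t) = \frac{\mu^2}{(1+r)^2}\,\sigma_\rho^2 \]
— Rewrite
\[ p_t = 1 + \frac{\mu(\rho_t - \rho^*)}{1+r} + \frac{\mu\rho^*}{r} - \frac{2\gamma\mu^2\sigma_\rho^2}{r(1+r)^2} \]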
— Noise traders affect prices!
— Term 1: Variation in noise trader (mis-)perception
— Term 2: Average misperception of noise traders
— Term 3: Compensation for noise trader risk
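A short numerical sketch can confirm that the price rule clears the market. It codes the three terms of the price equation and the two demand functions from the first-order conditions, with \(\mu\) the share of noise traders, \(\gamma\) risk aversion, and \(\rho_t \sim N(\rho^*, \sigma_\rho^2)\). The parameter values in the usage note are arbitrary and the function names are hypothetical.

```python
def price_variance(mu, r, sigma_rho):
    """Variance of next period's price: mu^2 * sigma_rho^2 / (1 + r)^2."""
    return mu ** 2 * sigma_rho ** 2 / (1 + r) ** 2

def equilibrium_price(rho_t, mu, r, gamma, rho_star, sigma_rho):
    """Price of the unsafe asset as 1 plus the three terms discussed above."""
    variation = mu * (rho_t - rho_star) / (1 + r)        # Term 1
    average = mu * rho_star / r                          # Term 2
    risk = 2 * gamma * price_variance(mu, r, sigma_rho) / r  # Term 3
    return 1 + variation + average - risk

def demands(p_t, rho_t, expected_next_price, var_next_price, r, gamma):
    """Demand of arbitrageurs and of noise traders from the first-order conditions."""
    base = (r + expected_next_price - (1 + r) * p_t) / (2 * gamma * var_next_price)
    return base, base + rho_t / (2 * gamma * var_next_price)
```

For example, with `mu=0.3`, `r=0.05`, `gamma=1.0`, `rho_star=0.02`, `sigma_rho=0.1`, taking the expected next price as `equilibrium_price(rho_star, ...)` and plugging the resulting demands into \(\mu\lambda_t^n + (1-\mu)\lambda_t^a\) gives 1 for any draw of \(\rho_t\).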
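• Relative returns of noise traders
— Compare returns to noise traders \(R_n\) to returns for arbitrageurs \(R_a\):
\[ \Delta R_{n-a} = R_n - R_a = (\lambda_t^n - \lambda_t^a)\left[ r + p_{t+1} - p_t(1+r) \right] \]
\[ E(\Delta R_{n-a} \mid \rho_t) = \rho_t - \frac{(1+r)^2\rho_t^2}{2\gamma\mu\sigma_\rho^2} \]
\[ E(\Delta R_{n-a}) = \rho^* - \frac{(1+r)^2(\rho^*)^2 + (1+r)^2\sigma_\rho^2}{2\gamma\mu\sigma_\rho^2} \]
— Noise traders hold more of the risky asset if \(\rho^* > 0\)
— Return of noise traders can be higher if \(\rho^* > 0\) (and not too positive)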
— Noise traders therefore may outperform arbitrageurs if optimistic!
— (Reason is that they are taking more risk)
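The "positive but not too positive" condition can be seen by evaluating the unconditional expected return difference, \(\rho^* - [(1+r)^2(\rho^*)^2 + (1+r)^2\sigma_\rho^2]/(2\gamma\mu\sigma_\rho^2)\), at a few values of \(\rho^*\). The sketch below does this. The function name is hypothetical and the parameter values in the usage note are stylized, not calibrated.

```python
def expected_return_gap(rho_star, mu, r, gamma, sigma_rho):
    """Unconditional expected return of noise traders minus that of arbitrageurs."""
    numerator = (1 + r) ** 2 * rho_star ** 2 + (1 + r) ** 2 * sigma_rho ** 2
    return rho_star - numerator / (2 * gamma * mu * sigma_rho ** 2)
```

With `mu=0.5`, `r=0.05`, `gamma=5.0`, `sigma_rho=1.0`, the gap is negative at `rho_star=0` (no average optimism), positive at `rho_star=1`, and negative again at `rho_star=5`, where the quadratic term dominates.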
Welfare
• Sophisticated investors have higher utility
• Noise traders have lower utility than they expect
• Noise traders may have higher returns (if \(\rho^* > 0\))
• Noise traders do not necessarily disappear over time
• Three fundamental assumptions
1. OLG: no last period; short horizon
2. Fixed supply of unsafe asset (cannot convert safe into unsafe)
3. Noise trader risk systematic
• Noise trader models imply that biases affect asset prices:
— Reference Dependence
— Attention
— Persuasion
6 Market Reaction to Biases: Corporate Decisions
• Baker, Ruback, and Wurgler (2005)
• Behavioral corporate finance:
— biased investors (overvalue or undervalue company)
— smart managers
— (Converse: biased (overconfident) managers and rational investors)
• Firm has to decide how to finance investment project:
1. internal funds (cash flow/retained earnings)
2. bonds
3. stocks
• Fluctuation of equity prices due to noise traders
• Managers believe that the market is inefficient
— Issue equity when stock price exceeds perceived fundamental value
— Delay equity issue when stock price below perceived fundamental value
• Consistent with
— Survey Evidence of 392 CFOs (Graham and Harvey 2001): 67% say under/overvaluation is a factor in issuance decision
— Insider trading
• Go over quickly two examples
• Long-run performance of equity issuers
— Market Timing prediction: Companies issuing equity underperform later
— Loughran-Ritter (1995): Compare matching samples of
∗ companies doing IPOs
∗ companies not doing IPOs but with similar market cap.
• Similar finding with SEOs
7 Market Reaction to Biases: Political Economy I
• Interaction between:
— (Smart) Politicians:
∗ Personal beliefs and party affiliation
∗ May pursue voters/consumers welfare maximization
∗ BUT also: strong incentives to be reelected
— Voters (with biases):
∗ Low (zero) incentives to vote
∗ Limited information through media
∗ Likely to display biases
• Behavioral political economy
• Examples of voter biases:
— Effect of candidate order (Ho and Imai)
— Imperfect signal extraction (Wolfers, 2004): Voters more likely to vote for an incumbent if the local economy does well even if... it’s just due to changes in oil prices
— Susceptible to persuasion (DellaVigna and Kaplan, 2007)
— More? Short memory about past performance?
• Eisensee and Stromberg (QJE 2007): Limited attention of voters
• Setting:
— Natural Disasters occurring throughout the World
— US Ambassadors in country can decide to give Aid
— Decision to give Aid affected by
∗ Gravity of disaster
∗ Political returns to Aid decision
• Idea: Returns to aid are lower when American public is distracted by amajor news event
• Main Measure of Major News: median amount of Minutes in Evening TV News captured by top-3 news items (Vanderbilt Data Set)
• Dates with largest news pressure
• 5,000 natural Disasters in 143 countries between 1968 and 2002 (CRED)
— 20 percent receive USAID from Office of Foreign Disaster Assistance (first agency to provide relief)
— 10 percent covered in major broadcast news
— OFDA relief given if (and only if) Ambassador (or chief of Mission) in country does Disaster Declaration
— Ambassador can allocate up to $50,000 immediately
• Estimate \(Relief = \alpha\,News + \beta X + \varepsilon\)
• Below: News about the Disaster is instrumented with:
— Average News Pressure over 40 days after disaster
— Olympics
• 1st Stage: 2 s.d. increase in News Pressure (2.4 extra minutes) decreases
∗ probability of coverage in news by 4 ptg. points (40 percent)
∗ probability of relief by 3 ptg. points (15 percent)
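The percentages in parentheses appear to be the percentage-point effects expressed relative to the baseline rates reported earlier (10 percent of disasters covered in the news, 20 percent receiving relief). The helper below is a hypothetical illustration of that conversion.

```python
def relative_effect(change_in_points, baseline_percent):
    """Express a change in percentage points as a percent of the baseline rate."""
    return 100.0 * change_in_points / baseline_percent
```

`relative_effect(4, 10)` gives 40 percent for news coverage and `relative_effect(3, 20)` gives 15 percent for relief.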
• Is there a spurious correlation between instruments and type of disaster?
• No correlation with severity of disaster
• OLS and IV Regressions of Reliefs on presence in the News
• (Instrumented) availability in the news at the margin has huge effect: Almost one-on-one effect of being in the news on aid
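The logic of instrumenting news coverage with news pressure can be sketched on simulated data. Everything below is made up for illustration: the variables are continuous rather than binary, the coefficients are arbitrary, and an unobserved "severity" term is assumed to drive both coverage and relief. It is not the paper's specification or data.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Simulated disasters: instrument, endogenous coverage, and relief outcome."""
    rng = np.random.default_rng(seed)
    severity = rng.normal(size=n)        # assumed unobserved by the econometrician
    news_pressure = rng.normal(size=n)   # instrument, unrelated to severity
    coverage = -0.5 * news_pressure + 0.8 * severity + rng.normal(size=n)
    relief = 1.0 * coverage + 1.0 * severity + rng.normal(size=n)  # true effect = 1
    return news_pressure, coverage, relief

def ols_slope(x, y):
    """Slope from regressing y on x (with a constant)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def iv_slope(z, x, y):
    """Instrumental-variable (Wald) estimate: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
```

In this simulation the OLS slope is pushed away from the true effect of 1 by the omitted severity term, while the IV slope, which uses only the variation in coverage induced by news pressure, is close to 1.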
• Finan and Schechter (2012 EMA): Politicians target voter reciprocity
— Motivation is vote buying
— Politicians do favors to individuals in the hope of the return of a vote
— BUT: Vote is private, no way to enforce a contract
• Solution that makes the contract enforceable: reciprocity of voters
— Voter that receives a gift takes into account the politician
— In return, provides vote
• Similar to gift exchange in the workplace
— Reciprocity helps enforcement of ‘contract’
• BUT: Vote-maximizing politician must find reciprocal voters
• Finan and Schechter do survey in Paraguay in 2002, 2007, and 2010
• Survey of voters:
— In 2002 asked to play trust game
∗ First mover has allocation of 8k and decides how much to send to recipient: 0, 2k, 4k, 6k, 8k
∗ Money sent to recipient is tripled
∗ Recipient decides how much money to send back (strategy method)
∗ Measure of reciprocity: Share returned by recipient when receiving 12k+ versus when receiving 6k
— In 2007 ask voters whether targeted by vote-buying:
∗ ‘whether, during the run-up to the 2006 elections, any political party offered them money, food, payment of utility bills, medicines, and/or other goods (excluding propaganda hats, shirts, and posters)’
∗ 26 percent say yes
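One way to turn the trust-game responses into a reciprocity score is sketched below. It assumes that "versus" means a simple difference: the average share returned across the amounts of 12k or more, minus the share returned at 6k. That operationalization, the function name, and the example responses are illustrative assumptions, not necessarily the authors' exact definition.

```python
def reciprocity(returned_by_received):
    """Reciprocity score from strategy-method responses.

    returned_by_received maps each amount received (in thousands, after
    tripling: 6, 12, 18, 24) to the amount the recipient would send back.
    """
    share = {received: returned / received
             for received, returned in returned_by_received.items()
             if received > 0}
    high = [s for received, s in share.items() if received >= 12]
    return sum(high) / len(high) - share[6]
```

A recipient who returns a larger share when sent more, for example `{6: 1, 12: 4, 18: 9, 24: 12}`, gets a positive score, while one who always returns half gets zero.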
• Survey of middlemen in 2010
— Evidence that they know villagers well
— Ex.: Correlation between actual years of schooling and middleman report: 0.73
— (Lower correlation in prediction of amount sent in dictator game, 0.08)
• Main evidence: clear correlation of self-reported vote-buying and reciprocity measure
• Social preferences used for evil purposes!
• What explains political participation?
— Olson (1965): Public good problem: Even if think participation is right, individually better off staying at home
— Example 1: Riots and protests
— Example 2: Voter turnout at the polls: probability of being pivotal is very small
• Series of papers introduce variants of social preferences to explain participation in political activities
• Passarelli and Tabellini (2013):
— Focus on protests
— Assume negative reciprocity and role of emotions
— Individuals treated poorly by government get glow from protesting
• Model in a nutshell for individual \(i\)
— Cost of participating in protest
— Psychological benefit \(a_i\) of participating in protest
— Benefit depends on aggrievement:
\[ a_i = \begin{cases} 0 & \text{if } V_i \ge \hat V_i \\ (V_i - \hat V_i)^2 & \text{if } V_i < \hat V_i \end{cases} \]
— \(V_i\) is welfare of individual with given policy
— \(\hat V_i\) is what individual thinks appropriate (can be self-biased)
— Ad-hoc form of reference dependence
— When aggrieved, individual willing to incur cost of participation because of glow from participation
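The aggrievement rule can be written out as a small sketch. The benefit is zero when welfare meets the level the individual considers appropriate and equal to the squared shortfall otherwise. The participation rule (protest when the benefit exceeds the cost) is a simplifying assumption made here for illustration, and the function names are hypothetical.

```python
def aggrievement(welfare, entitlement):
    """Psychological benefit of protesting: squared shortfall, zero if none."""
    shortfall = entitlement - welfare
    return shortfall ** 2 if shortfall > 0 else 0.0

def participates(welfare, entitlement, cost):
    """Assumed decision rule: protest when the benefit exceeds the cost."""
    return aggrievement(welfare, entitlement) > cost
```

For example, an individual with welfare 2 who feels entitled to 4 has a benefit of 4 and protests at a cost of 1, while one with welfare 3.5 has a benefit of 0.25 and stays home.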
8 Next Lecture
• Behavioral Institutional Design
• Structural Behavioral Economics