36
A Practitioner’s Guide to Estimation of Random-Coef cients Logit Models of Demand Aviv Nevo University of California–Berkeley, Berkeley, CA 94720-3880 and NBER Estimation of demand is at the heart of many recent studies that exam- ine questions of market power, mergers, innovation, and valuation of new brands in differentiated-products markets. This paper focuses on one of the main methods for estimating demand for differentiated products: random- coef cients logit models. The paper carefully discusses the latest innovations in these methods with the hope of increasing the understanding, and there- fore the trust among researchers who have never used them, and reducing the dif culty of their use, thereby aiding in realizing their full potential. 1. Introduction Estimation of demand has been a key part of many recent studies examining questions regarding market power, mergers, innovation, and valuation of new brands in differentiated-product industries. 1 This paper explains the random-coef cients (or mixed) logit method- ology for estimating demand in differentiated-product markets using An earlier version of this paper circulated under the title “A Research Assistant’s Guide to Random Coef cients Discrete Choice Model of Demand.” I wish to thank Steve Berry, Iain Cockburn, Bronwyn Hall, Ariel Pakes, various lecture and seminar participants, and an anonymous referee for comments, discussions, and suggestions. Financial sup- port from the UC Berkeley Committee on Research Junior Faculty Grant is gratefully acknowledged. 1. Just to mention some examples, Bresnahan (1987) studies the 1955 price war in the automobile industry; Gasmi et al. (1992) empirically study collusive behavior in a soft- drink market; Hausman et al. (1994) study the beer industry; Berry et al. (1995, 1999) examine equilibrium in the automobile industry and its implications for voluntary trade restrictions; Goldberg (1995) uses estimates of the demand for automobiles to investi- gate trade policy issues; Hausman (1996) studies the welfare gains generated by a new brand of cereal; Berry et al. (1996) study hubs in the airline industry; Bresnahan et al. (1997) study rents from innovation in the computer industry; Nevo (2000a, b) examines price competition and mergers in the ready-to-eat cereal industry; Davis (1998) studies spatial competition in movie theaters; and Petrin (1999) studies the welfare gains from the introduction of the minivan. © 2000 Massachusetts Institute of Technology. Journal of Economics & Management Strategy, Volume 9, Number 4, Winter 2000, 513–548

APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

  • Upload
    lykhue

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

A Practitionerrsquos Guide toEstimation of Random-Coef cients

Logit Models of Demand

Aviv NevoUniversity of CaliforniandashBerkeley Berkeley

CA 94720-3880 andNBER

Estimation of demand is at the heart of many recent studies that exam-ine questions of market power mergers innovation and valuation of newbrands in differentiated-products markets This paper focuses on one of themain methods for estimating demand for differentiated products random-coefcients logit models The paper carefully discusses the latest innovationsin these methods with the hope of increasing the understanding and there-fore the trust among researchers who have never used them and reducingthe difculty of their use thereby aiding in realizing their full potential

1 Introduction

Estimation of demand has been a key part of many recent studiesexamining questions regarding market power mergers innovationand valuation of new brands in differentiated-product industries1

This paper explains the random-coefcients (or mixed) logit method-ology for estimating demand in differentiated-product markets using

An earlier version of this paper circulated under the title ldquoA Research Assistantrsquos Guideto Random Coefcients Discrete Choice Model of Demandrdquo I wish to thank Steve BerryIain Cockburn Bronwyn Hall Ariel Pakes various lecture and seminar participantsand an anonymous referee for comments discussions and suggestions Financial sup-port from the UC Berkeley Committee on Research Junior Faculty Grant is gratefullyacknowledged

1 Just to mention some examples Bresnahan (1987) studies the 1955 price war in theautomobile industry Gasmi et al (1992) empirically study collusive behavior in a soft-drink market Hausman et al (1994) study the beer industry Berry et al (1995 1999)examine equilibrium in the automobile industry and its implications for voluntary traderestrictions Goldberg (1995) uses estimates of the demand for automobiles to investi-gate trade policy issues Hausman (1996) studies the welfare gains generated by a newbrand of cereal Berry et al (1996) study hubs in the airline industry Bresnahan et al(1997) study rents from innovation in the computer industry Nevo (2000a b) examinesprice competition and mergers in the ready-to-eat cereal industry Davis (1998) studiesspatial competition in movie theaters and Petrin (1999) studies the welfare gains fromthe introduction of the minivan

copy 2000 Massachusetts Institute of TechnologyJournal of Economics amp Management Strategy Volume 9 Number 4 Winter 2000 513ndash548

514 Journal of Economics amp Management Strategy

market-level data This methodology can be used to estimate the dem-and for a large number of products using market data and allowingfor the endogenity of price While this method retains the benets ofalternative discrete-choice models it produces more realistic demandelasticities With better estimates of demand we can for example bet-ter judge market power simulate the effects of mergers measure thebenets from new goods or formulate innovation and competitionpolicy This paper carefully discusses the recent innovations in thesemethods with the intent of reducing the barriers to entry and increas-ing the trust in these methods among researchers who are not familiarwith them

Probably the most straightforward approach to specifying de-mand for a set of closely related but not identical products is to specifya system of demand equations one for each product Each equationspecies the demand for a product as a function of its own price theprice of other products and other variables An example of such asystem is the linear expenditure model (Stone 1954) in which quan-tities are linear functions of all prices Subsequent work has focusedon specifying the relation between prices and quantities in a way thatis both exible (ie allows for general substitution patterns) and con-sistent with economic theory2

Estimating demand for differentiated products adds two addi-tional nontrivial concerns The rst is the large number of productsand hence the large number of parameters to be estimated Con-sider for example a constant-elasticity or logndashlog demand systemin which logarithms of quantities are linear functions of logarithms ofall prices Suppose we have 100 differentiated products then withoutadditional restrictions this implies estimating at least 10000 param-eters (100 demand equations one for each product with 100 pricesin each) Even if we impose symmetry and adding up restrictionsimplied by economic theory the number of parameters will still betoo large to estimate them The problem becomes even harder if wewant to allow for more general substitution patterns

An additional problem introduced when estimating demand fordifferentiated products is the heterogeneity in consumer tastes If allconsumers are identical then we would not observe the level of dif-ferentiation we see in the marketplace One could assume that pref-erences are of the right form [the Gorman form see Gorman (1959)]so that an aggregate or average consumer exists and has a demandfunction that satises the conditions specied by economic theory3

2 Examples include the Rotterdam model (Theil 1965 Barten 1966) the translogmodel (Christensen et al 1975) and the almost ideal demand system (Deaton andMuellbauer 1980)

3 For an example of a representative consumer approach to demand for differenti-ated products see Dixit and Stiglitz (1977) or Spence (1976)

Random-Coefcients Logit Models of Demand 515

However the required assumptions are strong and for many applicati-ons seem to be empirically false The difference between an aggregatemodel and a model that explicitly reects individual heterogeneitycan have profound affects on economic and policy conclusions

The logit demand model (McFadden 1973)4 solves the dimen-sionality problem by projecting the products onto a space of charac-teristics making the relevant size the dimension of this space and notthe square of the number of products A problem with this modelis the strong implication of some of the assumptions made Due tothe restrictive way in which heterogeneity is modeled substitutionbetween products is driven completely by market shares and not byhow similar the products are Extensions of the basic logit model relaxthese restrictive assumptions while maintaining the advantage of thelogit model in dealing with the dimensionality problem The essen-tial idea is to explicitly model heterogeneity in the population andestimate the unknown parameters governing the distribution of thisheterogeneity These models have been estimated using both market-and individual-level data5 The problem with the estimation is that ittreats the regressors including price as exogenously determined Thisis especially problematic when aggregate data is used to estimate themodel

This paper describes recent developments in methods for esti-mating random-coefcients discrete-choice models of demand [Berry1994 Berry et al 1995 (henceforth BLP)] The new method maintainsthe advantage of the logit model in handling a large number of prod-ucts It is superior to prior methods because (1) the model can beestimated using only market-level price and quantity data (2) it dealswith the endogeneity of prices and (3) it produces demand elasticitiesthat are more realisticmdashfor example cross-price elasticities are largerfor products that are closer together in terms of their characteristics

The rest of the paper is organized as follows Section 2 describesa model that encompasses with slight alterations the models pre-viously used in the literature The focus is on the various modelingassumptions and their implications for estimation and the results InSection 3 I discuss estimation including the data required an outlineof the algorithm and instrumental variables Many of the nitty-grittydetails of estimation are described in an appendix (available from

4 A related literature is the characteristics approach to demand or the addressapproach (Lancaster 1966 1971 Quandt 1968 Rosen 1974) For a recent exposition ofit and a proof of its equivalence to the discrete choice approach see Anderson et al

5 For example the generalized extreme-value model (McFadden 1978) and therandom-coefcients logit model (Cardell and Dunbar 1980 Boyd and Mellman 1980Tardiff 1980 Cardell 1989 and references therein) The random-coefcients model isoften called the hedonic demand model in this earlier literature it should not be con-fused with the hedonic price model (Court 1939 Griliches 1961)

516 Journal of Economics amp Management Strategy

httpelsaberkeleyedu ~ nevo) Section 4 provides a brief exampleof the type of results the estimation can produce Section 5 concludesand discusses various extensions and alternatives to the method de-scribed here

2 The Model

In this section I discuss the model with an emphasis on the vari-ous modeling assumptions and their implications In the next sectionI discuss the estimation details However for now I want to stresstwo points First the method I discuss here uses (market-level) priceand quantity data for each product in a series of markets to estimatethe model Some information regarding the distribution of consumercharacteristics might be available but a key benet of this methodol-ogy is that we do not need to observe individual consumer purchasedecisions to estimate the demand parameters6

Second the estimation allows prices to be correlated with theeconometric error term This will be modeled in the following wayA product will be dened by a set of characteristics Producers andconsumers are assumed to observe all product characteristics Theresearcher on the other hand is assumed to observe only some of theproduct characteristics Each product will be assumed to have a char-acteristic that inuences demand but that either is not observed by theresearcher or cannot be quantied into a variable that can be includedin the analysis Examples are provided below The unobserved charac-teristics will be captured by the econometric error term Since the pro-ducers know these characteristics and take them into account whensetting prices this introduces the econometric problem of endogenousprices7 The contribution of the estimation method presented belowis to transform the model in such a way that instrumental-variablemethods can be used

21 The Setup

Assume we observe t 5 1 T markets each with i 5 1 It

consumers For each such market we observe aggregate quantities

6 If individual decisions are observed the method of analysis differs somewhatfrom the one presented here For clarity of presentation I defer discussion of this caseto Section 5

7 The assumption that when setting prices rms take account of the unobserved(to the econometrician) characteristics is just one way to generate correlation betweenprices and these unobserved variables For example correlation can also result fromthe mechanics of the consumerrsquos optimization problem (Kennan 1989)

Random-Coefcients Logit Models of Demand 517

average prices and product characteristics for J products8 The de-nition of a market will depend on the structure of the data BLP useannual automobile sales over a period of twenty years and thereforedene a market as the national market for year t where t 5 1 20On the other hand Nevo (2000a) observes data in a cross section ofcities over twenty quarters and denes a market t as a cityndashquartercombination with t 5 1 1124 Yet a different example is givenby Das et al (1994) who observe sales for different income groupsand dene a market as the annual sales to consumers of a certainincome level

The indirect utility of consumer i from consuming product j inmarket t U (xj t raquoj t p j t iquesti h ) 9 is a function of observed and unobser-ved (by the researcher) product characteristics xj t and raquoj t respectivelyprice p j t individual characteristics iquesti and unknown parameters h I focus on a particular specication of this indirect utility10

u ij t 5 a i (yi p j t) 1 x j t b i 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (1)

where yi is the income of consumer i p j t is the price of product j inmarket t xj t is a K-dimensional (row) vector of observable character-istics of product j raquoj t is the unobserved (by the econometrician) prod-uct characteristic and e ij t is a mean-zero stochastic term Finally a i isconsumer i rsquos marginal utility from income and b i is a K-dimensional(column) vector of individual-specic taste coefcients

Observed characteristics vary with the product being consideredBLP examine the demand for cars and include as observed charac-teristics horsepower size and air conditioning In estimating demandfor ready-to-eat cereal Nevo (2000a) observes calories sodium andber content Unobserved characteristics for example can includethe impact of unobserved promotional activity unquantiable factors(brand equity) or systematic shocks to demand Depending on thestructure of the data some components of the unobserved characteris-tics can be captured by dummy variables For example we can modelraquoj t 5 raquoj 1 raquot 1 D raquoj t and capture raquoj and raquot by brand- and market-specicdummy variables

Implicit in the specication given by equation (1) are threethings First this form of the indirect utility can be derived from a

8 For ease of exposition I have assumed that all products are offered to all con-sumers in all markets The methods described below can easily deal with the casewhere the choice set differs between markets and also with different choice sets fordifferent consumers

9 This is sometimes called the conditional indirect utility ie the indirect utilityconditional on choosing this option

10 The methods discussed here are general and with minor adjustments can dealwith different functional forms

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 2: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

514 Journal of Economics amp Management Strategy

market-level data This methodology can be used to estimate the dem-and for a large number of products using market data and allowingfor the endogenity of price While this method retains the benets ofalternative discrete-choice models it produces more realistic demandelasticities With better estimates of demand we can for example bet-ter judge market power simulate the effects of mergers measure thebenets from new goods or formulate innovation and competitionpolicy This paper carefully discusses the recent innovations in thesemethods with the intent of reducing the barriers to entry and increas-ing the trust in these methods among researchers who are not familiarwith them

Probably the most straightforward approach to specifying de-mand for a set of closely related but not identical products is to specifya system of demand equations one for each product Each equationspecies the demand for a product as a function of its own price theprice of other products and other variables An example of such asystem is the linear expenditure model (Stone 1954) in which quan-tities are linear functions of all prices Subsequent work has focusedon specifying the relation between prices and quantities in a way thatis both exible (ie allows for general substitution patterns) and con-sistent with economic theory2

Estimating demand for differentiated products adds two addi-tional nontrivial concerns The rst is the large number of productsand hence the large number of parameters to be estimated Con-sider for example a constant-elasticity or logndashlog demand systemin which logarithms of quantities are linear functions of logarithms ofall prices Suppose we have 100 differentiated products then withoutadditional restrictions this implies estimating at least 10000 param-eters (100 demand equations one for each product with 100 pricesin each) Even if we impose symmetry and adding up restrictionsimplied by economic theory the number of parameters will still betoo large to estimate them The problem becomes even harder if wewant to allow for more general substitution patterns

An additional problem introduced when estimating demand fordifferentiated products is the heterogeneity in consumer tastes If allconsumers are identical then we would not observe the level of dif-ferentiation we see in the marketplace One could assume that pref-erences are of the right form [the Gorman form see Gorman (1959)]so that an aggregate or average consumer exists and has a demandfunction that satises the conditions specied by economic theory3

2 Examples include the Rotterdam model (Theil 1965 Barten 1966) the translogmodel (Christensen et al 1975) and the almost ideal demand system (Deaton andMuellbauer 1980)

3 For an example of a representative consumer approach to demand for differenti-ated products see Dixit and Stiglitz (1977) or Spence (1976)

Random-Coefcients Logit Models of Demand 515

However the required assumptions are strong and for many applicati-ons seem to be empirically false The difference between an aggregatemodel and a model that explicitly reects individual heterogeneitycan have profound affects on economic and policy conclusions

The logit demand model (McFadden 1973)4 solves the dimen-sionality problem by projecting the products onto a space of charac-teristics making the relevant size the dimension of this space and notthe square of the number of products A problem with this modelis the strong implication of some of the assumptions made Due tothe restrictive way in which heterogeneity is modeled substitutionbetween products is driven completely by market shares and not byhow similar the products are Extensions of the basic logit model relaxthese restrictive assumptions while maintaining the advantage of thelogit model in dealing with the dimensionality problem The essen-tial idea is to explicitly model heterogeneity in the population andestimate the unknown parameters governing the distribution of thisheterogeneity These models have been estimated using both market-and individual-level data5 The problem with the estimation is that ittreats the regressors including price as exogenously determined Thisis especially problematic when aggregate data is used to estimate themodel

This paper describes recent developments in methods for esti-mating random-coefcients discrete-choice models of demand [Berry1994 Berry et al 1995 (henceforth BLP)] The new method maintainsthe advantage of the logit model in handling a large number of prod-ucts It is superior to prior methods because (1) the model can beestimated using only market-level price and quantity data (2) it dealswith the endogeneity of prices and (3) it produces demand elasticitiesthat are more realisticmdashfor example cross-price elasticities are largerfor products that are closer together in terms of their characteristics

The rest of the paper is organized as follows Section 2 describesa model that encompasses with slight alterations the models pre-viously used in the literature The focus is on the various modelingassumptions and their implications for estimation and the results InSection 3 I discuss estimation including the data required an outlineof the algorithm and instrumental variables Many of the nitty-grittydetails of estimation are described in an appendix (available from

4 A related literature is the characteristics approach to demand or the addressapproach (Lancaster 1966 1971 Quandt 1968 Rosen 1974) For a recent exposition ofit and a proof of its equivalence to the discrete choice approach see Anderson et al

5 For example the generalized extreme-value model (McFadden 1978) and therandom-coefcients logit model (Cardell and Dunbar 1980 Boyd and Mellman 1980Tardiff 1980 Cardell 1989 and references therein) The random-coefcients model isoften called the hedonic demand model in this earlier literature it should not be con-fused with the hedonic price model (Court 1939 Griliches 1961)

516 Journal of Economics amp Management Strategy

httpelsaberkeleyedu ~ nevo) Section 4 provides a brief exampleof the type of results the estimation can produce Section 5 concludesand discusses various extensions and alternatives to the method de-scribed here

2 The Model

In this section I discuss the model with an emphasis on the vari-ous modeling assumptions and their implications In the next sectionI discuss the estimation details However for now I want to stresstwo points First the method I discuss here uses (market-level) priceand quantity data for each product in a series of markets to estimatethe model Some information regarding the distribution of consumercharacteristics might be available but a key benet of this methodol-ogy is that we do not need to observe individual consumer purchasedecisions to estimate the demand parameters6

Second the estimation allows prices to be correlated with theeconometric error term This will be modeled in the following wayA product will be dened by a set of characteristics Producers andconsumers are assumed to observe all product characteristics Theresearcher on the other hand is assumed to observe only some of theproduct characteristics Each product will be assumed to have a char-acteristic that inuences demand but that either is not observed by theresearcher or cannot be quantied into a variable that can be includedin the analysis Examples are provided below The unobserved charac-teristics will be captured by the econometric error term Since the pro-ducers know these characteristics and take them into account whensetting prices this introduces the econometric problem of endogenousprices7 The contribution of the estimation method presented belowis to transform the model in such a way that instrumental-variablemethods can be used

21 The Setup

Assume we observe t 5 1 T markets each with i 5 1 It

consumers For each such market we observe aggregate quantities

6 If individual decisions are observed the method of analysis differs somewhatfrom the one presented here For clarity of presentation I defer discussion of this caseto Section 5

7 The assumption that when setting prices rms take account of the unobserved(to the econometrician) characteristics is just one way to generate correlation betweenprices and these unobserved variables For example correlation can also result fromthe mechanics of the consumerrsquos optimization problem (Kennan 1989)

Random-Coefcients Logit Models of Demand 517

average prices and product characteristics for J products8 The de-nition of a market will depend on the structure of the data BLP useannual automobile sales over a period of twenty years and thereforedene a market as the national market for year t where t 5 1 20On the other hand Nevo (2000a) observes data in a cross section ofcities over twenty quarters and denes a market t as a cityndashquartercombination with t 5 1 1124 Yet a different example is givenby Das et al (1994) who observe sales for different income groupsand dene a market as the annual sales to consumers of a certainincome level

The indirect utility of consumer i from consuming product j inmarket t U (xj t raquoj t p j t iquesti h ) 9 is a function of observed and unobser-ved (by the researcher) product characteristics xj t and raquoj t respectivelyprice p j t individual characteristics iquesti and unknown parameters h I focus on a particular specication of this indirect utility10

u ij t 5 a i (yi p j t) 1 x j t b i 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (1)

where yi is the income of consumer i p j t is the price of product j inmarket t xj t is a K-dimensional (row) vector of observable character-istics of product j raquoj t is the unobserved (by the econometrician) prod-uct characteristic and e ij t is a mean-zero stochastic term Finally a i isconsumer i rsquos marginal utility from income and b i is a K-dimensional(column) vector of individual-specic taste coefcients

Observed characteristics vary with the product being consideredBLP examine the demand for cars and include as observed charac-teristics horsepower size and air conditioning In estimating demandfor ready-to-eat cereal Nevo (2000a) observes calories sodium andber content Unobserved characteristics for example can includethe impact of unobserved promotional activity unquantiable factors(brand equity) or systematic shocks to demand Depending on thestructure of the data some components of the unobserved characteris-tics can be captured by dummy variables For example we can modelraquoj t 5 raquoj 1 raquot 1 D raquoj t and capture raquoj and raquot by brand- and market-specicdummy variables

Implicit in the specication given by equation (1) are threethings First this form of the indirect utility can be derived from a

8 For ease of exposition I have assumed that all products are offered to all con-sumers in all markets The methods described below can easily deal with the casewhere the choice set differs between markets and also with different choice sets fordifferent consumers

9 This is sometimes called the conditional indirect utility ie the indirect utilityconditional on choosing this option

10 The methods discussed here are general and with minor adjustments can dealwith different functional forms

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 3: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 515

However the required assumptions are strong and for many applicati-ons seem to be empirically false The difference between an aggregatemodel and a model that explicitly reects individual heterogeneitycan have profound affects on economic and policy conclusions

The logit demand model (McFadden 1973)4 solves the dimen-sionality problem by projecting the products onto a space of charac-teristics making the relevant size the dimension of this space and notthe square of the number of products A problem with this modelis the strong implication of some of the assumptions made Due tothe restrictive way in which heterogeneity is modeled substitutionbetween products is driven completely by market shares and not byhow similar the products are Extensions of the basic logit model relaxthese restrictive assumptions while maintaining the advantage of thelogit model in dealing with the dimensionality problem The essen-tial idea is to explicitly model heterogeneity in the population andestimate the unknown parameters governing the distribution of thisheterogeneity These models have been estimated using both market-and individual-level data5 The problem with the estimation is that ittreats the regressors including price as exogenously determined Thisis especially problematic when aggregate data is used to estimate themodel

This paper describes recent developments in methods for esti-mating random-coefcients discrete-choice models of demand [Berry1994 Berry et al 1995 (henceforth BLP)] The new method maintainsthe advantage of the logit model in handling a large number of prod-ucts It is superior to prior methods because (1) the model can beestimated using only market-level price and quantity data (2) it dealswith the endogeneity of prices and (3) it produces demand elasticitiesthat are more realisticmdashfor example cross-price elasticities are largerfor products that are closer together in terms of their characteristics

The rest of the paper is organized as follows Section 2 describesa model that encompasses with slight alterations the models pre-viously used in the literature The focus is on the various modelingassumptions and their implications for estimation and the results InSection 3 I discuss estimation including the data required an outlineof the algorithm and instrumental variables Many of the nitty-grittydetails of estimation are described in an appendix (available from

4 A related literature is the characteristics approach to demand or the addressapproach (Lancaster 1966 1971 Quandt 1968 Rosen 1974) For a recent exposition ofit and a proof of its equivalence to the discrete choice approach see Anderson et al

5 For example the generalized extreme-value model (McFadden 1978) and therandom-coefcients logit model (Cardell and Dunbar 1980 Boyd and Mellman 1980Tardiff 1980 Cardell 1989 and references therein) The random-coefcients model isoften called the hedonic demand model in this earlier literature it should not be con-fused with the hedonic price model (Court 1939 Griliches 1961)

516 Journal of Economics amp Management Strategy

httpelsaberkeleyedu ~ nevo) Section 4 provides a brief exampleof the type of results the estimation can produce Section 5 concludesand discusses various extensions and alternatives to the method de-scribed here

2 The Model

In this section I discuss the model with an emphasis on the vari-ous modeling assumptions and their implications In the next sectionI discuss the estimation details However for now I want to stresstwo points First the method I discuss here uses (market-level) priceand quantity data for each product in a series of markets to estimatethe model Some information regarding the distribution of consumercharacteristics might be available but a key benet of this methodol-ogy is that we do not need to observe individual consumer purchasedecisions to estimate the demand parameters6

Second the estimation allows prices to be correlated with theeconometric error term This will be modeled in the following wayA product will be dened by a set of characteristics Producers andconsumers are assumed to observe all product characteristics Theresearcher on the other hand is assumed to observe only some of theproduct characteristics Each product will be assumed to have a char-acteristic that inuences demand but that either is not observed by theresearcher or cannot be quantied into a variable that can be includedin the analysis Examples are provided below The unobserved charac-teristics will be captured by the econometric error term Since the pro-ducers know these characteristics and take them into account whensetting prices this introduces the econometric problem of endogenousprices7 The contribution of the estimation method presented belowis to transform the model in such a way that instrumental-variablemethods can be used

21 The Setup

Assume we observe t 5 1 T markets each with i 5 1 It

consumers For each such market we observe aggregate quantities

6 If individual decisions are observed the method of analysis differs somewhatfrom the one presented here For clarity of presentation I defer discussion of this caseto Section 5

7 The assumption that when setting prices rms take account of the unobserved(to the econometrician) characteristics is just one way to generate correlation betweenprices and these unobserved variables For example correlation can also result fromthe mechanics of the consumerrsquos optimization problem (Kennan 1989)

Random-Coefcients Logit Models of Demand 517

average prices and product characteristics for J products8 The de-nition of a market will depend on the structure of the data BLP useannual automobile sales over a period of twenty years and thereforedene a market as the national market for year t where t 5 1 20On the other hand Nevo (2000a) observes data in a cross section ofcities over twenty quarters and denes a market t as a cityndashquartercombination with t 5 1 1124 Yet a different example is givenby Das et al (1994) who observe sales for different income groupsand dene a market as the annual sales to consumers of a certainincome level

The indirect utility of consumer i from consuming product j inmarket t U (xj t raquoj t p j t iquesti h ) 9 is a function of observed and unobser-ved (by the researcher) product characteristics xj t and raquoj t respectivelyprice p j t individual characteristics iquesti and unknown parameters h I focus on a particular specication of this indirect utility10

u ij t 5 a i (yi p j t) 1 x j t b i 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (1)

where yi is the income of consumer i p j t is the price of product j inmarket t xj t is a K-dimensional (row) vector of observable character-istics of product j raquoj t is the unobserved (by the econometrician) prod-uct characteristic and e ij t is a mean-zero stochastic term Finally a i isconsumer i rsquos marginal utility from income and b i is a K-dimensional(column) vector of individual-specic taste coefcients

Observed characteristics vary with the product being consideredBLP examine the demand for cars and include as observed charac-teristics horsepower size and air conditioning In estimating demandfor ready-to-eat cereal Nevo (2000a) observes calories sodium andber content Unobserved characteristics for example can includethe impact of unobserved promotional activity unquantiable factors(brand equity) or systematic shocks to demand Depending on thestructure of the data some components of the unobserved characteris-tics can be captured by dummy variables For example we can modelraquoj t 5 raquoj 1 raquot 1 D raquoj t and capture raquoj and raquot by brand- and market-specicdummy variables

Implicit in the specication given by equation (1) are threethings First this form of the indirect utility can be derived from a

8 For ease of exposition I have assumed that all products are offered to all con-sumers in all markets The methods described below can easily deal with the casewhere the choice set differs between markets and also with different choice sets fordifferent consumers

9 This is sometimes called the conditional indirect utility ie the indirect utilityconditional on choosing this option

10 The methods discussed here are general and with minor adjustments can dealwith different functional forms

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 4: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

516 Journal of Economics amp Management Strategy

httpelsaberkeleyedu ~ nevo) Section 4 provides a brief exampleof the type of results the estimation can produce Section 5 concludesand discusses various extensions and alternatives to the method de-scribed here

2 The Model

In this section I discuss the model with an emphasis on the vari-ous modeling assumptions and their implications In the next sectionI discuss the estimation details However for now I want to stresstwo points First the method I discuss here uses (market-level) priceand quantity data for each product in a series of markets to estimatethe model Some information regarding the distribution of consumercharacteristics might be available but a key benet of this methodol-ogy is that we do not need to observe individual consumer purchasedecisions to estimate the demand parameters6

Second the estimation allows prices to be correlated with theeconometric error term This will be modeled in the following wayA product will be dened by a set of characteristics Producers andconsumers are assumed to observe all product characteristics Theresearcher on the other hand is assumed to observe only some of theproduct characteristics Each product will be assumed to have a char-acteristic that inuences demand but that either is not observed by theresearcher or cannot be quantied into a variable that can be includedin the analysis Examples are provided below The unobserved charac-teristics will be captured by the econometric error term Since the pro-ducers know these characteristics and take them into account whensetting prices this introduces the econometric problem of endogenousprices7 The contribution of the estimation method presented belowis to transform the model in such a way that instrumental-variablemethods can be used

21 The Setup

Assume we observe t 5 1 T markets each with i 5 1 It

consumers For each such market we observe aggregate quantities

6 If individual decisions are observed the method of analysis differs somewhatfrom the one presented here For clarity of presentation I defer discussion of this caseto Section 5

7 The assumption that when setting prices rms take account of the unobserved(to the econometrician) characteristics is just one way to generate correlation betweenprices and these unobserved variables For example correlation can also result fromthe mechanics of the consumerrsquos optimization problem (Kennan 1989)

Random-Coefcients Logit Models of Demand 517

average prices and product characteristics for J products8 The de-nition of a market will depend on the structure of the data BLP useannual automobile sales over a period of twenty years and thereforedene a market as the national market for year t where t 5 1 20On the other hand Nevo (2000a) observes data in a cross section ofcities over twenty quarters and denes a market t as a cityndashquartercombination with t 5 1 1124 Yet a different example is givenby Das et al (1994) who observe sales for different income groupsand dene a market as the annual sales to consumers of a certainincome level

The indirect utility of consumer i from consuming product j inmarket t U (xj t raquoj t p j t iquesti h ) 9 is a function of observed and unobser-ved (by the researcher) product characteristics xj t and raquoj t respectivelyprice p j t individual characteristics iquesti and unknown parameters h I focus on a particular specication of this indirect utility10

u ij t 5 a i (yi p j t) 1 x j t b i 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (1)

where yi is the income of consumer i p j t is the price of product j inmarket t xj t is a K-dimensional (row) vector of observable character-istics of product j raquoj t is the unobserved (by the econometrician) prod-uct characteristic and e ij t is a mean-zero stochastic term Finally a i isconsumer i rsquos marginal utility from income and b i is a K-dimensional(column) vector of individual-specic taste coefcients

Observed characteristics vary with the product being consideredBLP examine the demand for cars and include as observed charac-teristics horsepower size and air conditioning In estimating demandfor ready-to-eat cereal Nevo (2000a) observes calories sodium andber content Unobserved characteristics for example can includethe impact of unobserved promotional activity unquantiable factors(brand equity) or systematic shocks to demand Depending on thestructure of the data some components of the unobserved characteris-tics can be captured by dummy variables For example we can modelraquoj t 5 raquoj 1 raquot 1 D raquoj t and capture raquoj and raquot by brand- and market-specicdummy variables

Implicit in the specication given by equation (1) are threethings First this form of the indirect utility can be derived from a

8 For ease of exposition I have assumed that all products are offered to all con-sumers in all markets The methods described below can easily deal with the casewhere the choice set differs between markets and also with different choice sets fordifferent consumers

9 This is sometimes called the conditional indirect utility ie the indirect utilityconditional on choosing this option

10 The methods discussed here are general and with minor adjustments can dealwith different functional forms

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 5: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 517

average prices and product characteristics for J products8 The de-nition of a market will depend on the structure of the data BLP useannual automobile sales over a period of twenty years and thereforedene a market as the national market for year t where t 5 1 20On the other hand Nevo (2000a) observes data in a cross section ofcities over twenty quarters and denes a market t as a cityndashquartercombination with t 5 1 1124 Yet a different example is givenby Das et al (1994) who observe sales for different income groupsand dene a market as the annual sales to consumers of a certainincome level

The indirect utility of consumer i from consuming product j inmarket t U (xj t raquoj t p j t iquesti h ) 9 is a function of observed and unobser-ved (by the researcher) product characteristics xj t and raquoj t respectivelyprice p j t individual characteristics iquesti and unknown parameters h I focus on a particular specication of this indirect utility10

u ij t 5 a i (yi p j t) 1 x j t b i 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (1)

where yi is the income of consumer i p j t is the price of product j inmarket t xj t is a K-dimensional (row) vector of observable character-istics of product j raquoj t is the unobserved (by the econometrician) prod-uct characteristic and e ij t is a mean-zero stochastic term Finally a i isconsumer i rsquos marginal utility from income and b i is a K-dimensional(column) vector of individual-specic taste coefcients

Observed characteristics vary with the product being consideredBLP examine the demand for cars and include as observed charac-teristics horsepower size and air conditioning In estimating demandfor ready-to-eat cereal Nevo (2000a) observes calories sodium andber content Unobserved characteristics for example can includethe impact of unobserved promotional activity unquantiable factors(brand equity) or systematic shocks to demand Depending on thestructure of the data some components of the unobserved characteris-tics can be captured by dummy variables For example we can modelraquoj t 5 raquoj 1 raquot 1 D raquoj t and capture raquoj and raquot by brand- and market-specicdummy variables

Implicit in the specication given by equation (1) are threethings First this form of the indirect utility can be derived from a

8 For ease of exposition I have assumed that all products are offered to all con-sumers in all markets The methods described below can easily deal with the casewhere the choice set differs between markets and also with different choice sets fordifferent consumers

9 This is sometimes called the conditional indirect utility ie the indirect utilityconditional on choosing this option

10 The methods discussed here are general and with minor adjustments can dealwith different functional forms

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 6: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

518 Journal of Economics amp Management Strategy

quasilinear utility function which is free of wealth effects For someproducts (for example ready-to-eat cereals) this is a reasonable ass-umption but for other products (for example cars) it is an unreason-able one Including wealth effects alters the way the term yi p j t entersequation (1) For example BLP build on a Cobb-Douglas utility func-tion to derive an indirect utility that is a function of log(yi p j t) Inprinciple we can include f (yi p j t) where f ( ) is a exible functionalform (Petrin 1999)

Second equation (1) species that the unobserved characteristicwhich among other things captures the elements of vertical productdifferentiation is identical for all consumers Since the coefcient onprice is allowed to vary among individuals this is consistent withthe theoretical literature of vertical product differentiation An alter-native is to model the distribution of the valuation of the unobservedcharacteristics as in Das et al (1994) As long as we have not madeany distributional assumptions on consumer-specic components (ieanything with subscript i) their model is not more general Once wemake such assumptions their model has slightly different implica-tions for some of the normalizations usually made An exact discus-sion of these implications is beyond the scope of this paper

Finally the specication in equation (1) assumes that all con-sumers face the same product characteristics In particular all con-sumers are offered the same price Depending on the data if differentconsumers face different prices using either a list or average trans-action price will lead to measurement error bias This just leads toanother reason why prices might be correlated with the error termand motivates the instrumental-variable procedure discussed below11

The next component of the model describes how consumer pref-erences vary as a function of the individual characteristics iquesti In thecontext of equation (1) this amounts to modeling the distribution ofconsumer taste parameters The individual characteristics consist oftwo components demographics which I refer to as observed andadditional characteristics which I refer to as unobserved denoted Di

and vi respectively Given that no individual data is observed neithercomponent of the individual characteristics is directly observed in thechoice data set The distinction between them is that even though wedo not observe individual data we know something about the distri-bution of the demographics Di while for the additional characteris-tics vi we have no such information Examples of demographics areincome age family size race and education Examples of the type of

11 However as noted by Berry (1994) the method proposed below can deal withmeasurement error only if the variable measured with error enters in a restrictive wayNamely it only enters the part of utility that is common to all consumers d j in equation(3) below

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 7: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 519

information we might have is a large sample we can use to estimatesome feature of the distribution (eg we could use Census data toestimate the mean and standard deviation of income) Alternativelywe might have a sample from the joint distribution of several demo-graphic variables (eg the Current Population Survey might tell usabout the joint distribution of income education and age in differ-ent cities in the US) The additional characteristics m i might includethings like whether the individual owns a dog a characteristic thatmight be important in the decision of which car to buy yet even verydetailed survey data will usually not include this fact

Formally this will be modeled as

a i

b i

5a

b1 OtildeDi 1 aringm i m i ~ P

m ( m ) Di ~ AtildeP D(D) (2)

where Di is a d acute 1 vector of demographic variables m i captures theadditional characteristics discussed in the previous paragraph P

m ( ) isa parametric distribution AtildeP

D( ) is either a nonparametric distributionknown from other data sources or a parametric distribution with theparameters estimated elsewhere Otilde is a (K 1 1) acute d matrix of coefcientsthat measure how the taste characteristics vary with demographicsand aring is a (K 1 1) acute (K 1 1) matrix of parameters12 If we assumethat P

m ( ) is a standard multivariate normal distribution as I do inthe example below then the matrix aring allows each component of m i

to have a different variance and allows for correlation between thesecharacteristics For simplicity I assume that m i and Di are indepen-dent Equation (2) assumes that demographics affect the distributionof the coefcients in a fairly restrictive linear way For those coef-cients that are most important to the analysis (eg the coefcients onprice) relaxing the linearity assumption could have important impli-cations [for example see the results reported in Nevo (2000a b)]

As we will see below the way we model heterogeneity hasstrong implications for the results The advantage of letting the tasteparameters vary with the observed demographics Di is twofold Firstit allows us to include additional information about the distributionof demographics in the analysis Furthermore it reduces the relianceon parametric assumptions Therefore instead of letting a key elementof the method the distribution of the random coefcients be deter-mined by potentially arbitrary distributional assumptions we bringin additional information

12 To simplify notation I assume that all characteristics have random coefcientsThis need not be the case I return to this in the appendix when I discuss the detailsof estimation

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 8: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

520 Journal of Economics amp Management Strategy

The specication of the demand system is completed with theintroduction of an outside good the consumers may decide not to pur-chase any of the brands Without this allowance a homogenous priceincrease (relative to other sectors) of all the products does not changequantities purchased The indirect utility from this outsideoption is

u i0t 5 a i yi 1 raquo0t 1 frac140Di 1 frac340vi0 1 e i0t

The mean utility from the outside good raquo0t is not identied (with-out either making more assumptions or normalizing one of the insidegoods) Also the coefcients frac140 and frac340 are not identied separatelyfrom coefcients on an individual-specic constant term in equation(1) The standard practice is to set raquo0t frac140 and frac340 to zero and since theterm a iyi will eventually vanish (because it is common to all prod-ucts) this is equivalent to normalizing the utility from the outsidegood to zero

Let h 5 ( h 1 h 2) be a vector containing all the parameters of themodel The vector h 1 5 ( a b ) contains the linear parameters and thevector h 2 5 (Otilde aring) the nonlinear parameters13 Combining equations(1) and (2) we have

u ij t 5 a iyi 1 d j t(x j t pj t raquoj t h 1) 1 u ij t(xj t p j t vi Di h 2) 1 e ij t

d j t 5 x j t b a p j t 1 raquoj t u ij t 5 [ pj t xj t] (OtildeDi 1 aringm i) (3)

where [ pj t x j t] is a 1 acute (K 1 1) (row) vector The indirect utilityis now expressed as a sum of three (or four) terms The rst terma i yi is given only for consistency with equation (1) and will van-ish as we will see below The second term d j t which is referredto as the mean utility is common to all consumers Finally the lasttwo terms l ij t 1 e ij t represent a mean-zero heteroskedastic devia-tion from the mean utility that captures the effects of the randomcoefcients

Consumers are assumed to purchase one unit of the good thatgives the highest utility14 Since in this model an individual is de-

13 The reasons for the names will become apparent below14 A comment is in place here about the realism of the assumption that consumers

choose no more than one good We know that many households own more than onecar that many of us buy more than one brand of cereal and so forth We note that eventhough many of us buy more than one brand at a time less actually consume morethan one at a time Therefore the discreteness of choice can be sometimes defended bydening the choice period appropriately In some cases this will still not be enough inwhich case the researcher has one of two options either claim that the above modelis an approximation or reduced-form to the true choice model or model the choiceof multiple products or continuous quantities explicitly [as in Dubin and McFadden(1984) or Hendel (1999)]

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 9: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 521

ned as a vector of demographics and product-specic shocks(Di m i e i0t e iJ t) this implicitly denes the set of individualattributes that lead to the choice of good j Formally let this set be

A j t(x t p t d t h 2) 5copy

(Di vi e i0t e iJ t) |u ij t sup3 u ilt

l 5 0 1 Jordf

where x t 5 (xlt xJ t) cent p t 5 (p lt pJ t) cent and d t 5 ( d lt d J t) cent

are observed characteristics prices and mean utilities of all brandsrespectively The set A j t denes the individuals who choose brand j inmarket t Assuming ties occur with zero probability the market shareof the j th product is just an integral over the mass of consumers inthe region A j t Formally it is given by

sj t(x t p t d t h 2) 5 HA j t

dP (D v e )

5 HA j t

dP ( e |D m ) dP ( m |D) dP D(D)

5 HA j t

dP e ( e ) dP

m ( m ) dAtildeP D(D) (4)

where P ( ) denotes population distribution functions The secondequality is a direct application of Bayesrsquo rule while the last is a con-sequence of the independence assumptions previously made

Given assumptions on the distribution of the (unobserved) indi-vidual attributes we can compute the integral in equation (4) eitheranalytically or numerically Therefore for a given set of parametersequation (4) predicts of the market share of each product in eachmarket as a function of product characteristics prices and unknownparameters One possible estimation strategy is to choose parame-ters that minimize the distance (in some metric) between the mar-ket shares predicted by equation (4) and the observed shares Thisestimation strategy will yield estimates of the parameters that deter-mine the distribution of individual attributes but it does not accountfor the correlation between prices and the unobserved product char-acteristics The method proposed by Berry (1994) and BLP which ispresented in detail in the Section 3 accounts for this correlation

22 Distributional Assumptions

The assumptions on the distribution of individual attributes made inorder to compute the integral in equation (4) have important impli-cations for the own- and cross-price elasticities of demand In thissection I discuss some possible assumptions and their implications

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 10: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

522 Journal of Economics amp Management Strategy

Possibly the simplest distributional assumption one can make inorder to evaluate the integral in equation (4) is that consumer hetero-geneity enters the model only through the separable additive randomshock e ij t In our model this implies h 2 5 0 or b i 5 b and a i 5 a forall i and equation (1) becomes

u ij t 5 a (yi p j t) 1 x j t b 1 raquoj t 1 e ij t

i 5 1 It j 5 1 J t 5 1 T (5)

At this point before we specify the distribution of e ij t the modeldescribed by equation (5) is as general as the model given inequation (1)15 Once we assume that e ij t is iid then the implied sub-stitution patterns are severely restricted as we will see below If wealso assume that e ij t are distributed according to a Type I extreme-value distribution this is the (aggregate) logit model The marketshare of brand j in market t dened by equation (4) is

sj t 5exp(x j t b a p j t 1 raquoj t)

1 1 S Jk 5 1 exp(xkt b a pkt 1 raquokt)

(6)

Note that income drops out of this equation since it is common to alloptions

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability itrestricts the substitution patterns to depend only on the market sharesThe price elasticities of the market shares dened by equation (6) are

acutejkt 5sj t pkt

pkt sj t

5

(a pj t(1 sj t ) if j 5 k

a pktskt otherwise

There are two problems with these elasticities First since inmost cases the market shares are small the factor a (1 sj t) is nearlyconstant hence the own-price elasticities are proportional to ownprice Therefore the lower the price the lower the elasticity (in abso-lute value) which implies that a standard pricing model predicts ahigher markup for the lower-priced brands This is possible only if themarginal cost of a cheaper brand is lower (not just in absolute valuebut as a percentage of price) than that of a more expensive productFor some products this will not be true Note that this problem isa direct implication of the functional form in price If for exampleindirect utility was a function of the logarithm of price rather thanprice then the implied elasticity would be roughly constant In other

15 To see this compare equation (5) with equation (3)

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 11: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 523

words the functional form directly determines the patterns of own-price elasticity

An additional problem which has been stressed in the literatureis with the cross-price elasticities For example in the context of RTEcereals the cross-price elasticities imply that if Quaker CapN Crunch(a childernrsquos cereal) and Post Grape Nuts (a wholesome simple nutri-tion cereal) have similar market shares then the substitution fromGeneral Mills Lucky Charms (a childrenrsquos cereal) toward either ofthem will be the same Intuitively if the price of one childrenrsquos cerealgoes up we would expect more consumers to substitute to anotherchildrenrsquos cereal than to a nutrition cereal Yet the logit model restrictsconsumers to substitute towards other brands in proportion to marketshares regardless of characteristics

The problem in the cross-price elasticities comes from the iidstructure of the random shock In order to understand why this is thecase examine equation (3) A consumer will choose a product eitherbecause the mean utility from the product d j t is high or because theconsumer-specic shock l ij t 1 e ij t is high The distinction becomesimportant when we consider a change in the environment Considerfor example the increase in the price of Lucky Charms discussed inthe previous paragraph For some consumers who previously con-sumed Lucky Charms the utility from this product decreases enoughso that the utility from what was the second choice is now higherIn the logit model different consumers will have different rankings ofthe products but this difference is due only to the iid shock There-fore the proportion of these consumers who rank each brand as theirsecond choice is equal to the average in the population which is justthe market share of the each product

In order to get around this problem we need the shocks to util-ity to be correlated across brands By generating correlation we pre-dict that the second choice of consumers that decide to no longerbuy Lucky Charms will be different than that of the average con-sumer In particular they will be more likely to buy a product witha shock that was positively correlated to Lucky Charms for exampleCapN Crunch As we can see in equation (3) this correlation can begenerated either through the additive separable term e ij t or throughthe term l ij t which captures the effect the demographics Di and vi Appropriately dening the distributions of either of these terms canyield the exact same results The difference is only in modeling con-venience I now consider models of the two types

Models are available that induce correlation among options byallowing e ij t to be correlated across products rather than indepen-dently distributed are available (see the generalized extreme-valuemodel McFadden 1978) One such example is the nested logit model

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 12: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

524 Journal of Economics amp Management Strategy

in which all brands are grouped into predetermined exhaustive andmutually exclusive sets and e ij t is decomposed into an iid shockplus a group-specic component16 This implies that correlationbetween brands within a group is higher than across groups thus inthe example given above if the price of Lucky Charms goes up con-sumers are more likely to rank CapN Crunch as their second choiceTherefore consumers that currently consume Lucky Charms are morelikely to substitute towards Grape Nuts than the average consumerWithin the group the substitution is still driven by market shares ieif some childrenrsquos cereals are closer substitutes for Lucky Charms thanothers this will not be captured by the simple grouping17 The mainadvantage of the nested logit model is that like the logit model itimplies a closed form for the integral in equation (4) As we will seein Section 3 this simplies the computation

This nested logit can t into the model described by equation (1)in one of two ways by assuming a certain distribution of e ij t as wasmotivated in the previous paragraph or by assuming one of the char-acteristics of the product is a segment-specic dummy variable andassuming a particular distribution on the random coefcient of thatcharacteristic Given that we have shown that the model described byequations (1) and (2) can also be described by equation (3) it shouldnot be surprising that these two ways of describing the nested logitare equivalent Cardell (1997) shows the distributional assumptionsrequired for this equivalence to hold

The nested logit model allows for somewhat more exible sub-stitution patterns However in many cases the a priori division ofproducts into groups and the assumption of iid shocks within agroup will not be reasonable either because the division of segmentsis not clear or because the segmentation does not fully account forthe substitution patterns Furthermore the nested logit does not helpwith the problem of own-price elasticities This is usually handledby assuming some ldquonicerdquo functional form (ie yield patterns that areconsistent with some prior) but that does not solve the problem ofhaving the elasticities driven by the functional-form assumption

In some industries the segmentation of the market will be mul-tilayered For example computers can be divided into branded ver-sus generic and into frontier versus nonfrontier technology It turns

16 For a formal presentation of the nested logit model in the context of the modelpresented here see Berry (1994) or Stern (1995)

17 Of course one does not have to stop at one level of nesting For example wecould group all childrenrsquos cereals into family-acceptable and not acceptable For anexample of such grouping for automobiles see Goldberg (1995)

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 13: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 525

out that in the nested logit specication the order of the nests mat-ters18 For this reason Bresnahan et al (1997) build on a the generalextreme-value model (McFadden 1978) to construct what they call theprinciples-of-differentiation general extreme-value (PD GEV) model ofdemand for computers In their model they are able to use two dimen-sions of differentiation without ordering them With the exception ofdealing with the problem of ordering the nests this model retains allthe advantages and disadvantages of the nested logit In particular itimplies a closed-form expression for the integral in equation (4)

In principle one could consider estimating an unrestrictedvariance-covariance matrix of the shock e ij t This however reintro-duces the dimensionality problem discussed in the Introduction19 Ifin the full model described by equations (1) and (2) we maintainthe iid extreme-value distribution assumption on e ij t Correlationbetween choices is obtained through the term l ij t The correlationwill be a function of both product and consumer characteristics thecorrelation will be between products with similar characteristics andconsumers with similar demographics will have similar rankings ofproducts and therefore similar substitution patterns Therefore ratherthan having to estimate a large number of parameters correspond-ing to an unrestricted variance-covariance matrix we only have toestimate a smaller number

The price elasticities of the market shares sj t dened byequation (4) are

acutejkt 5sj t pkt

pkt sj t

5

( pj t

sj tH a i sij t(1 sij t) d AtildeP

D(D) dP m (v) if j 5 k

pkt

sj tH a isij tsikt d AtildeP

D(D) dP m (v) otherwise

where sij t 5 exp( d j t 1 l ij t) [1 1 S Kk 5 1 exp( d kt 1 l ikt)] is the probability of

individual i purchasing product j Now the own-price elasticity willnot necessary be driven by the functional form The partial derivativeof the market shares will no longer be determined by a single param-eter a Instead each individual will have a different price sensitivitywhich will be averaged to a mean price sensitivity using the individ-ual specic probabilities of purchase as weights The price sensitivity

18 So for example classifying computers rst into brandednonbranded and theninto frontiernonfrontier technology implies different substitution patterns than clas-sifying rst into frontiernonfrontier technology and then into brandednonbrandedeven if the classication of products does not change

19 See Hausman and Wise (1978) for an example of such a model with a smallnumber of products

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 14: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

526 Journal of Economics amp Management Strategy

will be different for different brands So if for example consumers ofKelloggrsquos Corn Flakes have high price sensitivity then the own-priceelasticity of Kelloggrsquos Corn Flakes will be high despite the low pricesand the fact that prices enter linearly Therefore substitution patternsare not driven by functional form but by the differences in the pricesensitivity or the marginal utility from income between consumersthat purchase the various products

The full model also allows for exible substitution patternswhich are not constrained by a priori segmentation of the market (yetat the same time can take advantage of this segmentation by includinga segment dummy variable as a product characteristic) The compos-ite random shock l ij t 1 e ij t is then no longer independent of theproduct and consumer characteristics Thus if the price of a brandgoes up consumers are more likely to switch to brands with similarcharacteristics rather than to the most popular brand

Unfortunately these advantages do not come without cost Esti-mation of the model specied in equation (3) is not as simple as thatof the logit nested logit or GEV models There are two immediateproblems First equation (4) no longer has an analytic closed form[like that given in equation (6) for the Logit case] Furthermore thecomputation of the integral in equation (4) is difcult This problemis solved using simulation methods as described below Second wenow require information about the distribution of consumer hetero-geneity in order to compute the market shares This could come inthe form of a parametric assumption on the functional form of thedistribution or by using additional data sources Demographics of thepopulation for example can be obtained by sampling from the CPS

3 Estimation

31 The Data

The market-level data required to consistently estimate the model pre-viously described consists of the following variables market sharesand prices in each market and brand characteristics In additioninformation on the distribution of demographics AtildeP

D is useful20 asare marketing mix variables (such as advertising expenditures or theavailability of coupons or promotional activity) In principle someof the parameters of the model are identied even with data on one

20 Recall that we divided the demographic variables into two types The rst werethose variables for which we had some information regarding the distribution If suchinformation is not available we are left with only the second type ie variables forwhich we assume a parametric distribution

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 15: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 527

market However it is highly recommended to gather data on sev-eral markets with variation in relative prices of the products andorproducts offered

Market shares are dened using a quantity variable whichdepends on the context and should be determined by the specicsof the problem BLP use the number of automobiles sold while Nevo(2000a b) converts pounds of cereal into servings Probably the mostimportant consideration in choosing the quantity variable is the needto dene a market share for the outside good This share will rarelybe observed directly and will usually be dened as the total size ofthe market minus the shares of the inside goods The total size of themarket is assumed according to the context So for example Nevo(2000a b) assumes the size of the market for ready-to-eat cereal to beone serving of cereal per capita per day Bresnahan et al (1997) inestimating demand for computers take the potential market to be thetotal number of ofce-based employees

In general I found the following rules useful when dening themarket size You want to make sure to dene the market large enoughto allow for a nonzero share of the outside good When looking at his-torical data one can use eventual growth to learn about the potentialmarket size One should check the sensitivity of the results to themarket denition if the results are sensitive consider an alternativeThere are two parts to dening the market size choosing the variableto which the market size is proportional and choosing the propor-tionality factor For example one can assume that the market size isproportional to the size of the population with the proportionalityfactor equal to a constant factor which can be estimated (Berry et al1996) From my own (somewhat limited) experience getting the rightvariable from which to make things proportional is the harder andmore important component of this process

An important part of any data set required to implement themodels described in Section 2 consists of the product characteristicsThese can include physical product characteristics and market seg-mentation information They can be collected from manufacturerrsquosdescriptions of the product the trade press or the researcherrsquos priorIn collecting product characteristics we recall the two roles they playin the analysis explaining the mean utility level d ( ) in equation (3)and driving the substitution patterns through the term l ( ) inequation (3) Ideally these two roles should be kept separate If thenumber of markets is large enough relative to the number of prod-ucts the mean utility can be explained by including product dummyvariables in the regression These variables will absorb any productcharacteristics that are constant across markets A discussion of theissues arising from including brand dummy variables is given below

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 16: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

528 Journal of Economics amp Management Strategy

Relying on product dummy variables to guide substitution pat-terns is equivalent to estimating an unrestricted variance-covariancematrix of the random shock e ij t in equation (1) Both imply estimat-ing J (J 1) 2 parameters Since part of our original motivation wasto reduce the number of parameters to be estimated this is usuallynot a feasible option The substitution patterns are explained by theproduct characteristics and in deciding which attributes to collect theresearcher should keep this in mind

The last component of the data is information regarding thedemographics of the consumers in different markets Unlike marketshares prices or product characteristics this estimation can proceedwithout demographic information In this case the estimation will relyon assumed distributional assumptions rather than empirical distribu-tions The Current Population Survey (CPS) is a good widely avail-able source for demographic information

32 Identi cation

This section discusses informally some of the identication issuesThere are several layers to the argument First I discuss how in gen-eral a discrete choice model helps us identify substitution patternsusing aggregate data from (potentially) a small number of marketsSecond I ask what in the data helps us distinguish between differentdiscrete choice modelsmdashfor example how we can distinguish betweenthe logit and the random-coefcients logit

A useful starting point is to ask how one would approach theproblem of estimating price elasticities if a controlled experimentcould be conducted The answer is to expose different consumers torandomly assigned prices and record their purchasing patterns Fur-thermore one could relate these purchasing patterns to individualcharacteristics If individual purchases could not be observed theexperiment could still be run by comparing the different aggregatepurchases of different groups Once again in principle these patternscould be related to the difference in individual characteristics betweenthe groups

There are two potential problems with mapping the data des-cribed in the previous section into the data that arises from the idealcontrolled experiment described in the previous paragraph Firstprices are not randomly assigned rather they are set by prot-maxim-izing rms that take into account information that due to inferiorknowledge the researcher has to include in the error term This prob-lem can be solved at least in principle by using instrumentalvariables

The second somewhat more conceptual difculty arises becausediscrete choice models for example the logit model can be estimated

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 17: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 529

using data from just one market hence we are not mimicking theexperiment previously described Instead in our experiment we askconsumers to choose between products which are perceived as bun-dles of attributes We then reveal the preferences for these attributesone of which is price The data from each market should not be seenas one observation of purchases when faced with a particular pricevector rather it is an observation on the relative likelihood of pur-chasing J different bundles of attributes The discrete choice modelties these probabilities to a utility model that allows us to computeprice elasticities The identifying power of this experiment increasesas more markets are included with variation both in the characteristicsof products and in the choice set The same (informal) identicationargument holds for the nested logit GEV and random-coefcientsmodels which are generalized forms of the logit model

There are two caveats to the informal argument previouslygiven If one wants to tie demographic variables to observed pur-chases [ie allow for OtildeDi in equation (2)] several markets withvariation in the distribution of demographics have to be observedSecond if not all the product characteristics are observed and theseunobserved attributes are correlated with some of the observed char-acteristics then we are faced with an endogeneity problem The prob-lem can be solved by using instrumental variables but we note thatthe formal requirements from these instrumental variables dependon what we believe goes into the error term In particular if brand-specic dummy variables are included we will need the instrumentalvariables to satisfy different requirements I return to this point inSection 34

A different question is What makes the random-coefcients logitrespond differently to product characteristics In other words whatpins down the substitution patterns The answer goes back to the dif-ference in the predictions of the two models and can be best explainedwith an example Suppose we observe three products A B and CProducts A and B are very similar in their characteristics while prod-ucts B and C have the same market shares Suppose we observe mar-ket shares and prices in two periods and suppose the only changeis that the price of product A increases The logit model predicts thatthe market shares of both products B and C should increase by thesame amount On the other hand the random-coefcients logit allowsfor the possibility that the market share of product B the one moresimilar to product A will increase by more By observing the actualrelative change in the market shares of products B and C we can dis-tinguish between the two models Furthermore the degree of changewill allow us to identify the parameters that govern the distributionof the random coefcients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 18: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

530 Journal of Economics amp Management Strategy

This argument suggests that having data from more marketshelps identify the parameters that govern the distribution of the ran-dom coefcients Furthermore observing the change in market sharesas new products enter or as characteristics of existing products changeprovides variation that is helpful in the estimation

33 The Estimation Algorithm

In this subsection I outline how the parameters of the models des-cribed in Section 2 can be consistently estimated using the data des-cribed in Section 31 Following Berryrsquos (1994) suggestion a GMMestimator is constructed Given a value of the unknown parametersthe implied error term is computed and interacted with the instru-ments to form the GMM objective function Next a search is per-formed over all the possible parameter values to nd those valuesthat minimize the objective function In this subsection I discuss whatthe error term is how it can be computed and some computationaldetails Discussion of the instrumental variables is deferred to the nextsection

As previously pointed out a straightforward approach to theestimation is to solve

Minh

d s(x p d (x p raquo h 1) h 2) S d (7)

where s( ) are the market shares given by equation (4) and S are theobserved market shares However this approach is usually not takenfor several reasons First all the parameters enter the minimizationin equation (7) in a nonlinear fashion In some applications the inclu-sion of brand and time dummy variables results in a large number ofparameters and a costly nonlinear minimization problem The estima-tion procedure suggested by Berry (1994) which is described belowavoids this problem by transforming the minimization problem so thatsome (or all) of the parameters enter the objective function linearly

Fundamentally though the main contribution of the estimationmethod proposed by Berry (1994) is that it allows one to deal withcorrelation between the (structural) error term and prices (or othervariables that inuence demand) As we saw in Section 21 there areseveral variables that are unobserved by the researcher These includethe individual-level characteristics denoted (Di vi e i) as well as theunobserved product characteristics raquoj As we saw in equation (4)the unobserved individual attributes (Di vi e i ) were integrated overTherefore the econometric error term will be the unobserved productcharacteristics raquoj t Since it is likely that prices are correlated with thisterm the econometric estimation will have to take account of this The

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 19: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 531

standard nonlinear simultaneous-equations model [see for exampleAmemiya (1985 Chapter 8)] allows both parameters and variables toenter in a nonlinear way but requires a separable additive error termEquation (7) does not meet this requirement The estimation methodproposed by Berry (1994) and described below shows how to adaptthe model described in the previous section to t into the standard(linear or) nonlinear simultaneous-equations model

Formally let Z 5 [z1 zM ] be a set of instruments such that

E[Zm x ( h ) ] 5 0 m 5 1 M (8)

where x a function of the model parameters is an error term denedbelow and h denotes the ldquotruerdquo values of the parameters The GMMestimate is

Atildeh 5 argminh

x ( h ) cent Z F 1Z cent x ( h ) (9)

where F is a consistent estimate of E[Z cent x x cent Z] The logic driving thisestimate is simple enough At the true parameter value h the pop-ulation moment dened by equation (8) is equal to zero So wechoose our estimate such that it sets the sample analog of the momentsdened in equation (8) ie Z cent Atildex to zero If there are more indepen-dent moment equations than parameters [ie dim(Z) gt dim( h )] wecannot set all the sample analogs exactly to zero and will have to setthem as close to zero as possible The weight matrix F 1 denes themetric by which we measure how close to zero we are By using theinverse of the variance-covariance matrix of the moments we giveless weight to those moments (equations) that have a higher variance

Following Berry (1994) the error term is not dened as the differ-ence between the observed and predicted market shares as inequation (7) rather it is dened as the structural error raquoj t The advan-tage of working with a structural error is that the link to economictheory is tighter allowing us to think of economic theories that wouldjustify various instrumental variables

In order to use equation (9) we need to express the error termas an explicit function of the parameters of the model and the dataThe key insight which can be seen in equation (3) is that the errorterm raquoj t only enters the mean utility level d ( ) Furthermore themean utility level is a linear function of raquoj t thus in order to obtain anexpression for the error term we need to express the mean utility as a

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 20: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

532 Journal of Economics amp Management Strategy

linear function of the variables and parameters of the model In orderto do this we solve for each market the implicit system of equations

s( d t h 2) 5 S t t 5 1 T (10)

where s( ) are the market shares given by equation (4) and S are theobserved market shares

The intuition for why we want to do this is given below Insolving this system of equations we have two steps First we need away to compute the left-hand side of equation (10) which is denedby equation (4) For some special cases of the general model (eg logitnested logit and PD GEV) the market-share equation has an analyticformula For the full random-coefcients model the integral deningthe market shares has to be computed by simulation There are severalways to do this Probably the most common is to approximate theintegral given by equation (4) by

sj t (p t x t d t Pns h 2)

51ns

ns

Si 5 1

sj ti 51ns

ns

Si 5 1

acuteexp d j t 1 S K

k 5 1 x kj t(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

1 1 S Jm 5 1 exp d mt 1 S K

k 5 1 x kmt(frac34kv

ki 1 frac14k1Di1 1 1 frac14kdDid)

(11)

where ( m 1i m K

i ) and (Di1 Did) i 5 1 ns are draws fromAtildeP m (v) and P

D(D) respectively while xkj t k 5 1 K are the variables

that have random slope coefcients Note the we use the extreme-valuedistribution P

e ( e ) to integrate the e cent s analytically Issues regardingsampling from P

m v and AtildeP D(D) alternative methods to approximate

the market shares and their advantages are discussed in detail in theappendix (available from httpelsaberkeleyedu ~ nevo)

Second using the computation of the market share we invert thesystem of equations For the simplest special case the logit modelthis inversion can be computed analytically by d j t 5 ln Sj t ln S0t where S0t is the market share of the outside good Note that it is theobserved market shares that enter this equation This inversion canalso be computed analytically in the nested logit model (Berry 1994)and the PD GEV model (Bresnahan et al 1997)

For the full random-coefcients model the system of equations(10) is nonlinear and is solved numerically It can be solved by using

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 21: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 533

the contraction mapping suggested by BLP (see there for a proof ofconvergence) which amounts to computing the series

d h 1 1 t 5 d h

t 1 ln S t ln S(p t x t d h t Pns h 2)

t 5 1 T h 5 0 H (12)

where s( ) are the predicted market shares computed in the rst stepH is the smallest integer such that | | d H

t d H 1 t | | is smaller than some

tolerance level and d H t is the approximation to d t

Once the inversion has been computed either analytically ornumerically the error term is dened as

x j t 5 d j t(S t h 2) (xj t b 1 a pj t) ordm raquoj t (13)

Note that it is the observed market shares S that enter this equationAlso we can now see the reason for distinguishing between h 1 andh 2 h 1 enters this term and the GMM objective in a linear fashionwhile h 2 enters nonlinearly

The intuition to the denition is as follows For given values ofthe nonlinear parameters h 2 we solve for the mean utility levels d t( ) that set the predicted market shares equal to the observed marketshares We dene the residual as the difference between this valuationand the one predicted by the linear parameters a and b The estimatordened by equation (9) is the one the minimizes the distance betweenthese different predictions

Usually21 the error term as dened by equation (13) is theunobserved product characteristic raquoj t However if enough marketsare observed then brand-specic dummy variables can be included asproduct characteristics The coefcients on these dummy variable cap-ture both the mean quality of observed characteristics that do not varyover markets b xj and the overall mean of the unobserved characteris-tics raquoj Thus the error term is the market-specic deviation from themain valuation ie D raquoj t ordm raquoj t raquoj The inclusion of brand dummyvariables introduces a challenge in estimating the taste parameters b which is dealt with below

In the logit and nested logit models with the appropriate choiceof a weight matrix22 this procedure simplies to two-stage leastsquares In the full random-coefcients model both the computationof the market shares and the inversion in order to get d j t( ) have tobe done numerically The value of the estimate in equation (9) is thencomputed using a nonlinear search This search is simplied by noting

21 See for example Berry (1994) BLP Berry et al (1996) and Bresnahan et al (1997)22 That is F 5 Z cent Z which is the ldquooptimalrdquo weight matrix under the assumption

of homoskedastic errors

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 22: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

534 Journal of Economics amp Management Strategy

that the rst-order conditions of the minimization problem dened inequation (9) with respect to h 1 are linear in these parameters There-fore these linear parameters can be solved for (as a function of theother parameters) and plugged into the rest of the rst-order condi-tions limiting the nonlinear search to the nonlinear parameters only

The details of the computation are given in the appendix

34 Instruments

The identifying assumption in the algorithm previously given isequation (8) which requires a set of exogenous instrumental variablesAs is the case with many hard problems there is no global solutionthat applies to all industries and data sets Listed below are some ofthe solutions offered in the literature A precise discussion of howappropriate each set of assumptions has to be done on a case-by-casebasis but several advantages and problems are mentioned below

The rst set of variables that comes to mind are the instrumen-tal variables dened by ordinary (or nonlinear) least squares namelythe regressors (or more generally the derivative of the moment func-tion with respect to the parameters) As previously discussed thereare several reasons why these are invalid For example a variety ofdifferentiated-products pricing models predict that prices are a func-tion of marginal cost and a markup term The markup term is a func-tion of the unobserved product characteristic which is also the errorterm in the demand equation Therefore prices will be correlated withthe error term and the estimate of the price sensitivity will be biased

A standard place to start the search for demand-side instrumen-tal variables is to look for variables that shift cost and are uncorre-lated with the demand shock These are the textbook instrumentalvariables which work quite well when estimating demand for homo-geneous products The problem with the approach is that we rarelyobserve cost data ne enough that the cost shifters will vary by brandA restricted version of this approach uses whatever cost information isavailable in combination with some restrictions on the demand spec-ication [for example see the cost variables used in Nevo (2000a)]Even the restricted version is rarely feasible due to lack of anycost data

The most popular identifying assumption used to deal with theabove endogeneity problem is to assume that the location of productsin the characteristics space is exogenous or at least determined priorto the revelation of the consumersrsquo valuation of the unobserved prod-uct characteristics This assumption can be combined with a specicmodel of competition and functional-form assumptions to generatean implicit set of instrumental variables (as in Bresnahan 1981 1987)

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 23: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 535

BLP derive a slightly more explicit set of instrumental variables whichbuild on a similar economic assumption They use the observed prod-uct characteristics (excluding price and other potentially endogenousvariables) the sums of the values of the same characteristics of otherproducts offered by that rm (if the rm produces more than oneproduct) and the sums of the values of the same characteristics ofproducts offered by other rms23

Instrumental variables of this type have been quite successfulin the study of many industries including automobiles computersand pharmaceutical drugs One advantage of this approach is that theinstrumental variables vary by brand The main problem is that insome cases the assumption that observed characteristics are uncorre-lated with the unobserved components is not valid One example iswhen certain types of products are better characterized by observedattributes Another example is if the time required to change theobserved characteristics is short and therefore changes in character-istics could be reacting to the same sort of shocks as prices Finallyonce a brand dummy variable is introduced a problem arises withthese instrumental variables unless there is variation in the productsoffered in different markets there is no variation between markets inthese instruments

The last set of instrumental variables I discuss here was intro-duced by Hausman et al (1994) and Hausman (1996) and was used inthe context of the model described here by Nevo (2000a b) The essen-tial ideal is to exploit the panel structure of the data This argumentis best demonstrated by an example Nevo (2000a b) observes quan-tities and prices for ready-to-eat cereal in a cross section of cities overtwenty quarters Following Hausman (1996) the identifying assump-tion made is that controlling for brand-specic intercepts and demo-graphics the city-specic valuations of the product D raquoj t 5 raquoj t raquoj are independent across cities but are allowed to be correlated withina city over time Given this assumption the prices of the brand inother cities are valid instruments prices of brand j in two cities willbe correlated due to the common marginal cost but due to the inde-pendence assumption will be uncorrelated with the market-specicvaluation of the product

There are several plausible situations in which the independenceassumption will not hold Suppose there is a national (or regional)demand shock for example discovery that ber may reduce the risk

23 Just to be sure suppose the product has two characteristics horsepower (HP)and size (S) and assume there are two rms producing three products each Then wehave six instrumental variables The values of HP and S for each product the sum ofHP and S for the rmrsquos other two products and the sum of HP and S for the threeproducts produced by the competition

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 24: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

536 Journal of Economics amp Management Strategy

of cancer This discovery will increase the unobserved valuation ofall ber-intensive cereal brands in all cities and the independenceassumption will be violated Alternatively suppose one believes thatlocal advertising and promotions are coordinated across city bordersand that these activities inuence demand Then the independenceassumption will be violated

The extent to which the assumptions needed to support anyof the above instrumental variables are valid in any given situationis an empirical issue Resolving this issue beyond any reasonabledoubt is difcult and requires comparing results from several setsof instrumental variables combing additional data sources and usingthe researcherrsquos knowledge of the industry

35 Brand-Speci c Dummy Variables

As previously pointed out I believe that brand-specic xed effectsshould be used whenever possible There are at least two good reasonsto include these dummy variables First in any case where we areunsure that the observed characteristics capture the true factors thatdetermine utility xed effects should be included in order to improvethe t of the model We note that this helps t the mean utility leveld j while substitution patterns are driven by observed characteristics(either physical characteristics or market segmentation) as is the caseif we do not include a brand xed effect

Furthermore the major motivation (Berry 1994) for the estima-tion scheme previously described is the need to instrument for thecorrelation between prices and the unobserved quality of the prod-uct raquoj t A brand-specic dummy variable captures the characteris-tics that do not vary by market and the product-specic mean ofunobserved components namely xj b 1 raquoj Therefore the correlationbetween prices and the brand-specic mean of unobserved quality isfully accounted for and does not require an instrument In order tointroduce a brand dummy variable we require observations on morethan one market However even without brand dummy variables t-ting the model using observations from a single market is difcult(see BLP footnote 30)

Once brand dummy variables are introduced the error term isno longer the unobserved characteristics Rather it is the market-specic deviation from this unobserved mean This additional vari-ance was not introduced by the dummy variables it is present in allmodels that use observations from more than one market The use ofbrand dummy variables forces the researcher to discuss this additionalvariance explicitly

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 25: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 537

There are two potential objections to the use of brand dummyvariables First as previously mentioned a major difculty in esti-mating demand in differentiated product markets is that the numberof parameters increases proportionally to the square of the numberof products The main motivation for the use of discrete choice mod-els was to reduce this dimensionality problem Does the introductionof parameters that increase in proportion to the number of brandsdefeat the whole purpose No The number of parameters increasesonly with J (the number of brands) and not J 2 Furthermore the branddummy variables are linear parameters and do not increase the com-putational difculty If the number of brands is large the size of thedesign matrix might be problematic but given the computing powerrequired to run the full model this is unlikely to be a serious difculty

A more serious objection to the use of brand dummy variablesis that the taste coefcients b cannot be identied Fortunately this isnot true The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain 1982) Let d 5 (d1 dj ) cent

denote the J acute 1 vector of brand dummy coefcients X be the J acuteK (K lt J ) matrix of product characteristics that are xed across mar-kets and raquo 5 (raquo1 raquoj ) cent be the J acute 1 vector of unobserved productqualities Then from equation (1)

d 5 X b 1 raquo

If we assume that E[raquo | X] 5 0 24 then the estimates of b and raquo are

Atildeb 5 X cent V 1d X

1X cent V 1

dAtilded Atilderaquo 5 Atilded X Atildeb

where Atilded is the vector of coefcients estimated from the proceduredescribed in the previous section and Vd is the variance-covariancematrix of these estimates This is simply a GLS regression where theindependent variable consists of the estimated brand effects estimatedusing the GMM procedure previously described and the full sam-ple The number of ldquoobservationsrdquo in this regression is the numberof brands The correlation in the values of the dependent variableis treated by weighting the regression by the estimated covariancematrix Vd which is the estimate of this correlation The coefcientson the brand dummy variables provide an unrestricted estimate ofthe mean utility The minimum-distance procedure project these esti-mate onto a lower-dimensional space which is implied by a restricted

24 Note that this is the assumption required to justify the use of observed productcharacteristics as instrumental variables Here however this assumption is only usedto recover the taste parameters If one is unwilling to make it the price sensitivity canstill be recovered using the other assumptions discussed in the previous section

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 26: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

538 Journal of Economics amp Management Strategy

model that sets raquo to zero Chamberlain (1982) provides a chi-squaretest to evaluate these restrictions

4 AN APPLICATION

In this section I briey present the type of estimates one can obtainfrom the random-coefcients Logit model The data used for thisdemonstration were motivated by real scanner data (from the ready-to-eat industry) However the data is not real and should not be usedfor any analysis The focus is on providing a data set that can eas-ily be used to learn the method Therefore I estimate a somewhatrestricted version of the model (with a limited amount of data) For amore detailed and realistic use of the models presented here see eitherBLP or Nevo (2000a b) The data set used to generate the results andthe Matlab code used to perform the computation is available fromhttpelsaberkeleyedu ~ nevo

The data used for the analysis below consists of quantity andprices for 24 brands of a differentiated product in 47 cities over 2 quar-ters The data was generated from a model of demand and supply25

The marginal cost and the parameters required to simulate this modelwere motivated by the estimates of Nevo (2000b) I use two prod-uct characteristics Sugar which measures sugar content and Mushya dummy variable equal to one if the product gets soggy in milkDemographics were drawn from the Current Population Survey Theyinclude the log of income (Income) the log of income squared(Income Sq) Age and Child a dummy variable equal to one if theindividual is less than sixteen The unobserved demographics vi weredrawn from a standard normal distribution For each market I draw20 individuals [ie ns 5 20 in equation (11)]

The results of the estimation can be found in Table I The meansof the distribution of marginal utilities ( b cent s) are estimated by aminimum-distance procedure described above and presented in therst column The results suggest that for the average consumer moresugar increases the utility from the product Estimates of heterogene-ity around these means are presented in the next few columns Thecolumn labeled ldquoStandard Deviationsrdquo captures the effects of theunobserved demographics The effects are insignicant both econom-ically and statistically I return to this below The last four columnspresent the effect of demographics on the slope parameters The pointestimates are economically signicant I return to statistical signi-cance below The estimates suggest that while the average consumer

25 The demand model of Section 2 was used Supply was modeled using a standardmultiproduct-rms differentiated-product Bertrand model

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 27: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 539

TA

BL

EI

Re

su

lts

fr

om

the

Fu

ll

Mo

de

l

Mea

nsSt

and

ard

Inte

ract

ions

wit

hD

emog

rap

hic

Var

iabl

es

Dev

iati

ons

Var

iabl

eb

frac34In

com

eIn

com

eSq

Age

Chi

ld

Pri

ce32

433

184

816

598

065

9mdash

116

25( 7

743

)( 1

075

)( 1

723

34)

( 89

55)

( 52

07)

Con

stan

t1

841a

037

73

089

mdash1

186

mdash( 0

258

)( 0

129

)( 1

213

)( 1

016

)

Suga

r0

148a

000

40

193

mdash0

029

mdash( 0

258

)( 0

012

)( 0

005

)( 0

036

)

Mu

shy

078

8a0

081

146

8mdash

151

4mdash

( 00

13)

( 02

05)

( 06

97)

( 11

03)

GM

Mob

ject

ive

149

(Deg

rees

offr

eed

om)

(7)

Bas

edon

2256

obse

rvat

ion

sE

xcep

tw

her

en

oted

p

aram

eter

sar

eG

MM

esti

mat

es

All

regr

essi

ons

incl

ud

ebr

and

and

tim

ed

um

my

vari

able

sA

sym

pto

tica

lly

robu

stst

and

ard

erro

rsar

egi

ven

inp

aren

thes

es

aE

stim

ates

from

am

inim

um

-dis

tan

cep

roce

du

re

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 28: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

540 Journal of Economics amp Management Strategy

might like a soggy cereal the marginal valuation of sogginess de-creases with age and income In other words adults are less sensitiveto the crispness of a cereal as are the wealthier consumers The dis-tribution of the Mushy coefcient can be seen in Figure 1 Most of theconsumers value sogginess in a positive way but approximately 31of consumers actually prefer a crunchy cereal

The mean price coefcient is negative Coefcients on the inter-action of price with demographics are economically signicant whilethe estimate of the standard deviation suggests that most of the het-erogeneity is explained by the demographics (an issue we shall returnto below) Children and above-average-income consumers tend to beless price-sensitive The distribution of the individual price sensitivitycan be seen in Figure 2 It does not seem to be normal which is aresult of the empirical distribution of demographics In principle thetail of the distribution can reach positive valuesmdashimplying that thehigher the price the higher the utility However this is not the casefor these results

Most of the coefcients are not statistically signicant This isdue for the most part to the simplications I made in order to makethis example more accessible If the focus were on the real use ofthese estimates their efciency could be greatly improved by usingmore data increasing the number of simulations (ns) improving thesimulation methods (see the appendix on the web) or adding a supplyside (see Section 5) For now I focus on the economic signicance

FIGURE 1 FREQUENCY DISTRIBUTION OF TASTE FORSOGGINESS

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 29: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 541

FIGURE 2 FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above all the estimates of the standard deviationsare economically insignicant26 suggesting that the heterogeneity inthe coefcients is mostly explained by the included demographicsA measure of the relative importance of the demographics and ran-dom shocks can be obtained from the ratios of the variance explainedby the demographics to the total variation in the distribution of theestimated coefcients these are over 90 This result is somewhatat odds with previous work27 The results here do not suggest thatobserved demographics can explain all heterogeneity they only sug-gest that the data rejects the assumed normal distribution

An alternative explanation for this result has to do with thestructure of the data used here Unlike other work (for example BLP)by construction I have no variation across markets in the choice set AsI mentioned in Section 32 this sort of variation in the choice set helpsidentify the variance of the random shocks This explanation how-ever does not explain why the point estimates are low (as opposedto the standard errors being high) and why the effect of demographicvariables is signicant

Table II presents a sample of estimated own- and cross-priceelasticities Each entry i j where i indexes row and j column givesthe elasticity of brand i with respect to a change in the price of j Sincethe model does not imply a constant elasticity this matrix will depend

26 Unlike the interactions with demographics even after taking measures toimprove the efciency of the estimates they will still stay statistically insignicant

27 Rossi et al (1996) nd that using previous purchasing history helps explainheterogeneity above and beyond what is explained by demographics alone Berry et al(1998) reach a similar conclusion using second-choice data

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 30: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

542 Journal of Economics amp Management StrategyTA

BL

EII

M

ed

ian

Ow

n-

an

dC

ro

ss

-Pr

ice

El

astic

itie

s

Ela

stic

ity

Cha

ract

eris

tics

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

ndB

rand

Bra

nd

Bra

ndSu

gar

Mus

hy1

23

45

1415

1924

12

12

1689

005

620

2429

011

610

0645

005

960

0611

007

590

2375

218

10

1119

345

050

1302

009

240

1905

023

050

2606

043

800

1165

34

10

2715

006

742

9208

010

830

0709

006

270

0684

009

440

2237

43

00

1868

007

360

1501

327

090

1470

015

570

1564

010

370

2670

512

00

0965

015

400

0947

013

265

5116

028

520

3085

029

880

1715

614

00

0699

016

420

0698

011

230

2518

036

280

3971

035

130

1369

73

10

2828

006

060

2390

010

350

0648

005

630

0598

008

650

2225

84

00

1743

007

890

1476

016

220

1524

016

200

1656

011

800

2555

914

00

0735

017

040

0717

011

280

2596

035

860

3966

034

430

1333

101

00

2109

006

430

1682

016

650

1224

011

990

1298

008

170

2809

1111

00

0967

015

560

0890

013

770

2426

033

220

3427

028

990

1769

124

10

2711

006

910

2339

011

210

0735

006

620

0704

009

850

2260

133

10

2827

005

970

2390

009

830

0632

005

540

0582

008

660

2189

1413

00

0834

018

250

0786

012

470

2697

383

930

3922

033

800

1600

1513

00

0889

017

630

0814

012

560

2491

035

424

3982

033

830

1582

1616

10

1430

022

820

1523

008

730

1447

014

420

1702

032

700

1264

1710

00

1138

013

990

1005

014

660

2172

028

670

2912

024

980

1875

183

00

1900

007

170

1542

016

560

1362

015

080

1578

010

260

2660

1920

10

0944

026

910

1130

007

710

2163

026

180

3107

333

350

1013

207

00

1420

009

990

1270

015

460

1791

020

740

2080

016

280

2251

2114

00

0773

017

060

0741

011

600

2534

035

730

3794

035

300

1416

226

00

1500

009

500

1283

016

020

1797

020

960

2085

015

300

2311

2312

00

0854

015

610

0815

013

070

2559

034

840

3753

031

540

1608

240

00

2249

005

480

1731

016

190

1110

010

440

1114

007

334

1215

Out

sid

ego

od0

0303

024

330

0396

005

520

2279

033

490

3716

038

960

0399

Cel

len

trie

si

jw

her

ei

ind

exes

row

and

jco

lum

ng

ive

the

per

cen

tch

ange

inm

arke

tsh

are

ofbr

and

iw

ith

aon

e-p

erce

nt

chan

gein

pri

ceof

jE

ach

entr

yre

pre

sent

sth

em

edia

nof

the

elas

tici

ties

from

the

94m

arke

ts

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 31: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 543

on the values of the variables used to evaluate it Rather than choosinga particular value (say the average or a value at a particular market)I present the median of each entry over the 94 markets in the sampleThe results demonstrate how the substitution patterns are determinedin this model Products with similar characteristics will have largersubstitution patterns all else equal For example brands 14 and 15have identical observed characteristics and therefore their cross-priceelasticities are essentially identical

A diagnostic of how far the results are from the restrictive formimposed by the logit model is given by examining the variation inthe cross-price elasticities in each column As discussed in Section 2the logit model restricts all elasticities within a column to be equalTherefore an indicator of how well the model has overcome theserestrictions is to examine the variation in the estimated elasticitiesOne such measure is given by examining the ratio of the maximumto the minimum cross-price elasticity within a column (the logit modelimplies that all cross-price elasticities within a column are equal andtherefore a ratio of one) This ratio varies from 9 to 3 Not only doesthis tell us the results have overcome the logit restrictions but moreimportantly it suggests for which brands the characteristics do notseem strong enough to overcome the restrictions This test thereforesuggests which characteristics we might want to add28

5 CONCLUDING REMARKS

This paper has carefully discussed recent developments in methodsof estimating random-coefcients (mixed) logit models The emphasiswas on simplifying the exposition and as a result several possibleextensions were not discussed I briey mention these now

51 Supply Side

In the above presentation the supply side was used only in orderto motivate the instrumental variables it was not fully specied andestimated In some cases we will want to fully specify a supply rela-tionship and estimate it jointly with the demand-side equations (forexample see BLP) This ts into the above model easily by addingmoment conditions to the GMM objective function The increase incomputational and programming complexity is small for standardstatic supply-side models As usual estimating demand and supply

28 A formal specication test of the logit model [in the spirit of Hausman andMcFadden (1984)] is the test of the hypothesis that all the nonlinear parameters arejointly zero This hypothesis is easily rejected

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 32: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

544 Journal of Economics amp Management Strategy

jointly has the advantage of increasing the efciency of the estimatesat the cost of requiring more structure The cost and benets are spe-cic to each application and data set

52 Consumer-Level Data

This paper has assumed that the researcher does not observe the pur-chase decisions of individuals There are many cases where this is nottrue In cases where only consumer data is observed usually estima-tion is conducted using either maximum likelihood or the simulatedmethod of moments [for recent examples and details see Goldberg(1995) Rossi et al (1996) McFadden and Train (2000) or Shum (1999)]The method discussed here can be applied in such cases by using theconsumer-level data to estimate the mean utility level d j t The esti-mated mean utility levels can now be treated in a similar way to themean utility levels computed from the inversion of the aggregate mar-ket shares Care has to be taken when computing the standard errorssince the mean utility levels are now measured with error

In most studies that use consumer-level data the correlationbetween the regressors and the error term which was the main moti-vation for the method discussed here is usually ignored [one notableexception is Villas-Boas and Winer (1999)] This correlation might stillbe present for at least two reasons First even though consumers takeprices and other product characteristics as given their optimal choicefrom a menu of offerings could imply that econometric endogeneitymight still exist (Kennan 1989) Second unless enough control vari-ables are included common unobserved characteristics raquoj t could stillbias the estimates The method proposed here could in principle dealwith the latter problem

Potentially one could observe both consumer and aggregatedata In such cases the analysis proposed here could be enrichedPetrin (1999) observes in addition to the aggregate market sharesof automobile models the probability of purchase by consumers ofdifferent demographic groups He uses this information in the formof additional moment restriction (thus forcing the estimated proba-bilities of purchase to predict the observed probabilities) Althoughtechnically somewhat different the idea is similar to using multipleobservations on the same product in different markets (ie differentdemographic groups) Berry et al (1998) generalize this strategy by t-ting three sets of moments to their sample counterparts (1) the marketshares as above (2) the covariance of the product characteristics andthe observed demographics and (3) the covariance of rst and secondchoice (they have a survey that describes what the consumerrsquos secondchoice was) As Berry et al (1998) point out the algorithm they use is

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 33: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 545

very similar to the one they introduced in BLP which was the basisfor the discussion above

53 Alternative Methods

An alternative to the discrete-choice methods discussed here is a mul-tilevel demand model The essential idea is to use aggregation andseparability assumptions to justify different levels of demand [seeGorman (1959 1971) or Deaton and Muellbauer (1980) and refer-ences therein] Originally these methods were developed to deal withdemand for broad categories like food clothing and shelter Recentlyhowever they have been adapted to demand for differentiated prod-ucts [see Hausman et al (1994) or Hausman (1996)] The top levelis the overall demand for the product category (for example RTEcereal) Intermediate levels of the demand system model substitutionbetween various market segments (eg between childrenrsquos and natu-ral cereals) The bottom level is the choice of a brand within a segmentEach level of the demand system can be estimated using a exiblefunctional form This segmentation of the market reduces the num-ber of parameters in inverse proportion to the number of segmentsTherefore with either a small number of brands or a large numberof (a priori) reasonable segments this methods can use exible func-tional forms [for example the almost ideal demand system of Deatonand Muellbauer (1980)] to give good rst-order approximations to anydemand system However as the number of brands in each segmentincreases beyond a handful this method becomes less feasible For acomparison between the methods described below and these multi-level models see Nevo (1997 Chapter 6)

54 Dynamics

The model presented here is static However it has close links to sev-eral dynamic models The rst class of dynamic models are models ofdynamic rm behavior The links are twofold The model used herecan feed into the dynamic model as in Pakes and McGuire (1994)On the other hand the dynamic model can be used to characterizethe endogenous choice of product characteristics therefore supplyingmore general identifying conditions

An alternative class of dynamic models examine demand-sidedynamics (Erdem and Keane 1996 Ackerberg 1996) These modelsgeneralize the demand model described here and are estimated usingconsumer-level data Although in principle these models could alsobe estimated using aggregate (high-frequency) data consumer-leveldata is better suited for the task

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 34: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

546 Journal of Economics amp Management Strategy

55 Instruments and Additional Applications

As was mentioned in Sections 32ndash34 the identication of parametersin these models relies heavily on having an adequate set of exoge-nous instrumental variables Finding such instrumental variables iscrucial for any consistent estimation of demand parameters and inmodels of demand for differentiated products this problem is furthercomplicated by the fact that cost data are rarely observed and proxiesfor cost will rarely exhibit much cross-brand variation Some of thesolutions available in the literature have been presented yet all suf-fer from potential drawbacks It is important not to get carried awayin the technical reworks and to remember this most basic yet verydifcult identication problem

This paper has surveyed some of the growing literature that usesthe methods described here The scope of application and potential ofuse are far from exhausted Of course there are many more potentialapplications within the study of industrial economics both in study-ing new industries and in answering different questions Howeverthe full scope of these methods is not limited to industrial organiza-tion It is my hope that this paper will facilitate further application ofthese methods

REFERENCES

Ackerberg D 1996 ldquoEmpirically Distinguishing Informative and Prestige Effects ofAdvertisingrdquo Mimeo Boston University

Anderson S A de Palma and JF Thisse 1992 Discrete Choice Theory of Product Differ-entiation Cambridge MA The MIT Press

Amemiya T 1985 Advanced Econometrics Cambridge MA Harvard University PressBarten AP 1966 ldquoTheorie en Empirie van een Volledig Stelsel van Vraagvergelijkingenrdquo

Doctoral Dissertation Rotterdam University of RotterdamBerry S 1994 ldquoEstimating Discrete-Choice Models of Product Differentiationrdquo Rand

Journal of Economics 25 242ndash262 J Levinsohn and A Pakes 1995 ldquoAutomobile Prices in Market Equilibriumrdquo

Econometrica 63 841ndash890 M Carnall and P Spiller 1996 ldquoAirline Hubs Costs and Markups and the Impli-

cations of Consumer Heterogeneityrdquo Working Paper No 5561 National Bureau ofEconomic Research

J Levinsohn and A Pakes 1998 ldquoDifferentiated Products Demand Systems froma Combination of Micro and Macro Data The New Car Marketrdquo Working PaperNo 6481 National Bureau of Economic Research also available at httpwwwecon yaleedu ~ steveb

and 1999 ldquoVoluntary Export Restraints on Automobiles Evaluatinga Strategic Trade Policyrdquo American Economic Review 89 (3) 400ndash430

Boyd HJ and RE Mellman 1980 ldquoThe Effect of Fuel Economy Standards on the USAutomotive Market An Hedonic Demand Analysisrdquo Transportation Research 14A367ndash378

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 35: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

Random-Coefcients Logit Models of Demand 547

Bresnahan T 1981 ldquoDepartures from Marginal-Cost Pricing in the American Automo-bile Industryrdquo Journal of Econometrics 17 201ndash227

1987 ldquoCompetition and Collusion in the American Automobile Oligopoly The1955 Price Warrdquo Journal of Industrial Economics 35 457ndash482

S Stern and M Trajtenberg 1997 ldquoMarket Segmentation and the Sources of Rentsfrom Innovation Personal Computers in the Late 1980rsquosrdquo RAND Journal of Economics28 S17ndashS44

Cardell NS 1989 ldquoExtensions of the Multinominal Logit The Hedonic DemandModel The Non-independent Logit Model and the Ranked Logit Modelrdquo PhDDissertation Harvard University

1997 ldquoVariance Components Structures for the Extreme-Value and Logistic Dis-tributions with Application to Models of Heterogeneityrdquo Econometric Theory 13 (2)185ndash213

and F Dunbar 1980 ldquoMeasuring the Societal Impacts of Automobile Downsiz-ingrdquo Transportation Research 14A 423ndash434

Chamberlain G 1982 ldquoMultivariate Regression Models for Panel Datardquo Journal ofEconometrics 18 (1) 5ndash46

Christensen LR DW Jorgenson and LJ Lau 1975 ldquoTranscendental Logarithmic Util-ity Functionsrdquo American Economic Review 65 367ndash383

Court AT 1939 ldquoHedonic Price Indexes with Automotive Examplesrdquo in Anon TheDynamics of Automobile Demand New York General Motors

Das S S Olley and A Pakes 1994 ldquoEvolution of Brand Qualities of Consumer Elec-tronics in the USrdquo Mimeo Yale University

Davis P 1998 ldquoSpatial Competition in Retail Markets Movie Theatersrdquo Mimeo YaleUniversity

Deaton A and J Muellbauer 1980 ldquoAn Almost Ideal Demand Sustemrdquo AmericanEconomic Review 70 312ndash326

Dixit A and JE Stiglitz 1977 ldquoMonopolistic Competition and Optimum ProductDiversityrdquo American Economic Review 67 297ndash308

Dubin J and D McFadden 1984 ldquoAn Econometric Analysis of Residential ElectricAppliance Holding and Consumptionrdquo Econometrica 52 345ndash362

Erdem T and M Keane 1996 ldquoDecision-making under Uncertainty CapturingDynamic Brand Choice Processes in Turbulent Consumer Goods Marketsrdquo Mar-keting Science 15 (1) 1ndash20

Gasmi F JJ Laffont and Q Vuong 1992 ldquoEconometric Analysis of Collusive Behaviorin a Soft-Drink Marketrdquo Journal of Economics amp Strategy 1 (2) 277ndash311

Goldberg P 1995 ldquoProduct Differentiation and Oligopoly in International MarketsThe Case of the Automobile Industryrdquo Econometrica 63 891ndash951

Gorman WM 1959 ldquoSeparable Utility and Aggregationrdquo Econometrica 27 469ndash481 1971 Lecture Notes Mimeo London School of Economics

Griliches Z 1961 ldquoHedonic Price Indexes for Automobiles An Econometric Analysisof Quality Changerdquo in The Price Statistics of the Federal Government hearing beforethe Joint Economic Committee of the US Congress 173-76 Pt 1 87th Cong 1stsess Reprinted in Z Griliches ed 1971 Price Indexes and Quality Change Studies inNew Methods in Measurement Cambridge MA Harvard University Press

Hausman J 1996 ldquoValuation of New Goods under Perfect and Imperfect Competi-tionrdquo in T Bresnahan and R Gordon eds The Economics of New Goods Studies inIncome and Wealth Vol 58 Chicago National Bureau of Economic Research

and D McFadden 1984 ldquoSpecication Tests for the Multinominal Logit ModelrdquoEconometrica 52 (5) 1219ndash1240

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338

Page 36: APractitioner’sGuideto EstimationofRandom-Coef”cients ...faculty.wcas.northwestern.edu/~ane686/research/RAs_guide.pdf · APractitioner’sGuideto EstimationofRandom-Coef”cients

548 Journal of Economics amp Management Strategy

and D Wise 1978 ldquoA Conditional Probit Model for Qualitative Choice DiscreteDecisions Recognizing Interdependence and Heterogeneous Preferencesrdquo Economet-rica 49 403ndash426

G Leonard and JD Zona 1994 ldquoCompetitive Analysis with Differentiated Prod-uctsrdquo Annales drsquoEconomie et de Statistique 34 159ndash180

Hendel I 1999 ldquoEstimating Multiple Discrete Choice Models An Application to Com-puterization Returnsrdquo Review of Economic Studies 66 423ndash446

Lancaster K 1966 ldquoA New Approach to Consumer Theoryrdquo Journal of Political Econ-omy 74 132ndash157

1971 Consumer Demand A New Approach New York Columbia University PressKennan J 1989 ldquoSimultaneous Equations Bias in Disaggregated Econometric Modelsrdquo

Review of Economic Studies 56 151ndash156McFadden D 1973 ldquoConditional Logit Analysis of Qualitative Choice Behaviorrdquo in

P Zarembka ed Frontiers of Econometrics New York Academic Press 1978 ldquoModeling the Choice of Residential Locationrdquo in A Karlgvist et al eds

Spatial Interaction Theory and Planning Models Amsterdam North-Holland and K Train 2000 ldquoMixed MNL Models for Discrete Responserdquo Journal of Applied

Econometrics forthcoming available from httpemlabberkeleyedu ~ trainNevo A 1997 ldquoDemand for Ready-to-Eat Cereal and Its Implications for Price Compe-

tition Merger Analysis and Valuation of New Goodsrdquo PhD Dissertation HarvardUniversity

2000a ldquoMeasuring Market Power in the Ready-to-Eat Cereal IndustryrdquoEconometrica forthcoming also available from httpemlabberkeleyedu ~ nevo

2000b ldquoMergers with Differentiated Products The Case of the Ready-to-EatCereal Industryrdquo Rand Journal Economics forthcoming 31 (Autumn)

Pakes A and P McGuire 1994 ldquoComputation of Markov Perfect Equilibria NumericalImplications of a Dynamic Differentiated Product Modelrdquo Rand Journal of Economics25 (4) 555ndash589

Petrin A 1999 ldquoQuantifying the Benets of New Products The Case of the MinivanrdquoMimeo University of Chicago available from httpgsbwwwuchicagoedufacamilpetrin

Quandt RE 1968 ldquoEstimation of Model Splitsrdquo Transportation Research 2 41ndash50Rosen S 1974 ldquoHedonic Prices and Implicit Marketsrdquo Journal of Political Economy 82

34ndash55Rossi P RE McCulloch and GM Allenby 1996 ldquoThe Value of Purchase History Data

in Target Marketingrdquo Marketing Science 15 (4) 321ndash340Shum M 1999 ldquoAdvertising and Switching Behavior in the Breakfast Cereal Marketrdquo

Mimeo University of Toronto available from httpwwwchassutorontocaecoecohtml

Spence M 1976 ldquoProduct Selection Fixed Costs and Monopolistic CompetitionrdquoReview of Economic Studies 43 217ndash235

Stern S 1995 ldquoProduct Demand in Pharmaceutical Marketsrdquo Mimeo StanfordUniversity

Stone J 1954 ldquoLinear Expenditure Systems and Demand Analysis An Application tothe Pattern of British Demandrdquo Economic Journal 64 511ndash527

Tardiff TJ 1980 ldquoVehicle Choice Models Review of Previous Studies and Directionsfor Further Researchrdquo Transportation Research 14A 327ndash335

Theil H 1965 ldquoThe Information Approach to Demand Analysisrdquo Econometrica 6375ndash380

Villas-Boas M and R Winer 1999 ldquoEndogeneity in Brand Choice Modelsrdquo ManagementScience 45 1324ndash1338