Manufacturing & Service Operations Management

A Population-Growth Model for Multiple Generations of Technology Products
Hongmin Li, Dieter Armbruster, Karl G. Kempf

To cite this article: Hongmin Li, Dieter Armbruster, Karl G. Kempf (2013) A Population-Growth Model for Multiple Generations of Technology Products. Manufacturing & Service Operations Management 15(3):343-360. http://dx.doi.org/10.1287/msom.2013.0430

MANUFACTURING & SERVICE OPERATIONS MANAGEMENT

Vol. 15, No. 3, Summer 2013, pp. 343–360
ISSN 1523-4614 (print) | ISSN 1526-5498 (online)
http://dx.doi.org/10.1287/msom.2013.0430

© 2013 INFORMS

A Population-Growth Model for Multiple Generations of Technology Products

Hongmin Li
W. P. Carey School of Business, Arizona State University, Tempe, Arizona 85287, [email protected]

Dieter Armbruster
School of Mathematical and Statistical Sciences, Arizona State University, Tempe, Arizona 85287, [email protected]

Karl G. Kempf
Decision Engineering Group, Intel Corporation, Chandler, Arizona 85226, [email protected]

In this paper, we consider the demand for multiple, successive generations of products and develop a population-growth model that allows demand transitions across multiple product generations and takes into consideration the effect of competition. We propose an iterative-descent method for obtaining the parameter estimates and the covariance matrix, and we show that the method is theoretically sound and overcomes the difficulty that the units-in-use population of each product is not observable. We test the model on both simulated sales data and Intel’s high-end desktop processor sales data. We use two alternative specifications for product strength in this market: performance and performance/price ratio. The former demonstrates better fit and forecast accuracy, likely due to the low price sensitivity of this high-end market. In addition, the parameter estimate suggests that, for the innovators in the diffusion of product adoption, brand switchings are more strongly influenced by product strength than within-brand product upgrades in this market. Our results indicate that compared with the Bass model, Norton–Bass model, and Jun–Park choice-based diffusion model, our approach is a better fit for strategic forecasting that occurs many months or years before the actual product launch.

Key words: product transitions; forecasting; multiple-generation demand model; diffusion
History: Received: May 24, 2011; accepted: December 13, 2012. Published online in Articles in Advance April 11, 2013.

1. Introduction

Marketing, producing, and delivering multiple generations of products is becoming an ever more challenging task for manufacturers of technology products. This paper originates from a collaborative effort with Intel Corporation to build models to support forecasting when the company periodically introduces newer generations of products in the presence of competition. The pace of new product introduction at Intel is driven by advances in both silicon manufacturing technology and product architecture design (Shenoy and Daniel 2006). Every new product introduces changes in many dimensions: speed, cache size, power consumption, price, and so on. Not only do a product’s characteristics affect its own demand, they also dramatically influence the sales of adjacent generations of products, all of which complicate the task of demand forecasting. To deliver its technology roadmap to the market, Intel develops and synchronizes plans for investing in factories, equipment, production, and distribution, each with a different planning time horizon, but all depending critically on a good demand forecast.

We focus on long-range forecasting, for which the company needs to model the demand of multiple, successive generations of products. Several elements of the forecast are critical. First, long-range planning, which includes building new factories and procuring expensive equipment, occurs many months or even years before the actual products are released to the market. These decisions require information on the aggregate demand of each product over its life cycle as well as details such as when the demand begins, how fast it ramps, when it peaks, and the peak-level demand. Next, the model needs to capture interactions among the products and account for the competition that Intel faces. Finally, the model should be able to estimate forecast uncertainty because the primary challenge of long-range planning is to mitigate the risk of future uncertainty (Peng et al. 2012, Kempf et al. 2013). In this paper, we abstract from the situation at Intel and develop a general demand model for multiple product generations and show its usefulness in long-range forecasting.

When products are introduced to a market with multiple previous generations of products, a multitude of dynamics and interactions are in effect. We consider three major market dynamics that contribute to demand: (i) existing customers (i.e., those who own an older-version product) upgrading to newer products, (ii) brand switching by customers, and (iii) market expansion. In this paper, we develop a model that focuses on product upgrades and brand switchings while incorporating market expansion as a trend correction. In other words, we do not model the macro dynamics driving the total market expansion (such as the state of the economy, the trend of end-customer consumption, and the changing market appetite for technology) but view our model as a tool for forecasting the demand curve of each individual product, given the trend of market expansion.

1.1. Relationship to Prior Research

Bass (1969) characterizes the consumers for durable goods as a combination of innovators who adopt the product at a constant rate and imitators whose adoption rate depends on the current population of adopters. The resulting demand resembles a diffusion process. Compared with the time-series methods that are primarily data oriented, the Bass model takes into consideration the underlying market dynamics to predict demand. Researchers have since extended the Bass model to incorporate demand-influencing factors such as advertising, price, and product-specific attributes (Bass 1980, Bass et al. 1994, Kamakura and Balasubramanian 1988, Jain and Rao 1990, Kalish 1985) as well as Bayesian updating of the diffusion parameters using early market data (Wu et al. 2010). However, these extensions are limited to a single-product diffusion model.

Two recent review papers (Meade and Islam 2006, Peres et al. 2010) summarize related work on diffusion between technology generations. Fisher and Pry (1971) model the substitution of a new technology for the old technology assuming that the market share of the new technology grows at an exponential rate. Their model is limited to two products and captures the demand during only the transition period instead of each product’s entire life cycle. Norton and Bass (1987) consider the diffusion of successive generations of products (which we refer to as the Norton–Bass model hereafter). They combine product substitution with diffusion and allow the adoption of the next-generation product to be composed of two parts: those from the untapped market potential and those from adopters of the old product upgrading to the newer product. The Norton–Bass model yields the overlapping bell-shaped demand curves commonly observed when multiple generations of products are sold concurrently. However, the complexity of this model increases dramatically with the number of products. Another limitation of this model is that product substitution only occurs between two adjacent generations, not across multiple generations. For semiconductor products, customers often leapfrog as they upgrade and the ability to capture such detail allows a firm to design market strategies that target specific populations (see Gordon 2009). Moreover, both the Bass model and its extensions usually require data observations that include the demand peak. Therefore, these models are more useful if a substantial number of sales observations are already available for the product to be forecasted. This inevitably limits the prediction window to a much shorter time period than that required by long-range planning decisions.

Our paper is also related to choice-theory-based demand models, such as Melnikov (2001), Song and Chintagunta (2003), Gordon (2009), and Gowrisankaran and Rysman (2009), in which consumers’ purchasing behavior is modeled as a utility-maximization problem. A general drawback of these models is that parametrization is computationally intensive and often product aggregation is necessary (see Gordon 2009). In comparison, our approach reproduces complicated time-series data at the level of individual product with relatively small computational effort.

Jun and Park (1999) combine a choice model with a diffusion model to predict sales of multiple generations of products. They assume an aggregate Bass diffusion for the entire market and let the share of sales for each product be determined by a logit choice probability. In particular, the “type II” model described in their paper (which we refer to as the Jun–Park model hereafter) can be parameterized in the absence of units-in-use data. They achieve this by mixing product upgrades with first-time purchases. In contrast, our model differentiates these sources of sales, enabling the design of population-specific marketing strategies. The Jun–Park model uses product-specific parameters, and thus its application is restricted to two-step-ahead or three-step-ahead forecasts or to naively copying sales of a previous-generation product as the forecast for a new product. In addition, they model customers’ utility as a linear function of time; thus customers’ valuation of a product is assumed to change monotonically with time throughout its lifetime. Consequently, for a new product to replace the older generations, the time coefficient has to always increase from one product to the next, regardless of product strength. This confounds parameter interpretation and makes the model difficult to apply (because one cannot predict what the time coefficient would be for a new product). In contrast, our model provides both clear interpretations for the parameters and a clear path for how to forecast sales of future products.

Bayus et al. (2000) review a two-product population-growth model and show that several previously studied models, including the Norton–Bass model and the Lotka–Volterra (Murray 2002) predator-prey model, can all be considered special cases of this model. The population-growth model, often used in ecology (Pielou 1977) and sociology (Tuma and Hannan 1984), has clear advantages over the Norton–Bass model: it captures product interactions, allows generation leapfrogging, and allows an arbitrary number of products to coexist. However, existing applications are limited to cases where the population sizes are directly observable or can be easily estimated; for example, Mahajan and Muller (1996) on the demand for IBM mainframe computers and Kim et al. (2000) on subscriptions of telecommunication services. In both papers, the units-in-use population is easily identifiable by the number of service contracts in place. This is not the case for most other products. For example, at Intel, a customer who purchases the newest generation $i$ microprocessor could previously be a user of generation $i-1$, $i-2$, or a user of the competitor’s products, which is not observed by Intel. In addition, the sales of generation $i$ product do not reveal how many customers have left generation $i$, making it impossible to track the size of population $i$. Furthermore, sales data for the competition are difficult to obtain. Our approach builds upon a population-growth model but overcomes the limitation of nonobservable population size and the lack of sales data of the competition.

In this paper, we do not consider supply constraints and use the terms “demand” and “sales” interchangeably. For new product diffusion models under supply constraints, one may refer to Ho et al. (2002) and Kumar and Swaminathan (2003), which extend the single-product Bass model. In addition, we do not consider used or remanufactured products and their impacts on the diffusion dynamics, which is the subject of a related paper by Debo et al. (2006).

1.2. Summary of Contribution and Organization

We present a demand model for multiple generations of products and develop a novel parametrization method that takes advantage of the flexibility afforded by the population-growth model even when the population data cannot be obtained. We show that this method performs well on synthetic data, generated by a known demand obscured by noise. We then apply this method to Intel’s microprocessor sales data and show that it outperforms other alternatives.

Our model is more appropriate for long-range forecasting than existing models because it does not need product-specific parameters to forecast sales. For instance, the Bass model requires sales data for a particular product to first derive the diffusion parameters of this product and then forecast for its remaining lifetime. With multiple products, the number of parameters grows combinatorially: Not only does each product add its own set of diffusion parameters, but for each pair of products, additional parameters are needed to model product interactions (see, e.g., Mahajan and Muller 1996, Danaher et al. 2001). Furthermore, it is not clear how product-specific differences should be taken into account to modify these parameters for future products. In comparison, we parameterize the model based on product strength, which enables forecasts for products that are not yet released to the market and may even be years away from the time of forecast.

To our knowledge, our model is the first to combine brand switchings and within-brand product upgrades among multiple product generations into one model framework. Existing work on diffusion models with competition only considers one product for the focal firm (see, e.g., Savin and Terwiesch 2005, Libai et al. 2009).

Finally, we show in this paper how to estimate the parameter variances that characterize the confidence of the forecast as well as how to adjust the variance estimation when the assumption of independent and identical noise does not hold.

The rest of this paper is organized as follows. Section 2 describes the multiproduct demand model in detail. In §3, we present the basic idea for overcoming the problem of unobservable population size. We examine the identification condition for this model and prove convergence of the proposed method. In §4, we test the model using stochastically generated sales data. We apply the model to the microprocessor data supplied by Intel in §5. We then compare the model’s fit and forecast performance with the Bass, Norton–Bass, and Jun–Park models. We conclude in §6, summarizing the key assumptions and discussing the limitations.

2. Model Description and Assumptions

In this section, we present a discrete-time population-growth model for multiple generations of products. Assume that a company is currently selling a total of $n$ generations of product on the market, indexed by the order of each product’s market entry. We associate a population $x_i$ with each product $i = 1, \ldots, n$, indicating the current number of units in use for this product. We assume that a customer will never purchase a product that is older (in terms of the product’s introduction time) than the one he currently owns. In addition, once a customer purchases a new product, he will scrap the old product he previously owned or downgrade it to a secondary usage. Therefore, the state of a customer can be represented by $i$, the latest product he owns. Similar to the Bass model, we assume that each customer purchases at most one unit of product each time.

We consider $H$ time periods. Let $x_i(t)$ be the population of product $i$ at the beginning of time period $t$, and let $s_i(t)$ be the sales of product $i$ during period $t$. At the beginning of the focal time horizon, we assume that the market starts with an existing units-in-use population of some earlier generation(s). These may include product generations that are older than product 1, which are not selling any more but still have a units-in-use population. Let $K = \{-k, \ldots, -1, 0\}$ be the set of these older products. We assume that $x_i(0)$, $i = -k, \ldots, 0, 1, \ldots, n$, are given, with $x_i(0) = 0$ if product $i$ has not been introduced yet. As we show later in applications on both simulated and Intel data, the method is robust to perturbations in the initial population size.

2.1. Product Upgrades

As customers of an older product upgrade to a newer product, sales occur and the population $x_i$ evolves. Specifically, the value of $x_i$ increases if a customer who previously owned an older product purchases product $i$ and decreases if a customer who previously owned product $i$ decides to buy a newer product. Let $P_{ij}$ be the fractional flow rate from population $i$ to $j$, i.e., the fractional rate at which a customer of product $i$ will buy product $j$. The population evolution could then be described by the difference equation

$$x_i(t+1) - x_i(t) = \sum_{j<i} x_j(t) P_{ji} - x_i(t) \sum_{j>i} P_{ij}, \quad i = 1, \ldots, n, \tag{1}$$

and the sales rate of product $i$ due to upgrades is given by $\sum_{j<i} x_j(t) P_{ji}$, which is the first term of the right side of Equation (1).
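As a rough illustration of Equation (1), the following Python sketch advances the populations by one period. It is a minimal sketch under stated assumptions: the function name `upgrade_step`, the array layout (oldest product first), and the use of a fixed matrix `P` of flow rates within the period are choices made here for illustration and are not part of the model description.

```python
import numpy as np

def upgrade_step(x, P):
    """One period of Eq. (1), assuming fixed flow rates within the period.

    x[i]    : units in use of product i, ordered oldest first (assumed layout).
    P[i, j] : fractional flow rate from product i to product j; only entries
              with j > i are used, since customers never buy an older product.
    Returns the next-period populations and the upgrade sales of each product.
    """
    P_up = np.triu(P, k=1)            # keep only flows toward newer products
    inflow = x @ P_up                 # inflow[i]  = sum_{j<i} x[j] * P[j, i]
    outflow = x * P_up.sum(axis=1)    # outflow[i] = x[i] * sum_{j>i} P[i, j]
    return x + inflow - outflow, inflow

# Hypothetical example: three products, all units initially in the oldest one.
x_next, upgrade_sales = upgrade_step(np.array([100.0, 0.0, 0.0]),
                                     np.full((3, 3), 0.1))
```

In this example each newer product gains 10 units and the oldest loses 20, so the total number of units in use is unchanged: Equation (1) only moves customers between populations.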

2.2. Brand Switching

The diffusion of a new product is affected not only by adjacent generations of products sold by the same firm but also by products from competitors. In many cases, the competing firms also sell successive generations of products to the same pool of customers. As a result, the population flow could occur across brands and between any two products on the market. However, modeling the flow between each product of the focal company and each competing product is not desirable because product-level sales data from competition are not readily available. In this paper, we do not differentiate individual products sold by competitors but instead treat them as one single population $y$, which has a time-varying strength $f_y(t)$, reflecting the improvement of competitive products over time. In practice, there may be multiple competing products and one has to determine $f_y(t)$ carefully. For example, one may view $f_y(t)$ as either the average strength of competing products, or the strength of the best competing product at time $t$. Similar to the assumption of known $x_i(0)$, we assume that $y(0)$ is known.

We assume that the population flow from population $j$ to the competition or from the competition to population $j$ is determined by the gap of product strength. Specifically, if the strength of product $j$ is higher (lower) than $f_y(t)$, then there is a flow from population $y$ to population $j$ (population $j$ to $y$) but none from $j$ to $y$ ($y$ to $j$). Let $J(t)$ ($\bar{J}(t)$) be the set of products stronger (weaker) than the competition. Clearly, the set $J$ may change with time. Denote the fractional flow rate from $x_i$, $i \in \bar{J}(t)$, to $y$ as $P_{iy}$ and that from $y$ to $x_i$, $i \in J(t)$, as $P_{yi}$. Therefore, sales of product $i$ due to brand switching are given by $y(t) P_{yi} I(i \in J(t))$, where $I(\cdot)$ is an indicator function.

2.3. Market Expansion

A third source of sales comes from “new” customers, i.e., customers who have not previously purchased a product in this market, whether from the focal company or from competition. As discussed earlier, this is driven by multifaceted macroeconomic factors such as the world economy and the overall development of technology. At Intel, the forecast for the total market is a separate process from that for individual products. In this paper, we follow the Intel practice and propose a simple approach to correct for the overall market trend. We incorporate this demand source through a known percentage growth rate $\alpha(t)$, $t = 1, \ldots, H$. Let $s(t)$ denote the total sales of this market (including both the focal firm and its competition) in period $t$. We assume that sales due to market expansion in period $t$ are given by $\alpha(t) \cdot s(t-1)$ and that this sales growth is split proportionally among products of both the focal firm and the competition based on each product’s most recent market share. In other words, if the total market grows by $\alpha(t) \cdot s(t-1)$ in period $t$, where $s(t-1)$ is the total sales in the previous period, then product $i$ gains $\alpha(t) \cdot s_i(t-1)$. Although this is a simplification, if one can safely assume that market growth is from a population that is similar to the current adopter population, this proportional split assumption is reasonable. In addition, the assumption of exogenous market expansion leads to a flexible model that accommodates essentially any trend of the overall market expansion.
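The proportional split can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `expansion_sales`, the list layout (one entry per product, with the competition as one more entry), and the sample numbers are assumptions made here, and the total of last period's sales is assumed to be positive.

```python
def expansion_sales(prev_sales, growth_rate):
    """Market-expansion sales per product, split by last period's sales shares.

    prev_sales  : last-period sales of every product, competition included
                  (assumed layout).
    growth_rate : exogenous growth for the current period as a fraction
                  (0.05 means 5%).
    """
    total_prev = sum(prev_sales)               # total market sales last period
    total_growth = growth_rate * total_prev    # extra sales from new customers
    # Each product's share of last period's sales sets its share of the growth,
    # which reduces to growth_rate * (that product's last-period sales).
    return [total_growth * s / total_prev for s in prev_sales]

# Hypothetical numbers: two focal products and the competition, 5% growth.
gains = expansion_sales([60.0, 30.0, 10.0], 0.05)
```

With these numbers the market grows by 5 units in total, and the three entries gain about 3, 1.5, and 0.5 units, matching their 60%, 30%, and 10% shares of the previous period's sales.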

2.4. Resulting Sales

Summarizing the three sources of population flows, we obtain the sales for each product at time $t$:

$$s_i(t) = \sum_{j<i} x_j(t) P_{ji} + y(t) P_{yi} I(i \in J(t)) + \alpha(t) s_i(t-1), \quad \forall\, t = 1, 2, \ldots, H, \tag{2}$$

where the three terms on the right represent sales due to upgrade, brand switching, and market expansion, respectively.

We assume that, similar to the Bass model, the fractional flow rate from population $i$ to $j$ is given by

$$P_{ij} = p_{ij} + q_{ij} x_j, \tag{3}$$

where $p_{ij}$ represents an innovation effect and $q_{ij}$ the word-of-mouth effect. Furthermore, we assume that the parameters $p_{ij}$ and $q_{ij}$ are linearly dependent on product strength:

$$p_{ij} = \beta_1 + \beta_2 f_{ij}, \quad q_{ij} = \beta_3 + \beta_4 f_{ij}, \tag{4}$$

where $f_{ij}$ is the difference in product strength between product $i$ and product $j$ measured in percentage improvement. The parameters $\beta_2$ and $\beta_4$ characterize the importance of product strength, whereas $\beta_1$ and $\beta_3$ incorporate transitions that are independent of product strength. A linear relationship is commonly adopted by researchers for estimating the impact of influencing factors because of its simplicity (e.g., Bass et al. 1994). We take a similar approach for including product strength.
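Equations (3) and (4) combine into a single expression for the within-brand flow rate. The following Python sketch spells this out; the function name `flow_rate`, the tuple layout of `beta`, and the numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def flow_rate(beta, f_ij, x_j):
    """Fractional flow rate from population i to j, per Eqs. (3) and (4).

    beta : (beta_1, beta_2, beta_3, beta_4), the within-brand parameters.
    f_ij : strength difference between products i and j (fraction).
    x_j  : current units in use of the destination product j.
    """
    p_ij = beta[0] + beta[1] * f_ij   # innovation effect
    q_ij = beta[2] + beta[3] * f_ij   # word-of-mouth effect
    return p_ij + q_ij * x_j

# Hypothetical values: a 30% strength gain and a destination population of 0.5
# (populations expressed as a fraction of the market).
rate = flow_rate((0.02, 0.05, 0.10, 0.20), f_ij=0.3, x_j=0.5)
```

Here the innovation part is 0.02 + 0.05 × 0.3 = 0.035 and the word-of-mouth part is (0.10 + 0.20 × 0.3) × 0.5 = 0.08, so the flow rate is about 0.115 per period.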

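Letting $f_{iy}$ ($f_{yi}$) represent the percentage improvement of product $i$ over product $y$ (product $y$ over product $i$), we assume that the diffusion parameters $p_{iy}$, $p_{yi}$, $q_{iy}$, and $q_{yi}$ satisfy

$$p_{iy} = \beta_5 + \beta_6 f_{iy}, \quad q_{iy} = \beta_7 + \beta_8 f_{iy}, \quad \forall\, i \in \bar{J}, \tag{5}$$

$$p_{yi} = \beta_5 + \beta_6 f_{yi}, \quad q_{yi} = \beta_7 + \beta_8 f_{yi}, \quad \forall\, i \in J. \tag{6}$$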

Note that the flows from $x_i$ to $y$ and from $y$ to $x_i$ have the same coefficients $\beta_k$, $k = 5, \ldots, 8$. This is based on the observation at Intel that customers who switch brands tend to have similar characteristics. In the rare case where this is not true, the model could be extended at the cost of additional parameters.

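We obtain the sales of product $i$ by substituting Equations (4)–(6) into Equation (2):

$$\begin{aligned}
s_i(t) ={}& \beta_1 \Big[\sum_{j<i} x_j(t)\Big] + \beta_2 \Big[\sum_{j<i} f_{ji} x_j(t)\Big] + \beta_3 \Big[\sum_{j<i} x_i(t) x_j(t)\Big] + \beta_4 \Big[\sum_{j<i} f_{ji} x_i(t) x_j(t)\Big] \\
&+ \beta_5 \big[y(t) I(i \in J)\big] + \beta_6 \big[f_{yi}\, y(t) I(i \in J)\big] + \beta_7 \big[x_i(t) y(t) I(i \in J)\big] \\
&+ \beta_8 \big[f_{yi}\, x_i(t) y(t) I(i \in J)\big] + \alpha(t) s_i(t-1).
\end{aligned} \tag{7}$$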

Therefore, conditional on $\mathbf{x}(t) = (x_1(t), \ldots, x_n(t))$ and $y(t)$, $s_i(t) - \alpha(t) s_i(t-1)$ is a linear function of the parameter vector $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_8)$. We define a matrix $X$ with dimension $nH \times 8$ such that

$$\begin{aligned}
X_{t+(i-1)H} = \Big( &\sum_{j<i} x_j(t),\ \sum_{j<i} f_{ji} x_j(t),\ \sum_{j<i} x_i(t) x_j(t),\ \sum_{j<i} f_{ji} x_i(t) x_j(t), \\
& y(t) I(i \in J),\ f_{yi}\, y(t) I(i \in J),\ x_i(t) y(t) I(i \in J),\ f_{yi}\, x_i(t) y(t) I(i \in J) \Big),
\end{aligned} \tag{8}$$

where $X_{t+(i-1)H}$ is the $[t + (i-1)H]$th row vector of $X$.

Then we can rewrite Equation (7) as

$$\mathbf{s} = X \boldsymbol{\beta}, \tag{9}$$

where

$$\mathbf{s} = \Big( \{ s_i(t) - \alpha(t) s_i(t-1) \}_{i = 1, 2, \ldots, n;\ t = 1, 2, \ldots, H} \Big) \tag{10}$$

is a vector with its $[t + (i-1)H]$th element equal to $s_i(t) - \alpha(t) s_i(t-1)$, representing the sales of product $i$ during period $t$ “corrected” for the market trend. Throughout the rest of the paper, we assume that the matrix $X$ has full rank.
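To make the structure of Equation (8) concrete, the sketch below builds one row of $X$ for a given product and period. It is only a sketch under assumptions: the function name `design_row`, the zero-based array layout (oldest product first), and the arguments `f`, `f_y`, and `stronger` are introduced here for illustration; `f[j, i]` stands for $f_{ji}$, `f_y[i]` for $f_{yi}$, and `stronger` for the indicator $I(i \in J)$.

```python
import numpy as np

def design_row(i, x, y, f, f_y, stronger):
    """Row of X in Eq. (8) for product i in one period (zero-based index i).

    x        : populations x_j(t), oldest product first (assumed layout).
    y        : competitor population y(t).
    f[j, i]  : strength difference f_ji used for flows from j to i.
    f_y[i]   : strength difference f_yi used for flows from y to i.
    stronger : True if product i is in J(t), i.e., stronger than the competition.
    """
    older = np.arange(i)                      # indices j < i
    ind = 1.0 if stronger else 0.0            # indicator I(i in J)
    sx = x[older].sum()                       # sum_j x_j
    sfx = (f[older, i] * x[older]).sum()      # sum_j f_ji x_j
    return np.array([
        sx, sfx, x[i] * sx, x[i] * sfx,       # upgrade columns (beta_1..beta_4)
        y * ind, f_y[i] * y * ind,            # switching columns (beta_5, beta_6)
        x[i] * y * ind, f_y[i] * x[i] * y * ind,   # (beta_7, beta_8)
    ])

# Hypothetical two-product example; the row is for the newer product (i = 1).
row = design_row(1, np.array([10.0, 2.0]), 5.0,
                 np.array([[0.0, 0.5], [0.0, 0.0]]),
                 np.array([0.0, 0.2]), stronger=True)
```

The inner product of such a row with the parameter vector gives the trend-corrected sales of Equation (10) for that product and period; when the product is weaker than the competition, the last four entries are zero.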

To ensure that the discrete-time model is well behaved, we assume $\boldsymbol{\beta}$ is small such that sales in and out of each population are small relative to the current population size. Consequently, the values of $x_i(t)$ are finite and nonnegative. From an implementation perspective, this is equivalent to keeping the discrete time unit sufficiently small.

Suppose we know the values of $x_i(t)$ and $y(t)$ and the sales $s_i(t)$ $\forall\, i, t$; then we can solve the linear system of equations given by Equation (9) to obtain the parameter $\boldsymbol{\beta}$. In a case with measurement error in sales, one can obtain the estimate for $\boldsymbol{\beta}$ using the linear regression model

$$\mathbf{s} = X \boldsymbol{\beta} + \boldsymbol{\varepsilon}, \tag{11}$$

where $\boldsymbol{\varepsilon}$ is a vector of independent and normally distributed noises. Unfortunately, as discussed earlier, $x_i(t)$ are in most cases not available, and therefore, we cannot use linear regression methods to estimate $\boldsymbol{\beta}$. Rather, the model we need to estimate is

$$\mathbf{s} = X(\boldsymbol{\beta}) \boldsymbol{\beta} + \boldsymbol{\varepsilon}, \tag{12}$$

where $X$ is a function of $\boldsymbol{\beta}$.

3. Solving the Nonlinear Regression

The conventional nonlinear regression method for estimating $\boldsymbol{\beta}$ in Equation (12) involves minimizing the sum of squares

$$v(\boldsymbol{\beta}) \equiv (X(\boldsymbol{\beta}) \boldsymbol{\beta} - \mathbf{s})^T (X(\boldsymbol{\beta}) \boldsymbol{\beta} - \mathbf{s}) \tag{13}$$

by optimizing $\boldsymbol{\beta}$. Substituting Equation (3) into Equations (1) and (2), we obtain the population and sales as quadratic recursive equations (for details, see Online Appendix A.1, available at http://dx.doi.org/10.1287/msom.2013.0430), which implies that $s_i(t)$, $x_i(t)$ (similarly, $s_y(t)$ and $y(t)$) are polynomial functions of the parameter vector $\boldsymbol{\beta}$ with order $2^t$. Hence, the problem of minimizing $v(\boldsymbol{\beta})$ over the parameter vector $\boldsymbol{\beta}$ is of polynomial order $2^{2t}$, which is practically infeasible to solve for any reasonably large $t$.

Given the special structure of this problem, we propose an iterative procedure that takes advantage of the linear structure of Equation (7) to obtain the optimal parameter estimates. Although the population paths $x_i(t)$ and $y(t)$, $t = 1, \ldots, H$, are not known, we can construct them based on the current parameter estimates. Then, from the constructed population paths, we obtain an updated estimate of the parameter vector using the constructed data matrix and the actual sales vector. In particular, we perform linear regression using Equation (11). We then repeat this process until the parameter estimates converge. In other words, we use the constructed data matrix $X(\boldsymbol{\beta})$ as the “pseudoregressor.” The following is a step-by-step description of the procedure.

• Step 1. Assume that we know the values of $x_i$ and $y$ at time $t = 0$. In the $k$th iteration, we use the current estimate $\boldsymbol{\beta}^k$ for the parameter vector $\boldsymbol{\beta}$ to construct the population path $x_i(t)$ and $y(t)$ for $t = 1, \ldots, H$ following Equations (A.1)–(A.3) (note that $s_y(t)$ is also constructed as an intermediary). We then derive the sales path $s_i(t)$ using Equation (7).

• Step 2. Construct the matrix $X(\boldsymbol{\beta}^k)$ and the column vector $\mathbf{s}(\boldsymbol{\beta}^k)$ as defined in Equations (8) and (10). Next, run linear regression of $\mathbf{s}(\boldsymbol{\beta}^k)$ against $X(\boldsymbol{\beta}^k)$ to obtain an updated set of parameters given by

$$\boldsymbol{\beta}^{k+1} \equiv \big(\beta_1^{k+1}, \beta_2^{k+1}, \ldots, \beta_8^{k+1}\big)^T = \big[X(\boldsymbol{\beta}^k)^T X(\boldsymbol{\beta}^k)\big]^{-1} X(\boldsymbol{\beta}^k)^T \mathbf{s}(\boldsymbol{\beta}^k). \tag{14}$$

• Repeat Steps 1 and 2 using the updated parameters $\boldsymbol{\beta}^{k+1}$ until convergence, i.e., when the percentage improvement of the residual sum of squares (which approximates the scaled norm of the gradient) falls below a very small number.
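The steps above can be sketched in Python for a deliberately reduced case. Everything in the block is an assumption made for illustration: it keeps only the within-brand upgrade terms (the first four parameters), omits competition and market expansion, assumes all products are on the market from the first period, stacks rows period by period rather than product by product (which does not affect the regression), and regresses the observed sales on the constructed matrix, following the verbal description of the procedure. The names `design_rows`, `construct_path`, `update`, and `iterate` are hypothetical.

```python
import numpy as np

def design_rows(x, f):
    """Upgrade-only columns of Eq. (8) for every product at one period."""
    X = np.zeros((len(x), 4))
    for i in range(len(x)):
        older = np.arange(i)                       # products introduced before i
        sx = x[older].sum()
        sfx = (f[older, i] * x[older]).sum()
        X[i] = [sx, sfx, x[i] * sx, x[i] * sfx]
    return X

def construct_path(beta, x0, f, H):
    """Step 1: roll populations forward under beta; return stacked X and sales."""
    x = np.array(x0, dtype=float)
    X_all, s_all = [], []
    for _ in range(H):
        X_t = design_rows(x, f)
        s_t = X_t @ beta                           # inflow (sales) of each product
        out = np.zeros_like(x)
        for i in range(len(x)):
            for j in range(i):                     # flow from older j to newer i
                P_ji = (beta[0] + beta[1] * f[j, i]
                        + (beta[2] + beta[3] * f[j, i]) * x[i])
                out[j] += x[j] * P_ji
        X_all.append(X_t)
        s_all.append(s_t)
        x = x + s_t - out                          # population update, Eq. (1)
    return np.vstack(X_all), np.concatenate(s_all)

def update(beta, sales, x0, f, H):
    """Step 2: least-squares regression of sales on the constructed X(beta)."""
    X, _ = construct_path(beta, x0, f, H)
    beta_next, *_ = np.linalg.lstsq(X, sales, rcond=None)
    rss = float(np.sum((X @ beta_next - sales) ** 2))
    return beta_next, rss

def iterate(beta0, sales, x0, f, H, tol=1e-8, max_iter=100):
    """Repeat Steps 1 and 2 until the relative change in the RSS is below tol."""
    beta, prev_rss = np.asarray(beta0, dtype=float), None
    for _ in range(max_iter):
        beta, rss = update(beta, sales, x0, f, H)
        # The floor of 1e-12 is an arbitrary guard for an already exact fit.
        if prev_rss is not None and abs(prev_rss - rss) <= tol * max(prev_rss, 1e-12):
            break
        prev_rss = rss
    return beta
```

When `update` is called at the parameters that generated noise-free sales, it returns those same parameters (up to rounding), which is the fixed-point property of the procedure. The sketch does not guarantee that `iterate` converges from an arbitrary starting point.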

The iterative method described above is conceptually similar to a fixed-point iteration method for solving a system of nonlinear equations. This can be seen by omitting the error term $\varepsilon$ and rewriting Equation (14) as $\boldsymbol{\beta} = [X(\boldsymbol{\beta})^T X(\boldsymbol{\beta})]^{-1} X(\boldsymbol{\beta})^T \mathbf{s}$. For a fixed-point method to work, the mapping from $\boldsymbol{\beta}^k$ to $\boldsymbol{\beta}^{k+1}$ needs to be a contraction. However, this is not generally true, even without the error term $\boldsymbol{\varepsilon}$. Therefore, the method described above does not always converge. Indeed, we observe both cases of convergence and cases of local divergence where the sequence $\boldsymbol{\beta}^k$ oscillates around the fixed point but never converges. In the next two subsections, we examine the mathematical conditions required for the model in Equation (12) to be identifiable and show that convergence can be achieved by modifying Equation (14).
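The role of the contraction condition can be seen on a scalar toy problem. The sketch below is not the demand model: it uses the map $g(b) = r\,b\,(1-b)$, chosen here only because its nonzero fixed point $1 - 1/r$ and the slope $2 - r$ at that point are easy to verify by hand. The names `fixed_point_iterate` and `logistic` are hypothetical.

```python
def fixed_point_iterate(g, b0, n_iter=200):
    """Apply b <- g(b) repeatedly and return the final iterate."""
    b = b0
    for _ in range(n_iter):
        b = g(b)
    return b

def logistic(r):
    """Toy map g(b) = r*b*(1-b); nonzero fixed point 1 - 1/r, slope 2 - r there."""
    return lambda b: r * b * (1.0 - b)

# |slope| = 0.8 < 1: the map is a local contraction and the iteration converges.
b_conv = fixed_point_iterate(logistic(2.8), 0.6)
# |slope| = 1.2 > 1: the iterates keep oscillating around the fixed point.
b_osc = fixed_point_iterate(logistic(3.2), 0.6)
```

With $r = 2.8$ the iterates settle at the fixed point, whereas with $r = 3.2$ they alternate between two values on either side of it, which mirrors the local divergence described above.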

Before we proceed, we note that this iterative approach is analogous to the well-known Gauss–Newton nonlinear regression method (Amemiya 1985), in which the nonlinear model is linearized based on the Taylor series approximation at an initial parameter estimate and then a new set of parameter estimates obtained from the linear regression is used as the new starting parameter value for subsequent iterations. In our model, we also take advantage of a linear regression step, but it is based on the special structure of this multiproduct demand model instead of the Taylor series approximation. In addition, our method bears some resemblance to the expectation–maximization (EM) method (Dempster et al. 1977), which obtains the maximum-likelihood estimate under incomplete data. In the EM method, one uses an initial estimate of the parameter to compute conditional distributions of the missing data and then a new estimate of the parameter is derived by maximizing the expected log likelihood function. In our problem, we derive the expected values of the “missing data” (namely, the population sizes) based on the current parameter estimate so as to reduce a complex nonlinear regression to a series of simple linear regressions.

3.1. Model Identification

In the iterative approach described above, we circumvent the problem of unobservable population paths by constructing them using current best estimates of the parameters. Because of the missing information, we encounter the problem of parameter identifiability.

A set of model parameters is identifiable if no otherset of parameter values leads to the same probabil-ity distribution of the dependent variables, in whichcase the two parameter points are observationally equiv-alent (Rothenberg 1971). If two parameter points areobservationally equivalent, then we cannot statisti-cally distinguish one from the other. In this problem,this would imply that there might be multiple sets ofÂ parameter values that could generate the same salesdistribution. In the special case where the measure-ment error is zero, identifiability is equivalent to theexistence of a unique fixed point.

In our problem, if the population paths $x_i(t)$, $t = 1, \ldots, H$, are observable, then the model described by Equation (7) is a linear model, $\mathbf{s} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, which is identifiable if the error term $\varepsilon_i$ has zero mean and is independent of $X$, and the matrix $X$ is of full rank (Greene 2003).

However, the observation of $X$ is in general not readily available and the true model is $\mathbf{s} = X(\boldsymbol{\beta})\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, which is nonlinear. Rothenberg (1971) shows that a nonlinear model is locally identifiable if the information matrix, defined by $R(\boldsymbol{\beta}) = [r_{ij}(\boldsymbol{\beta})] = E[\partial \log f/\partial \beta_i \cdot \partial \log f/\partial \beta_j]$, where $f$ is the probability density of the dependent variable for a given set of parameter values $\boldsymbol{\beta}$, is nonsingular at any regular point of the matrix $R(\boldsymbol{\beta})$. In addition, if $f$ belongs to a special class of the exponential family (e.g., multivariate normal), then the parameter vector $\boldsymbol{\beta}$ is globally identifiable. A straightforward application of the Rothenberg result leads to the following proposition.

Downloaded from informs.org by [129.219.247.33] on 28 August 2014, at 16:34. For personal use only, all rights reserved.


Proposition 3.1. Assume that a noise term $\varepsilon_i(t)$ is added to the sales $s_i(t)$ (Equation (7)), where $\varepsilon_i(t)$, $i = 1, \ldots, n$, are independent and normally distributed. Then global identification requires $[\nabla_{\boldsymbol{\beta}}(X(\boldsymbol{\beta})\boldsymbol{\beta})]^T$ to be of full rank.

As pointed out by Amemiya (1985), nonlinearity generally helps identification so that the full rank requirement does not imply that the number of variables needs to be greater than or equal to the number of parameters. Therefore, the condition that $[\nabla_{\boldsymbol{\beta}}(X(\boldsymbol{\beta})\boldsymbol{\beta})]^T$ be of full rank is not restrictive but in fact is easier to satisfy than in an entirely linear system.
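As a rough illustration of the full-rank condition in Proposition 3.1, the sketch below approximates the Jacobian of the map $\boldsymbol{\beta} \mapsto X(\boldsymbol{\beta})\boldsymbol{\beta}$ by central finite differences and checks its rank numerically. The function `toy_design` is a hypothetical stand-in for the population matrix of Equation (12), which is not reproduced here; the step size `h` and the test point `beta0` are likewise assumptions made only for illustration.

```python
import numpy as np

def toy_design(beta):
    """Hypothetical stand-in for X(beta); NOT the population matrix of Equation (12)."""
    t = np.arange(1, 11, dtype=float)        # 10 observation times
    col1 = np.exp(-beta[0] * t)              # column that decays at a beta-dependent rate
    col2 = 1.0 - np.exp(-beta[1] * t)        # column that saturates at a beta-dependent rate
    return np.column_stack([col1, col2])

def model_jacobian(design, beta, h=1e-6):
    """Central-difference Jacobian of the map beta -> design(beta) @ beta."""
    beta = np.asarray(beta, dtype=float)
    n_obs = design(beta).shape[0]
    jac = np.empty((n_obs, beta.size))
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = h
        f_plus = design(beta + step) @ (beta + step)
        f_minus = design(beta - step) @ (beta - step)
        jac[:, j] = (f_plus - f_minus) / (2.0 * h)
    return jac

beta0 = np.array([0.05, 0.2])
J = model_jacobian(toy_design, beta0)
# Full column rank of J corresponds to the rank condition at this parameter point.
full_rank = np.linalg.matrix_rank(J) == beta0.size
```

A numerical rank check of this kind only probes the condition at the chosen parameter point; it does not establish identifiability globally.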

3.2. Model Convergence

The identification condition ensures that no two parameter sets generate the same probability distribution of sales. However, it does not guarantee that, for any given set of sales data, the procedure converges or that it converges to a stationary point of the optimization problem $\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} v(\boldsymbol{\beta})$, where $v(\boldsymbol{\beta})$ is defined in Equation (13).

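In this section, we present a modification of Equation (14) that leads to convergence. Specifically, the parameter values used in the next iteration, $\boldsymbol{\beta}^{k+1}$, are determined as follows: Let $X(\boldsymbol{\beta}^k)$ be the population matrix constructed using the parameter vector $\boldsymbol{\beta}^k$ and let $\mathbf{b}^k$ be the optimal parameter values obtained through the linear regression using Equation (11), i.e., $\mathbf{b}^k = [X(\boldsymbol{\beta}^k)^T X(\boldsymbol{\beta}^k)]^{-1} X(\boldsymbol{\beta}^k)^T \mathbf{s}$. We define the direction vector $\mathbf{d}^k = \mathbf{b}^k - \boldsymbol{\beta}^k$ and update the parameter estimates by $\boldsymbol{\beta}^{k+1} = \boldsymbol{\beta}^k + \alpha^k \mathbf{d}^k$, where $\alpha^k \in (0, 1]$ represents a scalar step size.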

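To fix ideas, we follow the definition in Bertsekas (2003) regarding a descent direction in gradient descent algorithms.

Definition 1. Let $f(\boldsymbol{\beta})$ be a continuously differentiable function of the vector $\boldsymbol{\beta}$. A sequence $\{\mathbf{d}^k\}_{k \in \mathcal{K}}$ is gradient related to $\boldsymbol{\beta}^k$ if $\{\mathbf{d}^k\}_{k \in \mathcal{K}}$ is bounded and $\limsup_{k \to \infty,\, k \in \mathcal{K}} \nabla f(\boldsymbol{\beta}^k)^T \mathbf{d}^k < 0$.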

Lemma 3.2. Define the sequences $\{\mathbf{d}^k\}$ and $\{\boldsymbol{\beta}^k\}$ such that $\mathbf{d}^k = \mathbf{b}^k - \boldsymbol{\beta}^k$ and $\boldsymbol{\beta}^{k+1} = \boldsymbol{\beta}^k + \alpha^k \mathbf{d}^k$. Assume that $[\nabla_{\boldsymbol{\beta}}(X\boldsymbol{\beta}^k)]^T [X^T X]^{-1} X^T$ is positive definite. Then the sequence $\{\mathbf{d}^k\}$ is gradient related to $\{\boldsymbol{\beta}^k\}$.

The proof of Lemma 3.2 is provided in Online Appendix A.2. Employing a result in Bertsekas (2003, Proposition 1.2.1, p. 43), we show that the sequence $\{\boldsymbol{\beta}^k\}$ converges to a stationary point of $v(\boldsymbol{\beta})$ if $\boldsymbol{\beta}^k$ is sufficiently small and the step size $\alpha^k$ is properly chosen.

Corollary 3.3. Assume that the positive definiteness condition in Lemma 3.2 is satisfied. If the step size $\alpha^k$ is chosen by the Armijo rule or any step size rule that yields a larger cost reduction, i.e., a larger reduction in $v(\boldsymbol{\beta})$, at each iteration step than the Armijo rule, the sequence $\{\boldsymbol{\beta}^k\}$ converges to a stationary point of $v(\boldsymbol{\beta})$.

The Armijo rule is a successive reduction rule such that a sufficiently large cost reduction is achieved (see Bertsekas 2003 for more details). The proof of Corollary 3.3 is straightforward: The positive definiteness condition leads to a gradient-related sequence $\{\mathbf{d}^k\}$. According to Proposition 1.2.1 in Bertsekas (2003), if the step size is determined by the Armijo rule or one with higher cost reduction than the Armijo rule in each step, then every limit point of $\{\boldsymbol{\beta}^k\}$ is a stationary point. A direct consequence of Corollary 3.3 is that the estimator obtained using this approach is the nonlinear least squares estimator by definition (see Greene 2003).
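The update with a regression step, a direction $\mathbf{d}^k = \mathbf{b}^k - \boldsymbol{\beta}^k$, and an Armijo step size can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter-dependent design `design(beta)` is a hypothetical toy model (not Equation (12)), the gradient of $v$ is approximated by finite differences, and the Armijo constants `c` and `shrink` are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical toy model (NOT Equation (12)): X(beta) = A @ diag(1 / (1 + beta)),
# so the regressors depend on the current parameter estimate.
rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.normal(size=(40, 3)))    # fixed matrix with orthonormal columns

def design(beta):
    return A / (1.0 + beta)                      # scales column j by 1 / (1 + beta_j)

def cost(beta, s):
    resid = s - design(beta) @ beta
    return float(resid @ resid)                  # v(beta): sum of squared errors

def gradient(beta, s, h=1e-7):
    """Central-difference gradient of v (adequate for a sketch)."""
    g = np.empty_like(beta)
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = h
        g[j] = (cost(beta + step, s) - cost(beta - step, s)) / (2.0 * h)
    return g

def iterative_descent(s, beta0, c=1e-4, shrink=0.5, max_iter=200, tol=1e-12):
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        X = design(beta)
        b, *_ = np.linalg.lstsq(X, s, rcond=None)    # linear-regression step
        d = b - beta                                 # direction d^k = b^k - beta^k
        if np.linalg.norm(d) < tol:
            break
        slope = gradient(beta, s) @ d                # negative when d is a descent direction
        alpha, v0 = 1.0, cost(beta, s)
        # Armijo backtracking: shrink alpha until sufficient decrease (at most 30 times)
        for _ in range(30):
            if cost(beta + alpha * d, s) <= v0 + c * alpha * slope:
                break
            alpha *= shrink
        beta = beta + alpha * d
    return beta

beta_true = np.array([0.05, 0.02, 0.3])
s = design(beta_true) @ beta_true                    # noise-free "sales"
beta_hat = iterative_descent(s, beta0=np.full(3, 0.5))
```

With $\alpha^k = 1$ at every step the loop reduces to the plain fixed-point iteration; the backtracking only matters when a full step fails to reduce $v$ sufficiently.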

Corollary 3.4. Suppose that the positive definiteness condition in Lemma 3.2 holds. Then the iterative approach described in Corollary 3.3 yields the nonlinear least squares estimator.

We refer to the iterative approach in Corollary 3.3 as the “iterative-descent” approach hereafter. In general, the condition in Lemma 3.2 is difficult to verify even for a given $\boldsymbol{\beta}^k$. However, we show that this condition is always satisfied asymptotically when $\boldsymbol{\beta}^k \to 0$.

Proposition 3.5. The matrix $\lim_{\boldsymbol{\beta} \to 0} [\nabla(X\boldsymbol{\beta})]^T (X^T X)^{-1} X^T$ is positive definite and thus the iterative-descent approach converges to a stationary point of $v(\boldsymbol{\beta})$, i.e., the nonlinear least squares estimator of $\boldsymbol{\beta}$, if $\boldsymbol{\beta}^k \to 0$.

The proof of Proposition 3.5 is given in Online Appendix A.3.

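Next, we remark that as $\boldsymbol{\beta} \to 0$, our method not only converges, but also converges fast. Because $X$ is full rank, $X^T X$ is invertible. Thus, we can rewrite $\mathbf{d}^k$ as

$$\mathbf{d}^k = -(\boldsymbol{\beta}^k - \mathbf{b}^k) = -\left[\boldsymbol{\beta}^k - (X^T X)^{-1} X^T \mathbf{s}\right] = -(X^T X)^{-1} X^T (X\boldsymbol{\beta}^k - \mathbf{s}),$$

where we omit the argument $\boldsymbol{\beta}^k$ of $X$ for brevity. From the definition of $v(\boldsymbol{\beta})$, we have $\nabla v(\boldsymbol{\beta}^k) = 2[\nabla(X\boldsymbol{\beta}^k)] \cdot (X\boldsymbol{\beta}^k - \mathbf{s})$. Therefore, as $\boldsymbol{\beta}^k \to 0$, the descent direction $\mathbf{d}^k = -(X^T X)^{-1} X^T (X\boldsymbol{\beta}^k - \mathbf{s}) \to -\tfrac{1}{2}(X^T X)^{-1} X^T \cdot (X X^T)^{-1} X \nabla v(\boldsymbol{\beta}^k)$. Because of the full-rank assumption of $X$, the matrix $(X^T X)^{-1} X^T (X X^T)^{-1} X$ is positive definite (see Online Appendix A.4). Therefore, our method is a quasi-Newton method, which typically converges fast (Bertsekas 2003, p. 148) when used in combination with the Armijo rule or the minimization rule (in which a search along the direction of $\mathbf{d}^k$ is performed to find the step size that maximizes cost reduction).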

3.3. Estimate the Covariance

When the condition in Proposition 3.5 is satisfied, the method converges to the set of parameter values that minimizes the sum of squared errors, i.e., the nonlinear least squares estimator. Consequently, it retains the asymptotic properties of the nonlinear least squares estimator, i.e., consistency and asymptotic normality under general conditions.

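Proposition 3.6. Suppose the parameter space of $\boldsymbol{\beta}$ is compact. Further, suppose $\lim_{N \to \infty} v(\cdot)$ is a continuous and differentiable function and has a unique minimum at the true parameter $\boldsymbol{\beta}$, where $N \equiv nH$. In addition, assume that the errors $\boldsymbol{\varepsilon}$ are independent and homoscedastic, i.e., $E[\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T] = \sigma^2 I$, where $\sigma^2$ is a scalar and $I$ is the identity matrix. Let $\hat{\boldsymbol{\beta}}$ be the estimator obtained from the iterative-descent approach. Then $\hat{\boldsymbol{\beta}}$ is consistent, i.e., $\lim_{N \to \infty} \hat{\boldsymbol{\beta}} = \boldsymbol{\beta}$, and asymptotically normal, i.e., $\lim_{N \to \infty} \hat{\boldsymbol{\beta}} \sim \text{Normal}(\boldsymbol{\beta}, (\sigma^2/N) Q^{-1})$, where $\sigma^2$ is the variance of $\boldsymbol{\varepsilon}$, and

$$Q = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \left(\frac{\partial[\boldsymbol{\chi}_i(\boldsymbol{\beta})\boldsymbol{\beta}]}{\partial \boldsymbol{\beta}}\right) \left(\frac{\partial[\boldsymbol{\chi}_i(\boldsymbol{\beta})\boldsymbol{\beta}]}{\partial \boldsymbol{\beta}^T}\right),$$

where $\boldsymbol{\chi}_i(\boldsymbol{\beta})$ is the $i$th row of $X(\boldsymbol{\beta})$.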

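The result follows from Greene (2003, Theorems 9.1 and 9.2). A useful consequence of this is that we can estimate the asymptotic variance of the estimator $\hat{\boldsymbol{\beta}}$. It is easy to show that $\partial \boldsymbol{\chi}_i(\boldsymbol{\beta})/\partial \boldsymbol{\beta}$ is bounded (see the proof of Proposition 3.5 in Online Appendix A.3). Since $\partial[\boldsymbol{\chi}_i(\boldsymbol{\beta})\boldsymbol{\beta}]/\partial \boldsymbol{\beta} = \boldsymbol{\chi}_i(\boldsymbol{\beta}) + \boldsymbol{\beta}(\partial \boldsymbol{\chi}_i(\boldsymbol{\beta})/\partial \boldsymbol{\beta})$, we have $\lim_{\boldsymbol{\beta} \to 0} Q = \lim_{N \to \infty} (X^T X)/N$. In addition, $\sigma^2$ can be estimated with $\mathbf{e}^T \mathbf{e}/N$, where $\mathbf{e}$ is the residual vector. Therefore, when $\boldsymbol{\beta}$ is small, we can estimate the covariance matrix of $\hat{\boldsymbol{\beta}}$ by $((\mathbf{e}^T \mathbf{e})/N)(X^T X)^{-1}$.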
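The small-$\boldsymbol{\beta}$ covariance estimate $((\mathbf{e}^T\mathbf{e})/N)(X^TX)^{-1}$ is a short computation once the population matrix and residuals from the final iteration are available. The sketch below assumes they are given as NumPy arrays; the function name and the tiny example data are hypothetical.

```python
import numpy as np

def homoscedastic_cov(X, resid):
    """Estimate the parameter covariance as (e'e / N) (X'X)^{-1}."""
    n_obs = X.shape[0]
    sigma2_hat = float(resid @ resid) / n_obs     # estimate of sigma^2
    return sigma2_hat * np.linalg.inv(X.T @ X)

# Tiny illustrative data set: 4 observations, 2 parameters.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
e_demo = np.array([1.0, 1.0, -1.0, -1.0])         # residuals orthogonal to the columns of X
cov_demo = homoscedastic_cov(X_demo, e_demo)
std_errors = np.sqrt(np.diag(cov_demo))           # standard errors of the two parameters
```

Here $\mathbf{e}^T\mathbf{e}/N = 1$ and $X^TX = 2I$, so the estimated covariance is $0.5\,I$.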

3.3.1. Correction for Heteroscedasticity and Autocorrelation. While Proposition 3.6 suggests a straightforward method to estimate the covariance matrix of the parameter estimator, it relies on the assumption that the error term $\boldsymbol{\varepsilon}$ is spherical, i.e., $E[\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T] = \sigma^2 I$, where $I$ is the identity matrix. If one could assume that the error term $\boldsymbol{\varepsilon}$ is independent of the underlying demand-generating process and is merely a bookkeeping error in sales, this would be a reasonable way to estimate the covariance. In general, it is difficult to argue that the error term has constant variance over time. To deal with heteroscedasticity, we can use the White estimator (White 1980) to estimate the asymptotic variance:

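$$\frac{1}{N} \left(\frac{Z^T Z}{N}\right)^{-1} \left(\frac{1}{N} \sum_{i=1}^{N} e_i^2 \mathbf{z}_i \mathbf{z}_i^T\right) \left(\frac{Z^T Z}{N}\right)^{-1},$$

where $Z = \partial[X(\boldsymbol{\beta})\boldsymbol{\beta}]/\partial \boldsymbol{\beta}$ and $\mathbf{z}_i$ is the $i$th row of $Z$. As $\boldsymbol{\beta} \to 0$, we estimate the covariance matrix of $\hat{\boldsymbol{\beta}}$ with $N(X^T X)^{-1} S_0 (X^T X)^{-1}$, where $S_0 = (1/N) \sum_{i=1}^{N} e_i^2 \boldsymbol{\chi}_i \boldsymbol{\chi}_i^T$.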
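A minimal sketch of the heteroscedasticity-robust estimate $N(X^TX)^{-1}S_0(X^TX)^{-1}$ follows, again assuming that the population matrix and the residual vector are available as NumPy arrays. The function name and the example data are hypothetical.

```python
import numpy as np

def white_cov(X, resid):
    """White-type covariance N (X'X)^{-1} S0 (X'X)^{-1}, S0 = (1/N) sum_i e_i^2 x_i x_i'."""
    n_obs = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    # Weight each row's outer product x_i x_i' by its squared residual e_i^2.
    S0 = (X * resid[:, None] ** 2).T @ X / n_obs
    return n_obs * XtX_inv @ S0 @ XtX_inv

X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
e_demo = np.array([2.0, 1.0, -2.0, -1.0])      # residual size differs across observations
cov_white = white_cov(X_demo, e_demo)
```

In this example the first parameter is informed only by the two noisier observations, so its robust variance (2.0) is four times that of the second parameter (0.5), a difference the spherical-error formula would average away.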

For more general cases in which autocorrelation in the data cannot be ignored, one can use the Newey–West (Newey and West 1988) covariance estimator $N(X^T X)^{-1} Q (X^T X)^{-1}$, in which the matrix $Q$, when applied to our problem, is estimated with

$$S_0 + \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{l=1}^{L} \sum_{t=l+1}^{H} w_l\, e_{\kappa(i,t)}\, e_{\kappa(j,t-l)} \left(\boldsymbol{\chi}_{\kappa(i,t)} \boldsymbol{\chi}_{\kappa(j,t-l)}^T + \boldsymbol{\chi}_{\kappa(j,t-l)} \boldsymbol{\chi}_{\kappa(i,t)}^T\right), \tag{15}$$

where the subscript $\kappa(i,t) = t + (i-1)H$, the weight $w_l = 1 - l/(L+1)$, and $L$ is typically set such that $L \approx H^{1/4}$.
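The sums in Equation (15) can be collapsed by first adding the residual-weighted rows of all products within each period. The sketch below assumes that the observations are stacked product by product, so that row $t + (i-1)H$ (1-based) holds product $i$ at period $t$. The function name, its arguments, and the rounding used for the default lag are illustrative assumptions.

```python
import numpy as np

def newey_west_cov(X, resid, n_products, horizon, max_lag=None):
    """Covariance N (X'X)^{-1} Q (X'X)^{-1} with Q estimated as in Equation (15)."""
    N, p = X.shape
    if N != n_products * horizon:
        raise ValueError("X must have n_products * horizon rows")
    if max_lag is None:
        max_lag = int(round(horizon ** 0.25))          # L roughly H^(1/4)
    XtX_inv = np.linalg.inv(X.T @ X)
    S0 = (X * resid[:, None] ** 2).T @ X / N
    # g[t] = sum over products of e * chi at period t; shape (horizon, p)
    g = (X * resid[:, None]).reshape(n_products, horizon, p).sum(axis=0)
    Q = S0.copy()
    for lag in range(1, max_lag + 1):
        w = 1.0 - lag / (max_lag + 1.0)                # weight w_l = 1 - l / (L + 1)
        G = g[lag:].T @ g[:-lag]                       # sum_t g[t] g[t - lag]'
        Q += w * (G + G.T) / N
    return N * XtX_inv @ Q @ XtX_inv
```

Setting `max_lag=0` drops every autocorrelation term and leaves the White-type estimate $N(X^TX)^{-1}S_0(X^TX)^{-1}$.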

To summarize the results in §3, we show that if the true $\boldsymbol{\beta}$ values are sufficiently small, which can be enforced by restricting the discrete time unit to a small interval, the iterative-descent approach always converges to a stationary point of $v(\boldsymbol{\beta})$. With the assumption of full rank (Proposition 3.1), the model is identifiable. Moreover, we show that our method yields an estimator that is consistent and asymptotically normal. We also show how the covariance matrix can be estimated and corrected in the presence of heteroscedasticity and autocorrelation. In the next section, we apply this iterative-descent approach to sales data that are stochastically generated from the model given by Equation (12) and demonstrate its performance. We have applied both the Armijo rule and a “limited” minimization rule that searches along the direction $\mathbf{d}^k$ but within a bounded interval between $\boldsymbol{\beta}^k$ and $\mathbf{b}^k$. Both work well for the simulated data, although the latter appears to work better for the Intel application. We present the results obtained with the limited minimization rule.

4. Performance on Simulated Demand Data

In this section, we assume that the underlying model is $\mathbf{s} = X(\boldsymbol{\beta})\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where $\boldsymbol{\varepsilon}$ is normally distributed with zero mean. We assume that errors are uncorrelated but allow them to be heteroscedastic. For ease of reference, denote the variance of observation $i$ by $\sigma_i^2$ and define the vector $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_N)$. We apply the iterative-descent procedure described in §3.2 to solve for the optimal $\boldsymbol{\beta}$ that minimizes the mean squared error.

4.1. Estimation and Fit

We present an example where sales are generated from a model described in Equation (11) where $\boldsymbol{\beta} = (0.005, 0.02, 0.01, 0.30, 0.002, 0.01, 0.005, 0.2)$. All three sources of demand identified in §2 are present, and we correct for a market expansion of 0.1% per period. At the start of the time horizon (time 50), there are four existing generations of products (sales of the first product have already dropped to zero) with known population sizes 0.0156, 0.0399, 0.2744, 0.1131, and the competition population is 1.56. The focal firm introduces new products at a constant pace (one product every 20 time periods). We generate sales data during time $[50, 200]$, which includes sales of 10 products, seven of which are new product introductions. The focal firm and competition start at about the same performance level. Product improvement is 10%, 11%, ..., 19% (over the immediate predecessor), respectively, for the 10 products of the focal firm and a constant 5% for the competition. (Improvement occurs at the same time for both the focal firm and its competition.) In this section, we choose the sales data for the time window $[50, 150]$ (which includes sales of three existing products and five new products) to estimate the parameters, assuming that we have an accurate estimate of the population mix at time 50. In §4.2, we use the remaining sales data (which includes sales of three existing products and two new products) to test forecast performance; we also vary the initial population sizes and the size of the fit/forecast window to test forecast sensitivity.

To validate the iterative procedure, we first test the case with $\boldsymbol{\sigma} = 0$ to confirm that it converges to the true parameters. See Figure 1. In the presence of noise ($\boldsymbol{\sigma} \neq 0$), the procedure in general does not converge to the true parameter value that is used to generate the sales, but rather to the nonlinear least squares estimator of the model $\mathbf{s} = X(\boldsymbol{\beta})\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, as expected. See Table 1 for three examples generated at different noise levels, where we use $\boldsymbol{\sigma}$ = 5%, 10%, 20% to loosely denote the case when the noise term $\boldsymbol{\varepsilon}$ has a standard deviation that is 5%, 10%, 20% of the mean sales, respectively. Note that “Std. error” and “White error” are the standard errors of the parameters without and with correcting for heteroscedasticity, respectively.
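The noise convention used here, a standard deviation stated as a percentage of mean sales, can be sketched as below. This is only an illustration: the function name and the clean sales path are hypothetical, and the sketch applies one common standard deviation to every observation, whereas the setup in this section also allows heteroscedastic errors.

```python
import numpy as np

def add_sales_noise(sales, level, rng):
    """Add zero-mean normal noise whose std. dev. is `level` times the mean sales."""
    sales = np.asarray(sales, dtype=float)
    sigma = level * sales.mean()
    return sales + rng.normal(0.0, sigma, size=sales.shape)

rng = np.random.default_rng(42)
clean_sales = np.linspace(0.01, 0.05, 150)        # hypothetical noise-free sales path
noisy_sales = add_sales_noise(clean_sales, 0.10, rng)   # the "10%" noise case
```

Repeating the call with fresh random draws would give a collection of simulated data sets at the same noise level.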

The statistical significance of the parameters drops as the uncertainty level increases. The smaller parameters tend to lose statistical significance fast (i.e., drop below 90% significance) with increasing uncertainty levels. This suggests that, like most other methods, our model is not appropriate for drawing statistical inferences for data sets with high noise. However, as we show next in this section and in §4.2, it works reasonably well for the purpose of fitting and forecasting sales, even at a high noise level.

Table 1 Parameter Estimates and Errors for Various Levels of Noise

| Parameters | Estimate ($\boldsymbol{\sigma}$ = 5%) | Std. error ($\boldsymbol{\sigma}$ = 5%) | White error ($\boldsymbol{\sigma}$ = 5%) | Estimate ($\boldsymbol{\sigma}$ = 10%) | Std. error ($\boldsymbol{\sigma}$ = 10%) | White error ($\boldsymbol{\sigma}$ = 10%) | Estimate ($\boldsymbol{\sigma}$ = 20%) | Std. error ($\boldsymbol{\sigma}$ = 20%) | White error ($\boldsymbol{\sigma}$ = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_1$ | 0.0052∗∗ | 0.0035 | 0.0007 | 0.0048∗∗ | 0.0058 | 0.0012 | 0.0042∗∗ | 0.0106 | 0.0024 |
| $\beta_2$ | 0.0209∗∗ | 0.0141 | 0.0026 | 0.0230∗∗ | 0.0229 | 0.0043 | 0.0260∗∗ | 0.0421 | 0.0088 |
| $\beta_3$ | 0.0178∗∗ | 0.0053 | 0.0030 | 0.0135∗∗ | 0.0079 | 0.0041 | 0.0192∗∗ | 0.0152 | 0.0073 |
| $\beta_4$ | 0.2547∗∗ | 0.0366 | 0.0183 | 0.2650∗∗ | 0.0553 | 0.0259 | 0.2280∗∗ | 0.1025 | 0.0439 |
| $\beta_5$ | 0.0017∗∗ | 0.0009 | 0.0003 | 0.0029∗∗ | 0.0014 | 0.0006 | 0.0010 | 0.0032 | 0.0014 |
| $\beta_6$ | 0.0110∗∗ | 0.0047 | 0.0019 | 0.0039∗ | 0.0075 | 0.0032 | 0.0109 | 0.0178 | 0.0090 |
| $\beta_7$ | 0.0064∗∗ | 0.0026 | 0.0009 | 0.0015 | 0.0042 | 0.0018 | 0.0059∗ | 0.0095 | 0.0040 |
| $\beta_8$ | 0.1889∗∗ | 0.0163 | 0.0044 | 0.2237∗∗ | 0.0260 | 0.0089 | 0.2583∗∗ | 0.0619 | 0.0240 |

∗ and ∗∗ denote significance at the 90% and 95% levels, respectively, computed based on the White error estimate.

Figure 1 Convergence to the True Parameter Values. Panels (a) and (b) plot the fitted value for $\beta$ against the iteration number; panel (a) shows $\beta_1$, $\beta_3$, $\beta_5$, $\beta_7$ and panel (b) shows $\beta_2$, $\beta_4$, $\beta_6$, $\beta_8$.

To examine the model fit, we generate stochastic sales data for a given set of parameter values and apply the iterative-descent procedure to obtain the corresponding parameter estimates. We then generate the fitted sales curve and compare it with the 95% confidence level sales band predicted by the true parameters. Figure 2(a) illustrates such a comparison. The true parameters for the underlying model are the same as in Figure 1. The noise term ($\boldsymbol{\varepsilon}$) has a standard deviation that is 20% of the mean sales (i.e., $\boldsymbol{\sigma}$ = 20% of mean sales). We generate 500 sets of sales data and apply the iterative-descent method to each data set. We observe that the majority of the fitted curves (solid curves) center around the “true” curve (the dashed curve on the top is the upper bound of the 95% confidence interval and the dotted curve on the bottom is the lower bound). More importantly, this observation stays true even as the uncertainty level increases: As the band for the fitted sales curve becomes wider, the confidence intervals of the true sales curve also broaden.

Figure 2 Fitted Sales Curves and Population Deviation

Furthermore, we examine how well the true population variables $x_i(t)$ are recovered. We predict the population path from the parameter estimates for each of the 500 simulated data sets and compute the mean absolute percentage error (MAPE) of the population for each set of predictions. Figure 2(b) shows the error distribution of the predicted population sizes for $\boldsymbol{\sigma}$ = 20%. We observe that the iterative approach recovers the true population well. (The population deviations are well within 10%, in most cases 3% or 4%.)

Table 2 Forecast Performance Averaged over 500 Data Sets

| Error measure | $\boldsymbol{\sigma}$ = 5% | $\boldsymbol{\sigma}$ = 10% | $\boldsymbol{\sigma}$ = 20% |
|---|---|---|---|
| Average RMSE | 0.0030 | 0.0054 | 0.0102 |
| Average MAE | 0.0022 | 0.0038 | 0.0069 |
| Average MAPE (%) | 9.71 | 15.36 | 28.01 |

4.2. Forecast Performance and Sensitivity

Next, we illustrate the forecast accuracy of the model using a portion of the data to calibrate the model and the remainder to test forecast performance. In particular, for each randomly generated data set, we parameterize the model with sales during time period $[50, 150]$ and then forecast the sales for time $[151, 200]$. We compute forecast errors for each data set, and Table 2 presents the mean forecast errors averaged over the 500 data sets. RMSE is the root mean squared error; MAE is the mean absolute error; MAPE is the mean absolute percentage error.
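For reference, the three error measures can be computed as in the sketch below. The function name is hypothetical, and excluding points with zero actual sales from the MAPE is an assumption made here to avoid division by zero; the text does not say how such points are handled.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return RMSE, MAE, and MAPE (in percent) for one data set."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - actual
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    nonzero = actual != 0                      # assumption: skip zero-sales points in MAPE
    mape = float(100.0 * np.mean(np.abs(err[nonzero] / actual[nonzero])))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape}

errors = forecast_errors(actual=[1.0, 2.0, 4.0], predicted=[1.5, 2.0, 3.0])
# errors["MAE"] is 0.5 and errors["MAPE"] is 25.0 for this small example
```

Averaging these per-data-set values over all simulated data sets gives entries of the kind reported in Table 2.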

Because RMSE is the measure that least squares regression optimizes, we use the RMSE (instead of other error measures) to compare the model fit and forecast performance of different methods. We present the MAE and MAPE to provide additional information.

The observations in Table 2 and Figure 3 both indicate that our model is well behaved. In the majority of the simulated cases, the forecast largely replicates the inherent uncertainty of sales data generated at each noise level, and we do not observe significant amplification as the noise level increases.

Figure 3 Forecast Error Distribution ($\boldsymbol{\sigma}$ = 20%)



Table 3 Forecast Sensitivity to Initial Population ($\boldsymbol{\sigma}$ = 10%, Fit/Forecast Window = [50, 150]/[151, 200])

| Case | Base case | 1 (over 10%) | 2 (under 10%) | 3 (mixed) | 4 (mixed) |
|---|---|---|---|---|---|
| Average RMSE | 0.00531 | 0.00533 | 0.00535 | 0.00533 | 0.00531 |
| Average MAE | 0.0038 | 0.0039 | 0.0038 | 0.0037 | 0.0038 |
| Average MAPE (%) | 15.10 | 17.35 | 14.48 | 14.54 | 16.31 |

4.2.1. Sensitivity to Initial Population. We have assumed that the initial population sizes are known. In the following, we experiment with cases where the company may over- or underestimate these values. Table 3 illustrates forecast performance of the model when the population sizes are overestimated by 10% (case 1) and underestimated by 10% (case 2) as well as in cases in which some products are overestimated while others are underestimated (“mixed”). For example, in case 3, the first two products are overestimated by 10% and the next two products are underestimated by 10%, whereas in case 4 the opposite is true. The results are averaged over 100 data sets.

The forecast performance is evidently resilient to perturbations in the initial population estimates. This may appear counterintuitive but is an explainable characteristic of the method. Specifically, the method can self-correct with respect to errors in the initial population estimation; as time elapses and new sales occur, the memory of the initial population and its effect fades away rather quickly.

4.2.2. Sensitivity to Fit/Forecast Time Window. Table 4 demonstrates the sensitivity of forecast errors to the number of time periods (or similarly, the number of products) used to parameterize the model. (Data are generated with $\boldsymbol{\sigma}$ = 10%; forecasts are averaged over 100 random data sets.) As expected, forecast accuracy decreases as the training set becomes smaller. Nevertheless, one still obtains reasonable forecasts when the size of the training data is at least comparable to the test data.

Table 4 Sensitivity of Forecast Errors to the Number of Time Periods ($\boldsymbol{\sigma}$ = 10%)

| Fit/forecast window | [50, 100]/[101, 150] | [50, 125]/[126, 175] | [50, 150]/[151, 200] |
|---|---|---|---|
| Average RMSE | 0.0056 | 0.0052 | 0.0053 |
| Average MAE | 0.0041 | 0.0038 | 0.0038 |
| Average MAPE (%) | 23.85 | 16.72 | 15.10 |

Table 5 Fit and Forecast Comparison ($\boldsymbol{\sigma}$ = 10%, Fit/Forecast Window = [50, 150]/[151, 200])

| Error measure | Constrained (fit) | Constrained (forecast) | Unconstrained (fit) | Unconstrained (forecast) |
|---|---|---|---|---|
| Average RMSE | 0.0034 | 0.0053 | 0.0034 | 0.0056 |
| Average MAE | 0.0021 | 0.0038 | 0.0022 | 0.0040 |
| Average MAPE (%) | 11.57 | 15.10 | 13.52 | 18.36 |

4.2.3. Constrained vs. Unconstrained Regression. We interpret the parameters $\boldsymbol{\beta}$ as nonnegative because the gap of product strength between product $i$ and product $j$, $\Delta f_{ij}$, has a nonnegative effect on the transition parameters $p_{ij}$ and $q_{ij}$. In the presence of noise, it is possible that an unconstrained linear regression may result in negative value(s) for some component(s) of the estimated parameter vector. Therefore, in our numerical studies, we have constrained the parameters to be nonnegative. Using unconstrained regression may allow one to achieve a better fit to the training data; however, it inevitably leads to poorer forecasting performance. Table 5 demonstrates the fit and forecast performance of the constrained and unconstrained models. (Data are generated with $\boldsymbol{\sigma}$ = 10%; fit and forecast errors are averaged over 100 random data sets.) The RMSE for the training data is roughly the same for the constrained and unconstrained cases; however, the forecast RMSE of the unconstrained model is worse than that of the constrained model. Similar observations are made in the Intel application in §5. Therefore, one is better off taking advantage of the knowledge of the underlying dynamics by constraining the parameters to nonnegative values to avoid a false model that provides a better fit but a poor forecast.
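The text does not state how the nonnegativity constraint is imposed in the regression step. One simple possibility, shown here purely as an assumption, is projected gradient descent on the least squares objective; the function name, iteration limit, and example data are hypothetical.

```python
import numpy as np

def nonnegative_least_squares(X, s, max_iter=5000, tol=1e-12):
    """Minimize ||X b - s||^2 subject to b >= 0 by projected gradient descent."""
    lipschitz = np.linalg.norm(X, 2) ** 2        # largest eigenvalue of X'X
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        grad = X.T @ (X @ b - s)                 # gradient of 0.5 * ||X b - s||^2
        b_new = np.maximum(0.0, b - grad / lipschitz)   # gradient step, then clip at zero
        converged = np.linalg.norm(b_new - b) < tol
        b = b_new
        if converged:
            break
    return b

# Example in which the unconstrained fit has a negative coefficient.
X_demo = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
s_demo = np.array([3.0, 2.0, 1.0])
b_free, *_ = np.linalg.lstsq(X_demo, s_demo, rcond=None)   # approximately [4, -1]
b_nonneg = nonnegative_least_squares(X_demo, s_demo)        # approximately [2, 0]
```

In practice a library routine such as `scipy.optimize.nnls` could replace the hand-written loop. The constrained fit has a larger training error than the unconstrained one by construction, which matches the pattern in Table 5.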

In summary, the iterative-descent approach works well for simulated data. In §5, we apply the proposed method to Intel data and compare it with the Bass, Norton–Bass, and Jun–Park models. (Such a comparison would not be appropriate in §4 because the simulated data are generated using our model.)

5. Application to the Extreme Edition Microprocessor Market

In this section, we apply our model to the sales data of Intel’s “extreme edition” microprocessors. These high-end desktop processors are sold primarily to end users who participate in sophisticated computer games, the “extreme gamers.” Therefore, this is also referred to as the extreme-gamers market.

5.1. Data Description

Included in the data are the introduction date, performance score, and price for each product sold by Intel as well as those for a major competitor. Weekly sales data cover a four-year time window with a total of 11 products from Intel. (We include in the online appendix the masked data set, as well as a plot of the masked sales for Intel products.)

Performance scores for both Intel and its competition’s products are obtained from the Standard Performance Evaluation Corporation (SPEC) based on both integer and floating-point benchmarks (SPEC 2010), which are commonly adopted industry standards. The extreme-gamers market is generally perceived as price-insensitive. Nevertheless, we apply the method in two settings: one in which product strength is given by performance score alone and one in which product strength is given by performance/price ratio. The latter is commonly adopted at Intel as a measure of product strength for markets that are price-sensitive. We compare these two alternatives for the extreme-gamers market. The data span a time period that is characterized by a performance race between Intel and its competitor, with the performance gap between the two companies widening rapidly over time despite the initial performance lead and steady improvement by the competitor (Figure 4(a)). Figure 4(b) shows the performance/price ratio of the two firms. The competitor priced its product much lower than Intel and was thus leading on performance/price ratio. In addition, the total market during this time was growing, reflecting a known market trend. As mentioned earlier, such a trend is assumed to be known. Our focus is on how the total market is split between Intel and its competition and among the products within Intel. We use 120 weeks of data to parameterize the model (which includes sales of nine products) and the remaining 96 weeks of data (which includes sales of six products, two of which were introduced after week 120) to test the forecast accuracy of the proposed model. The forecast is based on a “frozen” fit sample of 120 weeks, which differs from the rolling-horizon forecast typically seen in the literature of diffusion models. The rolling-horizon forecast is unsuitable for strategic-planning decisions, which are often made long before the actual sales.

First, we adjust for the aggregate market trend. Recall that the market-expansion term at time $t$ is the sales growth due to market expansion as a percentage of the most recent sales. Ideally, we would use the originally “forecasted” values of this term. However, we do not have data on the forecasted sales growth for new expansion. Therefore, to calibrate the model, we use the actual sales (which are available) to estimate the growth trend of total sales, and use it as a proxy for the growth trend of new expansion. This is reasonable if the growth trend due to new market expansion and the growth trend of total sales follow a similar pattern. For the extreme-gamers market, the total market during this time window was on a logistic growth path. We thus fit a logistic growth curve to the estimated total sales, and the growth rate is obtained through this fitted curve. Figure 5(a) shows the fitted logistic curve.

Figure 4 Performance and Performance/Price. Panel (a): performance versus time (week); panel (b): performance/price versus time (week). Series: Intel products and competition.

Before applying the iterative-descent approach, we need to estimate the initial population mix. The extreme-gamers market is initially dominated by the competition, so we set $x_i(0)$ to zero and $y(0)$ to the size of the initial market. Experts at Intel estimate this to be approximately 7.5 million at the start of this time window. We show with additional sensitivity analysis that the model is robust to fluctuations in this estimate due to the reasons discussed in §4 (see Tables 14 and 15 provided in the online appendix).

5.2. Estimation and Forecast

In addition to the two alternatives of product strength measure, namely, performance alone (perf-only) and performance/price ratio (perf./price), we also explore two options for determining the product strength of the competition, $f_y(t)$: one defined as the strength of the strongest product of the competition (best-comp), the other defined as the mean product strength of the competition (mean-comp). Moreover, as in §4, we examine the forecast performance with (constrained) and without (unconstrained) a nonnegativity constraint on the parameters.

Figure 5 Application of Extreme-Gamers Market. Panel (a), total market: total sales (million) versus time (week), with the fitted curve. Panel (b), convergence of parameters: fitted value for $\beta$ (scale $\times 10^{-3}$) versus iteration, for $\beta_1$ through $\beta_8$.

Figure 5(b) shows the convergence path of the parameters for the “constrained, perf-only, best-comp” case. Similar convergence patterns are observed for the other cases.

Table 6 Parameter Estimates for the Extreme-Gamers Market (Constrained, Best-Comp)

| Parameters | $\beta_1$ | $\beta_2$ | $\beta_3$ | $\beta_4$ | $\beta_5$ | $\beta_6$ | $\beta_7$ | $\beta_8$ |
|---|---|---|---|---|---|---|---|---|
| Perf. only: Estimate | 0.0034∗ | 0 | 0.00045 | 0 | 0.0037∗∗ | 0.0089∗ | 0.00091 | 0.0022 |
| Perf. only: Std. error | 0.0011 | 0.0019 | 0.0012 | 0.0038 | 0.0008 | 0.0028 | 0.0008 | 0.0041 |
| Perf. only: White error | 0.0025 | 0.0032 | 0.0018 | 0.0035 | 0.0015 | 0.0059 | 0.0010 | 0.0055 |
| Perf. only: Newey–West error | 0.0048 | 0.0073 | 0.0038 | 0.0066 | 0.0025 | 0.0092 | 0.0016 | 0.0077 |
| Perf./price ratio: Estimate | 0.0150∗∗ | 0 | 0.0310∗∗ | 0.0806∗∗ | 0.0039∗∗ | 0.0045 | 0 | 0 |
| Perf./price ratio: Std. error | 0.0020 | 0.0095 | 0.0061 | 0.0298 | 0.0012 | 0.0035 | 0.0028 | 0.0093 |
| Perf./price ratio: White error | 0.0021 | 0.0088 | 0.0052 | 0.0268 | 0.0018 | 0.0062 | 0.0027 | 0.0097 |
| Perf./price ratio: Newey–West error | 0.0033 | 0.0124 | 0.0075 | 0.0432 | 0.0030 | 0.0092 | 0.0043 | 0.0144 |

Tables 6 and 7 summarize the parameter estimates, standard errors, and White and Newey–West errors obtained using sales data from the first 120 weeks (first nine products). The model fit and forecast performances are reported in Tables 8 and 9, respectively. In addition to the RMSE, MAE, and MAPE, we also compute the median absolute percentage error (MdAPE), MAPE of the cumulative sales (cumMAPE), MAPE of the peak sales (peakMAPE), mean absolute error of the peak time (timeMAE), and the $R^2$ value. (Note that for nonlinear regression, the $R^2$ value is not necessarily between 0 and 1.) With the exception of peakMAPE and timeMAE, errors are averaged over all available data points (which include sales for each product at each time period during its selling window). To compute the error for model fit (Table 8), we use data points within the first 120 weeks; to compute the forecast error, we use data points in the 96 weeks starting from week 121. Because of the life-cycle effect, predicted and actual sales may differ by a small absolute amount but a high percentage amount near the tail regions of each product. This is particularly true in a real data application in which the predicted and actual sales window may differ significantly near the tails (possibly due to companies enforcing an end-of-life deadline for a product instead of letting it follow the course of a diffusion). Thus, for the Intel application we also report the MdAPE, which partially offsets this problem. Each Intel product has a sales peak in its life cycle, and peakMAPE and timeMAE, respectively, measure error in the peak volume and peak time of the products, which are important for Intel because the company needs to plan capacity appropriately to accommodate peak sales. These values are averaged over the number of products (for the training data, averaged over all products that peaked during the first 120 weeks, and for the test data, averaged over all products that peaked after week 120).
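The additional error measures can be sketched as follows for sales arranged as a products-by-periods array. The exact formulas are not given in the text, so every definition in this sketch is an assumption: MdAPE is taken over all points with nonzero actual sales, cumMAPE compares running cumulative sales per product at each period, and the peak measures compare the maximum of each product's predicted and actual sales paths. The function name and example data are hypothetical.

```python
import numpy as np

def life_cycle_errors(actual, predicted):
    """Assumed definitions of MdAPE, cumMAPE, peakMAPE (percent) and timeMAE (periods)."""
    actual = np.asarray(actual, dtype=float)        # shape: (products, periods)
    predicted = np.asarray(predicted, dtype=float)
    nonzero = actual != 0
    ape = np.abs((predicted - actual)[nonzero] / actual[nonzero])
    cum_a, cum_p = actual.cumsum(axis=1), predicted.cumsum(axis=1)
    cum_ape = np.abs(cum_p - cum_a) / cum_a         # assumes positive cumulative sales
    peak_a, peak_p = actual.max(axis=1), predicted.max(axis=1)
    time_a, time_p = actual.argmax(axis=1), predicted.argmax(axis=1)
    return {
        "MdAPE": float(100.0 * np.median(ape)),
        "cumMAPE": float(100.0 * cum_ape.mean()),
        "peakMAPE": float(100.0 * np.mean(np.abs(peak_p - peak_a) / peak_a)),
        "timeMAE": float(np.mean(np.abs(time_p - time_a))),
    }

metrics = life_cycle_errors(
    actual=[[1.0, 3.0, 2.0], [2.0, 4.0, 1.0]],
    predicted=[[1.0, 2.0, 3.0], [2.0, 5.0, 1.0]],
)
# For this example: MdAPE = 12.5, peakMAPE = 12.5, timeMAE = 0.5
```

The sketch averages the peak measures over all products passed in; restricting them to products that peak inside the fit or forecast window, as described above, would require filtering the rows first.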

As discussed earlier, we use the RMSE to com-pare model fit and forecast performance because ourmethod is a least-squares method. We make several

Dow

nloa

ded

from

info

rms.

org

by [

129.

219.

247.

33]

on 2

8 A

ugus

t 201

4, a

t 16:

34 .

For

pers

onal

use

onl

y, a

ll ri

ghts

res

erve

d.


Table 7 Parameter Estimates for the Extreme-Gamers Market (Unconstrained, Best-Comp)

Parameters �1 �2 �3 �4 �5 �6 �7 �8

Perf. onlyEstimate 000092∗ −000135 −000059 000180∗ 000013 000124∗ 000014∗ 000067∗∗

Std. error 000011 000018 000011 000034 000008 000025 000008 000037White error 000027 000035 000019 000028 00008 000047 000008 000053Newey–West error 000046 000063 000032 000048 000011 000067 000012 000076

Perf./price ratioEstimate 000113∗ −000091 000105∗ 000202∗ 000001 000179∗ 000031∗ −000062Std. error 000013 000039 000025 000093 000010 000030 000019 000068White error 000017 000034 000029 000095 000010 000057 000018 000092Newey–West error 000028 000052 000052 000161 000013 000075 000022 000116


observations from Tables 8 and 9. First, although the unconstrained regression yields a better RMSE in the training data, its forecast RMSE is much worse than that of the constrained regression, indicating that the constrained regression is more appropriate. This is consistent with the findings from the test on the simulated data. Second, whether one defines $f_y(t)$ as the strength of the strongest competing product or as the mean product strength of the competition at time $t$ does not lead to dramatic changes in the forecast performance, with the former performing slightly better (smaller RMSE). This could be easily explained by the fact that the latter is roughly a moving average of the former, and in the absence of a drastic trend change in the product strength of the competition, one should not expect major differences. Last, we observe that the performance-only alternative generally achieves better results. This is consistent with our earlier expectation that customers in the “extreme” edition processor market are less

Table 8 Comparison of Model Fit for Different Model Specifications

Method                                   RMSE     MAE      MAPE (%)  MdAPE (%)  cumMAPE (%)  peakMAPE (%)  timeMAE  $R^2$

Constrained, perf.-only, best-comp       0.0176   0.0121   67        41         49           3.44          5.4      0.40
Constrained, perf./price, best-comp      0.0202   0.0153   74        82         42           2.93          6.3      0.22
Unconstrained, perf.-only, best-comp     0.0163   0.0115   62        42         42           2.94          5.3      0.49
Unconstrained, perf./price, best-comp    0.0183   0.0138   70        64         55           3.35          7.1      0.35
Constrained, perf.-only, mean-comp       0.0182   0.0128   125       41         81           3.51          5.7      0.36
Constrained, perf./price, mean-comp      0.0193   0.0149   122       69         75           3.15          8.1      0.28

Table 9 Comparison of Forecast Performance for Different Model Specifications

Method                                   RMSE     MAE      MAPE (%)  MdAPE (%)  cumMAPE (%)  peakMAPE (%)  timeMAE

Constrained, perf.-only, best-comp       0.0111   0.0089   94        38         61.0         0.69          13.7
Constrained, perf./price, best-comp      0.0154   0.0131   78        88         67.0         1.90          10.7
Unconstrained, perf.-only, best-comp     0.0167   0.0132   128       60         71.6         0.24          31.3
Unconstrained, perf./price, best-comp    0.0333   0.0254   187       94         150.1        4.08          10.0
Constrained, perf.-only, mean-comp       0.0113   0.0090   95        39         62.7         0.63          13.7
Constrained, perf./price, mean-comp      0.0140   0.0119   84        85         66.3         0.89          12.0

sensitive to price changes. To further verify that it is safe not to consider price in this particular application, we also test the model by incorporating a separate price term into product strength, i.e., we let $f_{ij}$ be a linear combination of performance improvement and price improvement from product $i$ to product $j$. The best fit is obtained with zero weight on price (see more details in the online appendix). Summarizing the above evaluation, the “constrained, perf.-only, best-comp” model specification demonstrates overall better performance, and we recommend this model specification for the high-end gamer market. We focus our discussion in the remainder of the paper on this specification.
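The two competition-strength definitions compared above ("best-comp" and "mean-comp") differ only in how the strengths of the competing products on the market are aggregated. A minimal sketch follows. The function name, the arguments, and the assumption that a competing product stays on the market once launched are all hypothetical and are not taken from the paper.

```python
import numpy as np

def competition_strength(strengths, launch_weeks, t, mode="best"):
    """Aggregate strength of the competition at week t.

    strengths[k]    : strength score of competing product k (hypothetical)
    launch_weeks[k] : week in which product k entered the market
    mode            : "best" (strongest product) or "mean" (average strength)
    """
    strengths = np.asarray(strengths, dtype=float)
    available = np.asarray(launch_weeks) <= t
    if not available.any():
        return 0.0  # assumption: no competition before the first launch
    if mode == "best":
        return float(strengths[available].max())
    if mode == "mean":
        return float(strengths[available].mean())
    raise ValueError(f"unknown mode: {mode!r}")

# Toy competitor line-up with steadily improving products.
strengths = [1.0, 1.2, 1.5]
launch_weeks = [0, 30, 60]
best_at_40 = competition_strength(strengths, launch_weeks, 40, mode="best")
mean_at_40 = competition_strength(strengths, launch_weeks, 40, mode="mean")
```

With steadily improving competitor products, the mean lags the maximum but follows the same trend, which is in line with the remark that the two definitions lead to similar forecasts.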

5.2.1. Parameter and Forecast Interpretation. As discussed in §4.1, the parameters obtained using our method show low statistical significance at high noise levels. In the Intel application, for the “constrained, perf.-only, best-comp” model, only �1, �5,


and �6 are above the 90% significance level. Therefore, one cannot make definitive parameter inferences because it is difficult to establish whether a low significance level is due to low influence or high noise in the data. Nonetheless, the higher significance level of �6 than �2 seems to suggest that the influence of performance improvement on the innovators may be stronger for between-brand switching than for within-brand upgrades. One plausible interpretation is that for within-brand product upgrades, customers are less concerned with the exact performance improvement as long as Intel passes some expected improvement hurdle (which Intel delivers, as shown in Figure 4(a)). However, this is not the case for between-brand switching. Because processors of different brands are not compatible, switching brands means replacing many other major components in the computer, or buying an entirely new system. Therefore, for a customer to switch from one brand to another, he or she needs to be convinced that the performance improvement justifies this move. As a result, we observe a stronger influence of performance on brand switching than on within-brand upgrades.

Figure 6 Actual vs. Predicted Sales

[Figure: two panels, (a) Product 3 and (b) Product 7, each plotting actual sales and predicted sales (million) against time (week).]

Figure 6 shows the predicted and actual sales for the products with the best (product 3) and worst (product 7) fit in terms of RMSE. Note that the predicted sales before time 120 are fitted values and those after time 120 are forecasts. The fitted sales curves largely replicate the asymmetric pattern of the recorded sales (i.e., sales usually peak early in a product’s life cycle, shown by the left-skewed sales curve) as well as the relative magnitude of sales across products, with the exception of product 7 in Figure 6(b). Close examination of the data reveals that even though product 7 has only a marginal performance improvement over its predecessor, it marks a major silicon technology shift, and customers’ purchasing decisions might be influenced by factors not captured in the performance data, which may help explain the dramatic sales spike in spite of the performance. To maintain simplicity and generality, we did not specifically account for this effect, although adjusting for such a technology shift should certainly improve the fit.

5.3. Comparison to Existing Models

Next, we compare the forecast performance of our method with the Bass, Norton–Bass, and Jun–Park models. As discussed previously, these models require product-specific parameters and are difficult to apply in long-range forecasting. To use them and draw a comparison, we let the market-potential parameters be linearly dependent on product strength. The diffusion parameters are assumed to be the same across all the products, following the argument made in Norton and Bass (1987). In addition, in the Jun–Park model, we let the linear coefficient that describes how fast the utility for each product grows with time be dependent on product strength as well. We use generalized versions of these models, and in the absence of such linear dependence, they simply reduce to the original versions in the literature.
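To make the generalization concrete for the simplest of the three benchmarks, the sketch below evaluates a Bass (1969) diffusion curve whose market potential depends linearly on product strength. It is an illustration under stated assumptions only: the names `m0` and `m1`, the differencing of the continuous-time adoption curve to obtain per-week sales, and the parameter values are all hypothetical, and the Norton–Bass and Jun–Park variants are not shown.

```python
import numpy as np

def bass_cdf(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass model.

    p: coefficient of innovation, q: coefficient of imitation.
    """
    t = np.asarray(t, dtype=float)
    e = np.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)

def generalized_bass_sales(weeks, launch_week, strength, p, q, m0, m1):
    """Per-week sales with market potential m = m0 + m1 * strength.

    p and q are shared across products; setting m1 = 0 recovers the
    ordinary Bass model with market potential m0.
    """
    m = m0 + m1 * strength
    age = np.clip(np.asarray(weeks, dtype=float) - launch_week, 0.0, None)
    prev_age = np.clip(age - 1.0, 0.0, None)
    # Sales in a week are the adoptions accumulated during that week;
    # before launch both ages are clipped to 0, so sales are 0.
    return m * (bass_cdf(age, p, q) - bass_cdf(prev_age, p, q))

# Hypothetical product launched in week 10 with strength 2.0.
weeks = np.arange(0, 200)
sales = generalized_bass_sales(weeks, launch_week=10, strength=2.0,
                               p=0.01, q=0.1, m0=1.0, m1=0.5)
```

Sharing `p` and `q` across products leaves only `m0` and `m1` to carry product-specific information, which is what allows such a benchmark to be applied to products that have not yet been launched.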

Tables 10 and 11 illustrate the performance of each method. Our method outperforms the Bass, Norton–Bass, and Jun–Park models in terms of the RMSE and also along most other dimensions that Intel is interested in. The ability of our model to capture both product upgrades and brand switching has clearly contributed to its improvement over existing models, making it more suitable for long-range forecasting in a market with frequent product introductions and competition. More importantly, we note that our approach accomplishes this without using actual sales data from the competition.

Last, we remark that the forecast errors in the Intel application are high, and the same level of error might be unacceptable for a one-step-ahead or two-step-ahead forecast, which is often used in the empirical applications of the Norton–Bass or Jun–Park models. However, for long-range forecast of products to be


Table 10 Comparison of Model Fit with Alternative Methods

Method                            RMSE     MAE      MAPE (%)  MdAPE (%)  cumMAPE (%)  peakMAPE (%)  timeMAE  $R^2$

Iterative descent (perf. only)    0.0176   0.0121   67        41         49           3.44          5.4      0.40
Bass                              0.0243   0.0201   518       178        167          4.14          20.7     −0.14
Norton–Bass                       0.0217   0.0169   282       137        129          3.66          11.3     0.09
Jun–Park                          0.0227   0.0174   383       133        120          4.50          4.8      0.004

Table 11 Comparison of Forecast Performance with Alternative Methods

Method                            RMSE     MAE      MAPE (%)  MdAPE (%)  cumMAPE (%)  peakMAPE (%)  timeMAE

Iterative descent (perf. only)    0.0111   0.0089   94        38         61           0.69          13.7
Bass                              0.0208   0.0195   182       141        141          0.58          13.7
Norton–Bass                       0.0645   0.0259   188       191        190          3.00          9.0
Jun–Park                          0.0141   0.0122   129       95         92           0.60          14.0

released years later, and given the high uncertainties inherent to this market and the fact that we forecast weekly sales instead of quarterly (e.g., Norton and Bass 1987) or yearly (e.g., Mahajan and Muller 1996) sales, high forecast uncertainty is expected. More importantly, high uncertainty does not diminish the value of a forecast. On the contrary, being able to characterize the demand and the corresponding uncertainty enables the company to hedge appropriately against future risk. For example, Intel has implemented a method that uses the knowledge of future demand uncertainty to optimally design option contracts with its equipment suppliers to reduce excess capacity (Peng et al. 2012, Kempf et al. 2013).

5.4. Sensitivity to Fit/Forecast Time Window

We have shown in §4.2.2 that the forecast accuracy decreases with the amount of historical data used for parametrization. For the Intel data set, we perform a similar sensitivity analysis. We vary the size of the fit sample among 100, 120, and 140 weeks while fixing the forecast window to the next immediate 76 weeks. We also change the forecast window while fixing the

Table 12 Forecast Sensitivity to Fit/Forecast Time Window

Fit/forecast window    [1,100]/[101,176]    [1,120]/[121,196]    [1,140]/[141,216]

RMSE                   0.0308               0.0116               0.0098
MAPE (%)               199                  103                  93
MdAPE (%)              85                   41                   35

Table 13 Sensitivity of Forecast Errors to the Forecast Window (Fit Window = [1,120])

Forecast window    [121,140]    [121,160]    [121,180]    [121,200]    [121,216]

RMSE               0.0114       0.0108       0.0119       0.0115       0.0111
MAPE (%)           59           71           106          101          94
MdAPE (%)          31           30           37           41           38

fit window to 120 weeks. Tables 12 and 13 illustrate how the forecast accuracy changes with these variations.

It appears that increasing the fit window slightly improves the forecast accuracy, while reducing it lowers the accuracy dramatically. As we fix the fit window and vary the forecast window, the forecast accuracy changes significantly, but there does not appear to be any clear trend. Although it may be tempting to make inferences from the above, any trend (or the lack of it) observed from a single data set could be specific to that data set. The trend derived from the simulated data might be more general because it is averaged over 100 data sets.
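The window analysis described above amounts to refitting on the first `end` weeks and scoring the forecast on the weeks that follow. A minimal sketch of that loop is given below. The `fit` and `forecast` callables are a hypothetical interface, and the naive level forecaster is only a placeholder standing in for the actual model, so the resulting numbers have no relation to Tables 12 and 13.

```python
import numpy as np

def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def window_sensitivity(sales, fit_ends, horizon, fit, forecast):
    """Forecast RMSE for several fit windows and a fixed forecast horizon.

    sales    : 1-D array of weekly sales, index 0 corresponds to week 1
    fit_ends : last week of each fit window, e.g. (100, 120, 140)
    horizon  : number of weeks forecast immediately after the fit window
    fit      : callable(train) -> model           (hypothetical interface)
    forecast : callable(model, n_weeks) -> array  (hypothetical interface)
    """
    sales = np.asarray(sales, dtype=float)
    results = {}
    for end in fit_ends:
        train = sales[:end]
        test = sales[end:end + horizon]
        model = fit(train)
        label = f"[1,{end}]/[{end + 1},{end + len(test)}]"
        results[label] = rmse(test, forecast(model, len(test)))
    return results

def naive_fit(train):
    """Placeholder model: the mean of the last 10 training weeks."""
    return float(np.mean(train[-10:]))

def naive_forecast(level, n_weeks):
    """Placeholder forecast: repeat the fitted level for n_weeks."""
    return np.full(n_weeks, level)

# Synthetic 216-week series that rises for 60 weeks and then declines.
weekly_sales = np.concatenate([np.linspace(0.0, 0.1, 60),
                               np.linspace(0.1, 0.0, 156)])
errors = window_sensitivity(weekly_sales, (100, 120, 140), 76,
                            naive_fit, naive_forecast)
```

Fixing the fit window and varying the forecast horizon instead only requires looping over `horizon` with a single value of `end`.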

In summary, the application of the proposed method to the sales of Intel’s high-end gamers market demonstrates some appealing features as well as some less satisfying aspects. It requires no data on the units-in-use population (in contrast to conventional population-growth models) and accounts for the effect of competition. It converges quickly and, compared with alternative methods, can be more easily applied to long-range forecasting. However, at a high noise level, low statistical significance of the parameters precludes conclusive inferences of the causal effects, and the forecast accuracy is sensitive to the size of the training data set. Because results from a single data set could be anecdotal, more real data applications by researchers and practitioners are required for further evaluation of the method.

6. Conclusion

We have proposed a method for parameterizing and forecasting the demand for multiple, successive generations of products. Our model is based on a population-growth model. The usual application of a population-growth model requires data on the population size. However, for many companies that cannot


track the units-in-use for each product, a direct application of the growth model is not possible. We overcome this difficulty with an iterative approach that constructs the units-in-use population for each product based on current parameter estimates, and then uses the constructed population size and sales observations to improve the parameter estimates. We show that this method is theoretically sound as long as we restrict the discrete time interval to a small value. The method performs well for the synthetic data and outperforms other available methods when applied to the sales data of Intel’s high-end microprocessors.
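The alternation between constructing the population and re-estimating the parameters can be illustrated on a deliberately simplified toy problem. The sketch below is not the authors' multigeneration model or algorithm: it uses a single-product, discrete-time Bass-type growth law with a known market size `M`, and all function and parameter names are hypothetical. In this toy the population could also be read directly from cumulative sales, so the example shows only the structure of the iteration, and no convergence guarantee is implied for arbitrary starting values.

```python
import numpy as np

def simulate_population(p, q, M, T):
    """Construct the units-in-use path N_0, ..., N_{T-1} implied by (p, q)."""
    N = np.zeros(T)
    for t in range(T - 1):
        sales_t = (p + q * N[t] / M) * (M - N[t])
        N[t + 1] = N[t] + sales_t
    return N

def features(N, M):
    """Regressors with sales_t = p * (M - N_t) + q * N_t * (M - N_t) / M."""
    return np.column_stack([M - N, N * (M - N) / M])

def iterate_estimates(sales, M, p0, q0, n_iter=50, tol=1e-10):
    """Alternate population construction and least-squares updates."""
    sales = np.asarray(sales, dtype=float)
    theta = np.array([p0, q0], dtype=float)
    for _ in range(n_iter):
        # Step 1: build the unobserved population from current estimates.
        N = simulate_population(theta[0], theta[1], M, len(sales))
        # Step 2: regress observed sales on the constructed population.
        new_theta, *_ = np.linalg.lstsq(features(N, M), sales, rcond=None)
        if np.max(np.abs(new_theta - theta)) < tol:
            return new_theta
        theta = new_theta
    return theta

# Noise-free synthetic sales generated with small coefficients.
M, p_true, q_true, T = 1.0, 0.01, 0.1, 60
N_true = simulate_population(p_true, q_true, M, T)
observed_sales = features(N_true, M) @ np.array([p_true, q_true])
estimate = iterate_estimates(observed_sales, M, p0=0.02, q0=0.05)
```

With noise-free data, the true parameters are a fixed point of this iteration: constructing the population from them reproduces `N_true`, and the regression then returns them unchanged.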

Using synthetic data, we test the sensitivity of the method (model fit and forecast performance) to noise levels, sample size (the number of time periods or products), and perturbations in the initial population sizes. We show that the model is well behaved. In the Intel application, we test several alternative model specifications based on how product strength is determined, how the competition strength is determined, and whether or not to restrict the parameters to be nonnegative. In particular, we use two alternative specifications for product strength in the extreme-gamers market: performance and performance/price ratio. The former fits the data better and also shows better forecast accuracy, which is most likely due to the low price sensitivity of this high-end market, where processors cost 5 to 10 times as much as a mainstream processor product. In addition, the dependence on product performance appears to be stronger for brand switching than for within-brand product upgrades, in terms of the innovator effect. This is intriguing, but the special characteristics of the extreme-gamers market may help explain the observation.

The parametrization method we propose is a nonlinear least-squares method, and therefore, it suffers from some common limitations of nonlinear regression: the optimization may yield a local optimum (because the objective function is a high-order polynomial that may have multiple roots), and if the noise level in the data is high, convergence may be slow and errors in the parameter estimates will be high. A limitation unique to our approach is the restriction to small coefficients. This is driven by (i) the need to ensure nonnegative and finite population size in the discrete time model and (ii) exploitation of the asymptotic result for convergence in Proposition 3.5. In both (i) and (ii), it is difficult to define a priori an analytical bound for Â. However, when applying the method, it is easy to spot cases when Â is chosen to be too large (one either encounters negative or extremely high population sizes or observes oscillation instead of convergence). Therefore, the lack of an analytical bound does not become an impediment for implementation. In practice, a small Â can be enforced by

restricting the time unit to a small interval. This, however, may place a higher requirement on the temporal granularity of the data. With a smaller time unit, the noise in the data may become larger. Hence, there is a trade-off, and identifying the optimal data collection time interval may be a trial-and-error process specific to each application.
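The symptoms named above (negative or extremely high population sizes, or oscillation instead of convergence) are simple to check programmatically during a run. The sketch below is one possible heuristic, not a procedure from the paper: the function name, the analyst-chosen cap `max_population`, and the sign-flip test for oscillation are all assumptions.

```python
import numpy as np

def diagnose(populations, estimate_history, max_population):
    """Return a list of warnings; an empty list means no symptom was found.

    populations      : constructed units-in-use values (any array shape)
    estimate_history : one coefficient's value across iterations
    max_population   : plausibility cap chosen by the analyst (hypothetical)
    """
    warnings = []
    pops = np.asarray(populations, dtype=float)
    if not np.all(np.isfinite(pops)) or np.any(pops > max_population):
        warnings.append("population is non-finite or implausibly large")
    if np.any(pops < 0):
        warnings.append("negative population")
    steps = np.diff(np.asarray(estimate_history, dtype=float))
    recent = steps[-6:]
    # Oscillation heuristic: recent updates keep flipping sign and the
    # latest update is no smaller than the earliest one in the window.
    if len(recent) >= 4:
        flips = bool(np.all(recent[:-1] * recent[1:] < 0))
        not_shrinking = abs(recent[-1]) >= abs(recent[0])
        if flips and not_shrinking:
            warnings.append("estimates oscillate instead of converging")
    return warnings

# A healthy run: bounded populations and steadily shrinking updates.
healthy = diagnose([0.1, 0.5], [1.0, 0.5, 0.3, 0.25, 0.24], max_population=1.0)

# A problematic run: out-of-range populations and a flip-flopping estimate.
problematic = diagnose([-0.1, 2.0], [0.0, 1.0, 0.0, 1.0, 0.0], max_population=1.0)
```

If any warning is raised, the remedy suggested in the text is to shorten the time unit so that the per-period coefficients become smaller, and then rerun the estimation.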

Moreover, we have made several assumptions in this multigeneration diffusion model, some of which are similar to those of the Bass model; others pertain to the intergenerational effect: (i) the products are consumer durables and each customer purchases only one unit, i.e., there is no repeat purchase of the same product; (ii) customers buy only a product that is newer than the one currently owned; (iii) customers switch from and to competitors’ products only if doing so results in a performance increase; (iv) the initial population mix at the start of the forecast window and the trend of the total market growth and/or decline can be estimated with high confidence. Clearly, these assumptions do not hold in every industry and for every company. Lastly, as mentioned earlier, our approach targets multiple product generations with incremental positive improvement between successive generations, and it does not apply to the diffusion of a single radically innovative product such as the first-ever electric refrigerator.

We show in this paper how this demand model can be parameterized and employed at Intel as an input to long-range planning in the extreme-gamers market. To expand to other markets, the model may need to be adjusted. For example, the market for servers is also known to be relatively insensitive to price, like the extreme-gamers market. However, the market for mainstream consumer laptops is very sensitive to price fluctuations. Nonetheless, this does not require a fundamental model change, and one may simply extend the interpretation of “product strength” to a measure appropriate for the specific market. In the case of multidimensional product strength, for example, with price being one of the dimensions in addition to performance, there may be additional data and parameter-related requirements: With one more dimension in the product-strength measure, the number of coefficients increases from 8 to 12, so additional data may be needed to ensure identification. Also, convergence could become more difficult because our method requires all coefficients to be small, and this is harder to ensure with more parameters. This problem could, however, be remedied by more careful scaling of the performance and price data. Although the model is motivated and developed based on a particular company, coexistence of multiple generations of products is common in the technology industry, and we hope that researchers and practitioners may find it useful in other contexts as well.


Supplemental Material

Supplemental material to this paper is available at http://dx.doi.org/10.1287/msom.2013.0430.

Acknowledgments

The authors are grateful to Jay Schwarz from Intel Corporation for sharing his expertise and to Stephen Graves for his helpful comments. The authors also thank Alan Scheller-Wolf, an anonymous associate editor, and three anonymous reviewers for their constructive suggestions. Dieter Armbruster’s research was partially supported by a grant from the Volkswagen Foundation under the program on complex networks.

References

Amemiya T (1985) Advanced Econometrics (Harvard University Press, Cambridge, MA).

Bass FM (1969) A new product growth model for consumer durables. Management Sci. 15(5):215–227.

Bass FM (1980) The relationship between diffusion rates, experience curves, and demand elasticities for consumer durable technological innovations. J. Bus. 53(3):S51–S67.

Bass FM, Krishnan TV, Jain DC (1994) Why the Bass model fits without decision variables. Marketing Sci. 13(3):203–223.

Bayus BL, Kim N, Shocker AD (2000) Growth models for multiproduct interactions: Current status and new directions. Mahajan V, Muller E, Wind Y, eds. New-Product Diffusion Models (Kluwer Academic Publishers, New York), 141–163.

Bertsekas DP (2003) Nonlinear Programming (Athena Scientific, Belmont, MA).

Danaher PJ, Hardie BGS, Putsis WP Jr (2001) Marketing-mix variables and the diffusion of successive generations of a technological innovation. J. Marketing Res. 38(4):501–514.

Debo LG, Toktay LB, Van Wassenhove LN (2006) Joint life-cycle dynamics of new and remanufactured products. Production Oper. Management 15(4):498–513.

Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statist. Soc., Series B Method. 39(1):1–38.

Fisher JC, Pry RH (1971) A simple substitution model of technological change. Tech. Forecasting Soc. Change 3:75–88.

Gordon BR (2009) A dynamic model of consumer replacement cycles in the PC processor industry. Marketing Sci. 28(5):846–867.

Gowrisankaran G, Rysman M (2009) Dynamics of consumer demand for new durable consumer goods. NBER Working Paper 14737, National Bureau of Economic Research, Cambridge, MA.

Greene WH (2003) Econometric Analysis (Prentice Hall, Upper Saddle River, NJ).

Ho T-H, Savin S, Terwiesch C (2002) Managing demand and sales dynamics in new product diffusion under supply constraint. Management Sci. 48(2):187–206.

Jain DC, Rao RC (1990) Effect of price on the demand for durables: Modeling, estimation, and findings. J. Bus. Econom. Statist. 8(2):163–170.

Jun DB, Park YS (1999) A choice-based diffusion model for multiple generations of products. Tech. Forecasting Soc. Change 61(1):45–58.

Kalish S (1985) A new-product adoption model with price, advertising, and uncertainty. Management Sci. 31(12):1569–1585.

Kamakura W, Balasubramanian S (1988) Long-term view of the diffusion of durables: A study of the role of price and adoption influence process via tests of nested models. Internat. J. Res. Marketing 5(1):1–13.

Kempf KG, Erhun F, Hertzler EF, Rosenberg TR, Peng C (2013) Optimizing capital investment decisions at Intel Corporation. Interfaces 43(1):62–78.

Kim N, Chang D, Shocker A (2000) Modeling inter-category dynamics for a growing information technology industry. Management Sci. 46(4):496–512.

Kumar S, Swaminathan JM (2003) Diffusion of innovations under supply constraints. Oper. Res. 51(6):866–879.

Libai B, Muller E, Peres R (2009) The influence of within-brand and cross-brand word of mouth on the growth of competitive markets. J. Marketing 73(2):19–34.

Mahajan V, Muller E (1996) Timing, diffusion, and substitution of successive generations of technological innovations: The IBM mainframe case. Tech. Forecasting Soc. Change 51(2):109–132.

Meade N, Islam T (2006) Modelling and forecasting the diffusion of innovation: A 25-year review. Internat. J. Forecasting 22(3):519–545.

Melnikov O (2001) Demand for differentiated durable products: The case of the U.S. computer printer industry. Mimeo, Yale University, New Haven, CT.

Murray JD (2002) Mathematical Biology: An Introduction (Springer-Verlag, New York).

Newey W, West K (1988) A simple positive semi-definite, heteroscedasticity and correlation consistent covariance matrix. Econometrica 55(3):703–708.

Norton JA, Bass FM (1987) A diffusion theory model of adoption and substitution for successive generations of high-technology products. Management Sci. 33(9):1069–1086.

Peng C, Erhun F, Hertzler EF, Kempf KG (2012) Capacity planning in the semiconductor industry: Dual-mode procurement with options. Manufacturing Service Oper. Management 14(2):170–185.

Peres R, Muller E, Mahajan V (2010) Innovation diffusion and new product growth models: A critical review and research directions. Internat. J. Res. Marketing 27(2):91–106.

Pielou E (1977) Mathematical Ecology (Wiley, New York).

Rothenberg TJ (1971) Identification in parametric models. Econometrica 39(3):577–591.

Savin S, Terwiesch C (2005) Optimal product launch times in a duopoly: Balancing life-cycle revenues with product cost. Oper. Res. 53(1):26–47.

Shenoy SR, Daniel A (2006) Intel architecture and silicon cadence: The catalyst for industry innovation. Tech. at Intel Magazine (October), 1–7.

Song I, Chintagunta P (2003) A micromodel of new product adoption with heterogeneous and forward-looking consumers: Application to the digital camera category. Quant. Marketing Econom. 1(4):371–407.

SPEC (2010) SPEC benchmarks, http://www.spec.org/benchmarks.html.

Tuma N, Hannan M (1984) Social Dynamics: Models and Methods (Academic Press, New York).

White H (1980) A heteroscedasticity-consistent covariance matrix estimator and a direct test for heteroscedasticity. Econometrica 48(4):817–838.

Wu SD, Kempf K, Atan M, Aytac B, Shirodkar S, Mishra A (2010) Improving new-product forecasting at Intel Corporation. Interfaces 40(5):385–396.
