Managing Capacity and Inventory Jointly for Multi-Server ...people.stern.nyu.edu/jreed/papers/paper20.pdf · Managing Capacity and Inventory Jointly for ... Also, the traditional

Managing Capacity and Inventory Jointly for Multi-Server Make-to-Stock Queues

Josh Reed, Stern School of Business, New York University, New York, NY

Bo Zhang, Mathematical Sciences and Analytics, IBM Research, Yorktown Heights, NY

Abstract

Motivated by practices in modern supply chains, we consider capacity-inventory joint management for a make-to-stock manufacturing system operating under a base stock policy. The production facility is modeled as multiple servers operating in parallel. The number of servers corresponds to the capacity decision and the base stock level is the inventory decision. The main problem which we consider is the joint optimization of the capacity and inventory decisions to minimize a combination of capacity, inventory, and backordering costs. We develop a square-root rule for the joint decision and justify the rule analytically in a many-server queue asymptotic framework. We also provide operational insights into the tradeoffs involved in such joint management problems, through various analyses based on the square-root rule as well as a comparison with analogous results for single-server make-to-stock queues.

Keywords: make-to-stock queue; capacity-inventory joint management; multi-server queue; many-server asymptotics; square-root rule; diffusion approximation

1. Introduction

In modern society, a variety of goods are manufactured and then sold to consumers. How to facilitate the process of meeting demand in a cost effective manner is a topic of long-lasting interest in the operations research literature. In particular, installing sufficient production capacity and holding safety inventory are two widely studied and used operational levers to achieve this goal.

Preprint submitted to Elsevier November 3, 2014

This paper revisits the topic of capacity and inventory joint management. The topic is especially timely today as many manufacturing firms increasingly have direct retail stores (physical or online), e.g., Apple, and on the other hand more major retailers sell products of their own brand, e.g., Amazon. Also, the traditional hierarchical planning literature’s assumption (see Bitran et al. (1982) and Hax and Meal (1973)) that capacity decisions are higher-level than inventory ones in the managerial hierarchy is no longer a norm in modern businesses. For example, for a major online retailer, warehouse sizing, a decision on the maximum inventory level, may very well be more important than supplier selection, a capacity decision, especially if the goods are commodities and fast demand fulfillment is critical.

The setting directly addressed by this paper is one assuming a stationary environment, i.e., demand and supply modeled by time-invariant stochastic processes. Our focus is on understanding and quantifying the interaction between capacity and inventory and how they may be used to combat uncertainties of a stationary nature. Yet, the results developed in this paper potentially can also be building blocks for analyzing more complex settings such as a piecewise stationary environment (e.g., see Song and Zipkin (1993) for an example with Markov-regulated demand).

The main context in which the capacity-inventory tradeoff is examined is one of minimizing the sum of capacity cost, inventory cost (including finished goods and work-in-process), and backordering cost, which accounts for loss of goodwill and can be viewed as a proxy for service level or potential revenue loss. In addition, we examine this tradeoff by analyzing customer demand fulfillment performance explicitly.

Specifically, we study the capacity and inventory joint optimization problem in a make-to-stock queue with multiple servers operating in parallel (hereafter referred to as multi-server, make-to-stock queue), where the capacity decision corresponds to the addition or removal of servers. We quantify the amount of excessive capacity level above the demand rate (i.e., the extra capacity level above the minimum for ensuring system stability) as well as the amount of “excessive” inventory level above the demand rate, where the second “excessive” is placed in quotes because the optimal inventory level can be lower than the demand rate as it turns out. Adopting the term “variability hedge” from Bassamboo et al. (2010), we shall sometimes refer to these two excessive amounts as the (optimal) capacity variability hedge and the (optimal) inventory variability hedge respectively. These two hedges can also be viewed as, and naturally called, safety factors. Spearman (2014) treats this type of safety factor as synonymous with a buffer. Regardless of the wording, it is of long-lasting interest in the stochastic operations research community to quantitatively characterize such a safety factor. In the case of having two such factors in one single setting as we do here, it is interesting to understand the interaction between them. This is one important subject of study in the present paper.

More specifically, our analysis uncovers strong interactions between capacity and inventory decisions. For example, the structure of the chosen production facility has a significant impact on the inventory decision as well as how inventory is distributed throughout the supply chain. Such insights are not only applicable to our specific model, but useful for the broader purpose of better understanding the classical operational challenge of matching supply with demand.

1.1. Related literature

Bradley and Glynn (2002) were the first to analytically formulate and solve the problem of jointly optimizing capacity and inventory decisions. In their paper, a two-stage supply chain is modeled as a GI/M/1 make-to-stock queue under a base stock policy, where the capacity decision corresponds to increasing or decreasing the service rate of the single server and the inventory decision is made on the base stock level. A very recent paper, Buyukkaramikli et al. (2013b), also studies capacity-inventory joint optimization in an M/M/1 make-to-stock queue and its variants. Angelus and Porteus (2002) study a multi-period, short-life-cycle, make-to-stock model, in one case of which inventory carry-over between periods is allowed and hence the capacity-inventory tradeoff arises. They show that capacity and inventory are economic substitutes. A deterministic version of a similar problem has been studied by Rao (1976). A multi-period joint capacity and inventory optimization problem with capacity acquisition lead time has been investigated by Mincsovics et al. (2009). Also, Jodlbauer and Altendorfer (2010) study the tradeoff between capacity invested and inventory needed in a multi-item, multi-period make-to-order system. They formulate and solve a joint capacity-inventory optimization problem, although both the nature of their system (i.e., make-to-order) and their mathematical model are completely different from ours. Van Mieghem and Rudi (2002) study optimal capacity and inventory decisions in their own developed model called newsvendor networks. They consider discretionary activities post capacity and inventory decisions and hence their approach has a stochastic programming nature. Allon and Zeevi (2011) introduce the notion of strategic substitute and point out that capacity and inventory are strategic substitutes, although their focus is on understanding the tradeoff between capacity and price. Finally, an interesting economics paper by Eberly and Van Mieghem (1997) is (weakly) related to our study in that they also consider a joint optimization problem, although all their decision variables (what they call “factors of production”) exercise effects on capacity only.

All the past papers related to capacity and inventory joint optimization, with the exception of Bradley and Glynn (2002) and Buyukkaramikli et al. (2013b), assume that production lead time does not depend on the quantity demanded, which is a major difference from a queueing model. As a general observation, relatively few researchers have adopted the M/M/1 make-to-stock queue to analyze two-stage supply chains, even for separate optimization of capacity or inventory decisions; see Caldentey and Wein (2003) and Plambeck and Zenios (2003) for the studies in which a single-server queueing model is used. There has been even less research on multi-server make-to-stock queues. The most relevant to our paper is a short section on pages 261-262 of Zipkin (2000), where our expressions (2) and (3) are given for the multi-server make-to-stock queue and it is remarked that the system combines features of the M/M/1 and M/M/∞ systems. Also related is a short section “Re-order for Each Item Sold” in Chapter 10 of the pioneering book by Morse (1958). There no backorder is allowed and all sales are lost whenever the inventory level is zero, and as a result there is only one decision variable, the base stock level, which is equivalent to the number of parallel servers. More precisely, Morse’s model can be viewed as a special case of the one considered in our Section 6 with one of its decision variables r set to zero. The only other prior studies applying multi-server queues in inventory systems that we are aware of are two very recent papers: Ata and Rubino (2009) on multi-server, make-to-order queues with order cancellations and Buyukkaramikli et al. (2013a) using multi-server, make-to-order queueing models to make hiring decisions subject to a lead time performance constraint. Neither of them addresses the same problem as we do in this paper.

Our analysis is based on a many-server asymptotic framework first proposed by Borst et al. (2004) for studying call center staffing. They show that under a general cost structure on the number of servers and the waiting times, the optimal number of servers to staff for the M/M/c queues is within a square root factor of the offered load, i.e., obeys a square-root staffing rule. A similar approach has been taken in Mandelbaum and Zeltyn (2009), Janssen et al. (2011), and Zhang et al. (2012). Mandelbaum and Zeltyn (2009) use the M/M/c+G queueing model for call center staffing. Janssen et al. (2011) and Zhang et al. (2012) develop corrected diffusion approximations and their so-called refined square-root staffing rules for the M/M/c and M/M/c+M queues respectively. Finally, it is worth noting that the main mathematical framework and key analytical insights along this line of research were developed or implicitly contained in the seminal paper Halfin and Whitt (1981).

1.2. Our contributions

This paper is the first to apply the many-server asymptotic framework to analyzing make-to-stock queues. In particular, the asymptotic analysis of the inventory aspect of the system is novel. Other than appearing in two classical textbooks, Morse (1958) and Zipkin (2000), the multi-server make-to-stock queue as a tool for supply chain analysis has received no attention in the literature. On the other hand, the state-of-the-art for multi-server queues has been advanced significantly by the queueing theory community in the past two decades. The most notable progress is the Halfin-Whitt many-server asymptotic that has attracted extensive research efforts and proved to be an extremely insightful approach for call center staffing or, more generally and essentially, capacity sizing. Therefore, we believe, and shall demonstrate throughout the paper, that this approach is an equally valuable tool for understanding how to use the two levers of capacity and inventory in a make-to-stock queueing setting. Specifically, solving a joint optimization problem is equivalent to the choice of a square-root safety factor for both capacity and inventory, which we call the square-root rule (see Section 3). The accuracy of the rule is asymptotically justified in Theorem 2, supported by the approximation result Theorem 1 for the main cost objective function under consideration. The lucid form of the rule enables convenient analysis and discovery of qualitative insights. For example, Proposition 1 highlights the necessity of the square-root form for achieving a balance between service quality and economy. Lemma 1 and Figure 1 provide a precise quantification of the substitution effect between capacity and inventory. Section 4.3 analytically illustrates the effects of uncertainty on various aspects of the supply chain. Also, Section 4.5 provides the first examination of the celebrated principle of economies of scale from the inventory point of view, and Section 4.5.1 develops a characterization of service degradation due to control delegation in a supply chain.

This paper also contributes to the literature of capacity-inventory joint management in that it is the first to consider the joint management problem in multi-server make-to-stock queues. The types of questions that we ask are similar to those studied by Bradley and Glynn (2002) for single-server queues. As we shall show in Sections 4.1 and 4.2, despite the commonality of capturing the load-dependent nature of production lead time, the single-server model is not applicable in a multi-server setting. Specifically, using a wrong model can lead to an under-stocking error that grows indefinitely with the demand rate. Moreover, from a practical point of view, studying the multi-server queueing model is relevant as in many situations the capacity of an individual server is limited and significant increases in system capacity may only be obtained through the acquisition of additional servers. Furthermore, the comparative analysis between these two stylized manufacturing models that we conduct leads to several operational insights that carry over to more general settings. For example, the physical structure of the production facility has substantial impacts on the inventory management aspect of the system. Also, the production capacity level affects not just the amount of inventory but how the inventory is distributed between the retailer and the manufacturer in a supply chain.

Thirdly, our work contributes to the literature of diffusion approximation for multi-server queues. The many-server asymptotic is popular not only for its analytical elegance but largely because of the surprising accuracy of its resulting performance approximation, namely (first-order) diffusion approximation. This accuracy was first observed by Borst et al. (2004) and later further assessed by Janssen et al. (2011), who developed a refinement, namely corrected diffusion approximation. However, this accuracy has not been investigated thoroughly because there had not been a need to do so: specifically, delay probability is the primary measure of interest in these two studies. The approximation efficacy for the whole queue-length distribution has never been looked into; neither has any refinement been proposed. In our setting, the whole queue-length distribution naturally arises due to the specifics of the inventory optimization problem. Our Section 5 contains a novel corrected diffusion approximation and numerical assessments that fill this gap. These results indicate that the first-order diffusion approximation for the whole queue-length distribution is also extremely accurate in most cases and hence that the qualitative insights gleaned from our asymptotic analysis are robust to variations of problem parameters. Furthermore, the corrected diffusion approximation that we develop for the M/M/c queue can be of independent interest for researchers studying other problems related to this elementary, yet very popular model.

Finally, both the model and the analysis in Section 6 are new. Our observation on the three independent levers, in connection with the view proposed by Spearman (2014) (see Remark 1 and the beginning of Section 6), can lead to interesting and challenging three-dimensional joint optimization or control problems for future research.

The remainder of this paper is organized as follows. Section 2 formulates the main make-to-stock queueing model in detail. Section 3 contains the main analytical results, including a diffusion approximation and a resulting square-root rule for capacity and inventory sizing. We further discuss these analytical results and highlight key qualitative insights in Section 4. The accuracy of the square-root rule is investigated and a corrected diffusion approximation is developed in Section 5. Finally, in Section 6 we formulate and analyze an extension of our main model with partial backlogging. Most technical proofs are collected in the appendices to the paper.

2. Model Description

The main model that we consider in this paper is a single-product, make-to-stock queue consisting of a retailer and a production facility with c servers operating in parallel. The amount of time to produce one unit of finished goods on an individual server is exponentially distributed with rate µ > 0. Customer orders arrive to the retailer according to a Poisson process with rate λ > 0 and each order is placed for one unit of goods. If finished goods inventory is available when a customer order arrives to the retailer, the order is immediately fulfilled. If no finished-goods inventory is available, then the customer order is backlogged and backlogged orders are fulfilled on a first-come-first-served (FCFS) basis as finished goods inventory is replenished from the production facility. The retailer operates according to a base stock policy with base stock level s. This means that the retailer aims to keep the sum of the finished goods inventory level and the number of work-in-process (WIP) orders at the production facility equal to s. In doing so, each unit of backorder (i.e., backlogged order) is counted as a unit of negative finished goods inventory. As a result the production facility operates exactly as an FCFS M/M/c queue.

Specifically, as in Bradley and Glynn (2002), Glasserman (1997), and Tayur (1993), we describe the system dynamics by considering the inventory shortfall process, which tracks the difference between the target base stock level s and the actual amount of finished goods inventory over time. The shortfall process $\{Q(t), t \ge 0\}$ obeys the following dynamics:

\[
Q(t) = Q(0) + N_A(\lambda t) - N_S\left(\mu \int_0^t (Q(s) \wedge c)\,ds\right), \quad t \ge 0, \tag{1}
\]

where $N_A = \{N_A(t), t \ge 0\}$ and $N_S = \{N_S(t), t \ge 0\}$ are both rate one Poisson processes which are independent of each other. Note that the shortfall process, independent of the base stock level s, is identical to the queue length process of an M/M/c queue with arrival rate λ and service rate µ. Moreover, for a given base stock level s, the finished goods inventory level at time t is given by $(s - Q(t))^+$ and, similarly, the backorder level at time t by $(Q(t) - s)^+$, where $a^+ := \max\{a, 0\}$ for any real number a throughout the paper. Also, the WIP level at the production facility at time t simply equals the shortfall level itself, Q(t).

Let R := λ/µ be the offered load. This is the rate at which work arrives to the system and is also the minimum number of servers required in order to ensure that backorder levels do not grow without bound. Specifically, if R < c, then it follows from standard results (e.g., see Gross and Harris (1998) or section 7.3.3.2 of Zipkin (2000)) that as t → ∞, Q(t) weakly converges to some random variable Q(∞) satisfying

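\[
P_c\{Q(\infty) = k\} =
\begin{cases}
\dfrac{R^k}{k!}\,\eta, & \text{for } k \le c, \\[2mm]
\left[\dfrac{c^c (R/c)^k}{c!}\right]\eta, & \text{for } k \ge c,
\end{cases}
\tag{2}
\]

where

\[
\eta = \left( \frac{R^c}{c!\,(1 - R/c)} + \sum_{k=0}^{c-1} \frac{R^k}{k!} \right)^{-1}. \tag{3}
\]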

Note that we use a subscript c on the probability operator in (2) in order to emphasize the dependence of the limiting shortfall level Q(∞) on the chosen capacity level c and also later on we do the same for expectation operators (e.g., in (4)).
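As a numerical illustration of expressions (2)-(3), the following Python sketch evaluates the stationary shortfall probabilities for an integer capacity level c > R. The function name `shortfall_pmf` and the truncation argument `k_max` are illustrative choices of ours, not notation from the model. To avoid large factorials, the sketch builds the unnormalized weights recursively, which is algebraically the same as (2).

```python
def shortfall_pmf(c, R, k_max):
    """Return [P_c{Q(inf) = k} for k = 0, ..., k_max] following (2)-(3).

    Assumes an integer c >= 1 and an offered load 0 < R < c (stability).
    """
    if c < 1 or R <= 0 or R >= c:
        raise ValueError("need integer c >= 1 and 0 < R < c")
    # Unnormalized weights: R^k / k! for k <= c and c^c (R/c)^k / c! for k >= c,
    # generated by the recursion w_k = w_{k-1} * R / min(k, c).
    weights = [1.0]
    for k in range(1, max(c, k_max) + 1):
        weights.append(weights[-1] * R / min(k, c))
    # eta from (3): the geometric tail from k = c onward sums to w_c / (1 - R/c).
    eta = 1.0 / (sum(weights[:c]) + weights[c] / (1.0 - R / c))
    return [w * eta for w in weights[:k_max + 1]]


if __name__ == "__main__":
    pmf = shortfall_pmf(c=2, R=1.0, k_max=60)
    print(pmf[:4], sum(pmf))
```

For c = 2 and R = 1, expression (3) gives η = 1/3, so the first four probabilities are 1/3, 1/3, 1/6, and 1/12, and the truncated sum is close to one because the tail decays geometrically at rate R/c.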

Within the M/M/c make-to-stock queueing model, we postulate a linear cost structure and investigate the optimal choice of the 2-tuple (c, s) from the perspective of cost (as done by Bradley and Glynn (2002)). Specifically, we assume that each unit of finished goods inventory at the retailer incurs a holding cost of h per unit time and each backorder at the retailer incurs a penalty cost of p per unit time. The cost per unit time for one unit of WIP at the production facility is given by w and each server at the production facility operates at a cost of d per unit time. Thus, the long-run average total cost per unit time for the system is given by

\[
\Omega(c, s, R) = d \cdot c + w \cdot E_c[Q(\infty)] + h \cdot E_c[(s - Q(\infty))^+] + p \cdot E_c[(Q(\infty) - s)^+], \tag{4}
\]

and we aim to choose the production capacity level (i.e., number of servers) c and the inventory base-stock level s so as to minimize Ω(c, s, R). Note that the backorder cost term accounts for the loss of goodwill, and therefore our total cost minimization problem should be viewed as a mathematical abstraction of optimizing operational cost efficiency in conjunction with customer service level.
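The cost function (4) can be evaluated exactly for integer (c, s). The sketch below is one possible implementation under our own naming (`average_cost` is hypothetical). It relies on two standard M/M/c facts: the mean queue length is R plus the delay probability times (R/c)/(1 − R/c), and the identity (Q − s)^+ = (Q − s) + (s − Q)^+ turns the backorder expectation into a finite sum.

```python
def average_cost(c, s, R, d, w, h, p):
    """Evaluate Omega(c, s, R) in (4) for integers c > R and s >= 0."""
    if c < 1 or s < 0 or R <= 0 or R >= c:
        raise ValueError("need integer c >= 1, s >= 0 and 0 < R < c")
    rho = R / c
    # Unnormalized stationary weights of the M/M/c queue length.
    weights = [1.0]
    for k in range(1, max(c, s) + 1):
        weights.append(weights[-1] * R / min(k, c))
    eta = 1.0 / (sum(weights[:c]) + weights[c] / (1.0 - rho))
    delay_prob = weights[c] * eta / (1.0 - rho)       # P_c{Q >= c}
    mean_q = R + delay_prob * rho / (1.0 - rho)       # E_c[Q]
    # E_c[(s - Q)^+] is a finite sum over k = 0, ..., s.
    exp_inventory = sum((s - k) * weights[k] * eta for k in range(s + 1))
    # E_c[(Q - s)^+] = E_c[Q] - s + E_c[(s - Q)^+].
    exp_backorder = mean_q - s + exp_inventory
    return d * c + w * mean_q + h * exp_inventory + p * exp_backorder


if __name__ == "__main__":
    # Single-server check: R = 0.5 gives a geometric queue length with mean 1.
    print(average_cost(c=1, s=0, R=0.5, d=1.0, w=1.0, h=1.0, p=2.0))
    print(average_cost(c=1, s=1, R=0.5, d=1.0, w=1.0, h=1.0, p=2.0))
```

In the single-server check, s = 0 yields a cost of d + w + p = 4 and s = 1 yields 3.5, so holding one unit of stock already pays off for these illustrative cost rates.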

In addition to the cost minimization problem, we explicitly consider an important measure of service level, namely, the fill rate, which is defined as the fraction of demand met from on-hand finished-goods inventory without delay. The fill rate (or 1 minus it) has been considered for many other models, e.g., Benjaafar et al. (2004), Caldentey (2001), Plambeck and Zenios (2003), and Angelus and Porteus (2002). In our model, the fill rate is given by

\[
P_c\{Q(\infty) \le s - 1\} = P_c\{Q(\infty) \le s\} - P_c\{Q(\infty) = s\}. \tag{5}
\]

The exact analysis for minimizing (4) or evaluating (5) is feasible given (2)-(3). One particularly useful optimality characterization in the cost minimization problem is the newsvendor-based critical fractile solution for the optimal s subject to a fixed c. Specifically, for a fixed capacity level c, the minimization of expression (4) reduces to that of its last two terms, which have a standard newsvendor cost structure, and its solution is well known (see Nahmias (2009) for example) to be $F^{-1}_{c,Q(\infty)}(p/(p+h))$, with $F^{-1}_{c,Q(\infty)}(\cdot)$ denoting the inverse of the CDF (cumulative distribution function) of Q(∞) assuming c servers.
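The critical fractile characterization translates into a short search over the CDF. The sketch below, with the hypothetical name `optimal_base_stock`, returns the smallest integer s whose shortfall CDF reaches p/(p + h). We interpret the inverse CDF as this generalized inverse, which is the usual convention for a discrete distribution and is our assumption here.

```python
def optimal_base_stock(c, R, h, p):
    """Smallest integer s with P_c{Q(inf) <= s} >= p / (p + h), for fixed c > R."""
    if c < 1 or R <= 0 or R >= c or h <= 0 or p <= 0:
        raise ValueError("need integer c >= 1, 0 < R < c and h, p > 0")
    # Normalizing constant eta from (3), via recursive weights.
    weights = [1.0]
    for k in range(1, c + 1):
        weights.append(weights[-1] * R / k)
    eta = 1.0 / (sum(weights[:c]) + weights[c] / (1.0 - R / c))
    target = p / (p + h)          # critical fractile
    k, prob, cdf = 0, eta, eta    # start from P{Q = 0} = eta
    while cdf < target:
        k += 1
        prob *= R / min(k, c)     # P{Q = k} from P{Q = k - 1}
        cdf += prob
    return k


if __name__ == "__main__":
    # Single server with R = 0.5: the CDF at s equals 1 - 0.5 ** (s + 1).
    print(optimal_base_stock(c=1, R=0.5, h=1.0, p=9.0))
```

The loop terminates because the CDF increases to one while the target is strictly below one. In the single-server example the target 0.9 is first exceeded at s = 3, where the CDF equals 0.9375.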

However, the computational complexity of this exact analysis (e.g., the above CDF inversion) grows with the magnitude of offered load R and, more importantly, intuitive rules of thumb and managerial insights elude such analysis. In this paper we adopt an asymptotic framework, originally developed by Borst et al. (2004) and later extended by Mandelbaum and Zeltyn (2009), Janssen et al. (2011), and Zhang et al. (2012) in the context of call center staffing. We demonstrate that in the supply chain setting and particularly in addressing the tradeoff between capacity and inventory, useful rules of thumb and new insights come out of this framework fairly readily.

3. Analysis

In this section, we first introduce the asymptotic framework used in analyzing the capacity-inventory optimization problem. Then we present the main analytical results.

3.1. Introduction to our asymptotic framework

Our approach to solving the optimization problem is based on an asymptotic framework proposed by Borst et al. (2004). Specifically, we first translate the discrete optimization problem of seeking integer (c, s) into a continuous one by replacing each cost term in (4) with its continuous extension allowing non-integer (c, s). We then consider an approximation to the continualized optimization problem and solve for the optimal real-valued (c, s) to the approximating problem. We further show that the resulting solution is asymptotically optimal in the sense that, as R increases indefinitely, it coincides with the true optimum to the continualized optimization problem. Later in Section 5 we provide analytical and numerical assessments for the accuracy of the asymptotic approach in the case of small or moderate R values.

3.2. Preliminaries on the continuous extension

To explain the idea of the continuous extension, let us consider the steady-state delay probability in the M/M/c queue (also called Erlang delay probability or the Erlang C formula), $P_c\{Q(\infty) \ge c\}$, which simply can be obtained by calculating $\sum_{k \ge c} P_c\{Q(\infty) = k\}$ using expression (2). It is well known (see Janssen et al. (2011) for example) that this probability turns out to equal C(c, R), defined as

\[
C(c, R) := \left[ R \int_0^\infty t\, e^{-Rt} (1 + t)^{c-1}\, dt \right]^{-1}. \tag{6}
\]

Note that the integral representation (6) can be defined for all c ∈ (R,∞), as opposed to integer c only as in $P_c\{Q(\infty) \ge c\}$; we shall do so and call (6) a continuous extension of the delay probability.
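To make the continuous extension concrete, the sketch below evaluates the integral in (6) numerically for real-valued c > R with composite Simpson's rule. The function name `erlang_c_continuous`, the number of steps, and the truncation point of the integral are all assumptions of ours; in particular, the upper limit is a heuristic based on the gamma-like decay of the integrand, not a rigorous error bound.

```python
import math


def erlang_c_continuous(c, R, steps=20000):
    """Approximate C(c, R) in (6) for real c > R > 0 by Simpson's rule."""
    if R <= 0 or c <= R:
        raise ValueError("need c > R > 0")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    # Heuristic truncation: the integrand behaves like a gamma density with
    # shape about c + 1 and rate R, so this point lies far in its tail.
    upper = (c + 10.0 * math.sqrt(c) + 50.0) / R

    def integrand(t):
        if t == 0.0:
            return 0.0
        # t * exp(-R t) * (1 + t)^(c - 1), computed in log space.
        return math.exp(math.log(t) - R * t + (c - 1.0) * math.log1p(t))

    step = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    integral = total * step / 3.0
    return 1.0 / (R * integral)


if __name__ == "__main__":
    for c in (2.0, 2.5, 3.0):
        print(c, erlang_c_continuous(c, R=1.0))
```

For R = 1 the integral can be computed by hand at integer c: it equals 3 when c = 2 and 11 when c = 3, so the delay probabilities are 1/3 and 1/11, and the value at c = 2.5 falls between them.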

Similarly, we are able to derive continuous extensions to the average cost function Ω(c, s, R) and the shortfall (or queue-length) distribution $P_c\{Q(\infty) \le s\}$, relaxing the integrality constraints on (c, s). Specifically, we are able to define, for any (c, s) ∈ (R,∞) × [0,∞), a function Π(c, s, R) and another function D(c, s, R), such that for any integer c > R and integer s ≥ 0, Π(c, s, R) = Ω(c, s, R) and $D(c, s, R) = P_c\{Q(\infty) \le s\}$. We omit the detailed expressions of these two functions and their derivations here and defer them to Appendix A (see Lemma 2 and Proposition 2). In the remainder of the paper we focus on solving the problem of seeking the optimal real-valued (c, s) with respect to the continualized Π(c, s, R) and D(c, s, R).

3.3. Approximation to the continuous extension

In order to state our result, we need the following notation. Throughout the paper, let Φ(·) and ϕ(·) denote the standard normal cumulative distribution and density functions, respectively. A real-valued function f(R) is said to be O(g(R)) if $\limsup_{R \to \infty} |f(R)/g(R)| < \infty$, and an $\mathbb{R}^2$-valued function $(f_1(R), f_2(R))$ is said to be O(g(R)) if $f_1(R) = O(g(R))$ and $f_2(R) = O(g(R))$. We write f(R) ∼ g(R) if $\lim_{R \to \infty} f(R)/g(R) = 1$.

Following the asymptotic framework introduced in Section 3.1, we next approximate the continualized performance function and cost objective function. Several approximating approaches have been developed in the literature, including (first-order) diffusion approximation (see Halfin and Whitt (1981); Borst et al. (2004); Mandelbaum and Zeltyn (2009)), corrected diffusion approximation (see Janssen et al. (2008, 2011); Zhang et al. (2012)), and exact bounds (see Janssen et al. (2011); Zan et al. (2013)).

We choose to use the first-order diffusion approximation because of its insightfulness and also due to its remarkable accuracy for our model. Borst et al. (2004) and Janssen et al. (2011) have noted such accuracy in their call center staffing application; our observation of the diffusion approximation’s remarkable accuracy, however, does not follow from theirs, as our objective function involves the whole queue-length distribution whereas theirs involves only the delay probability (equivalent to the queue-length distribution evaluated at one point).

Our first approximation result is on the continuous extension of the steady-state shortfall distribution.

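Proposition 1. Let $c_R$ and $s_R$ denote positive real-valued functions of R. The continuation of the shortfall distribution has a nondegenerate limit, i.e.,

\[
\lim_{R \to \infty} D(c_R, s_R, R) = \alpha, \quad \text{for } \alpha \in (0, 1), \tag{7}
\]

if and only if

\[
c_R \sim R + \beta\sqrt{R} \quad \text{and} \quad s_R \sim R + b\sqrt{R}, \quad \text{with } \beta > 0,\ -\infty < b < \infty, \tag{8}
\]

in which case

\[
\alpha = D^*(\beta, b), \tag{9}
\]

where

\[
D^*(\beta, b) :=
\begin{cases}
[1 - C^*(\beta)]\,\Phi(b)/\Phi(\beta) = \beta\,\Phi(b)\,[\phi(\beta) + \beta\,\Phi(\beta)]^{-1}, & \text{if } b \le \beta, \\[1mm]
1 - C^*(\beta)\, e^{-\beta(b - \beta)}, & \text{if } b > \beta,
\end{cases}
\tag{10}
\]

and

\[
C^*(\beta) := \left[ 1 + \frac{\beta\,\Phi(\beta)}{\phi(\beta)} \right]^{-1}.
\]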

Proposition 1 essentially follows from combining Proposition 1 and Proposition 2 in Halfin and Whitt (1981) (with some additional minor technicality to accommodate the continuation), and we omit the proof. In the statement of this result, the “if and only if” logic, in the same spirit as that in Proposition 1 of Halfin and Whitt (1981), carries an important insight: for the system to have a fill rate that balances service quality and economy, the capacity safety factor must be proportional to $\sqrt{R}$, and likewise the base stock level must equal $R$ plus or minus a “safety” factor proportional to $\sqrt{R}$. Any other capacity and inventory configuration leads to a fill rate of (approximately) either zero or one as $R$ grows large; e.g., if the base stock level is bounded from above by a fixed warehouse size, the fill rate becomes extremely low for high-demand systems and, more precisely, goes to zero as $R \to \infty$. To see this, simply note from (5) that under condition (8),

$$\text{fill rate} = D(c_R, s_R - 1, R) \sim D(c_R, s_R, R), \tag{11}$$

and then Proposition 1 applies. The square-root form of the capacity safety factor is well known as the square-root safety staffing principle (e.g., see Whitt (1992); Tijms (1995)). However, to the best of our knowledge, it has not been established in the literature that the base stock level must have the magnitude of $R + b\sqrt{R}$, where the factor $b$ may be positive or negative. In fact, if the production facility has a different structure, e.g., if it resembles a single server, this result may no longer hold. We shall discuss this comparison further in Section 4.
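As a rough numerical companion to Proposition 1, the limit $D_*(\beta, b)$ in (10) can be evaluated directly. The following is a minimal Python sketch and not part of the original analysis; the function names `phi`, `Phi`, `C_star`, and `D_star` are our own labels for $\varphi$, $\Phi$, $C_*$, and $D_*$, and the $(\beta, b)$ pairs are arbitrary illustrative values.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def C_star(beta):
    """C_*(beta) as defined below (10); requires beta > 0."""
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def D_star(beta, b):
    """Limiting shortfall distribution D_*(beta, b) from (10)."""
    if b <= beta:
        return (1.0 - C_star(beta)) * Phi(b) / Phi(beta)
    return 1.0 - C_star(beta) * math.exp(-beta * (b - beta))


if __name__ == "__main__":
    # Illustrative (beta, b) pairs only; by (11) each value approximates
    # the fill rate when c = R + beta*sqrt(R) and s = R + b*sqrt(R).
    for beta, b in [(0.5, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]:
        print(f"beta={beta:.1f}, b={b:+.1f}: D_* = {D_star(beta, b):.4f}")
```

The two branches of (10) agree at $b = \beta$, where both reduce to $1 - C_*(\beta)$, so the sketch returns a continuous function of $b$.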

Next, we state the diffusion approximation for the continualized long-run average total cost function $\Pi(c, s, R)$.

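Theorem 1 (Diffusion approximation for the average cost function). For any $\beta > 0$ and $b \in (-\infty, \infty)$,

$$\Pi(R + \beta\sqrt{R},\ R + b\sqrt{R},\ R) = (d + w)R + K_*(\beta, b)\sqrt{R} + O(1), \tag{12}$$

where

$$K_*(\beta, b) := \begin{cases} d\beta - pb + \dfrac{w+p}{\beta}\,C_*(\beta) + (p+h)\,D_*(\beta, b)\left[b + \dfrac{\varphi(b)}{\Phi(b)}\right], & \text{if } b \le \beta, \\[2ex] d\beta + hb + \dfrac{w-h}{\beta}\,C_*(\beta) + (p+h)\,[1 - C_*(\beta)]\,\dfrac{\varphi(\beta)}{\beta^2\Phi(\beta)}\,e^{-\beta(b-\beta)}, & \text{if } b > \beta. \end{cases} \tag{13}$$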

The total cost, in light of Theorem 1, consists of three components: an order-$R$ term, proportional to the capacity operating cost rate $d$ and the WIP holding cost rate $w$, constitutes the minimum cost required to keep the system running stably; an order-$\sqrt{R}$ term with coefficient $K_*(\beta, b)$, which we shall examine closely in Section 4, is the cost due to variability; and the third component remains bounded regardless of the demand volume and hence is negligible compared to the first two for large systems (i.e., large $R$ values). The derivation of this approximation is given in Appendix B.
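To make the three-part decomposition concrete, the sketch below evaluates the coefficient $K_*(\beta, b)$ in (13) and the two leading terms of (12). It is only an illustration: the function names and the cost rates used in the demonstration are hypothetical, and the bounded $O(1)$ remainder is simply dropped.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def C_star(beta):
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def D_star(beta, b):
    if b <= beta:
        return (1.0 - C_star(beta)) * Phi(b) / Phi(beta)
    return 1.0 - C_star(beta) * math.exp(-beta * (b - beta))


def K_star(beta, b, d, w, p, h):
    """Coefficient of sqrt(R) in (12), following the two branches of (13)."""
    Cs = C_star(beta)
    if b <= beta:
        return (d * beta - p * b + (w + p) / beta * Cs
                + (p + h) * D_star(beta, b) * (b + phi(b) / Phi(b)))
    tail = (1.0 - Cs) * phi(beta) / (beta ** 2 * Phi(beta))
    return (d * beta + h * b + (w - h) / beta * Cs
            + (p + h) * tail * math.exp(-beta * (b - beta)))


def approx_cost(R, beta, b, d, w, p, h):
    """First two terms of (12); the O(1) term is ignored."""
    return (d + w) * R + K_star(beta, b, d, w, p, h) * math.sqrt(R)


if __name__ == "__main__":
    # Hypothetical cost rates, chosen only for illustration.
    d, w, p, h = 1.0, 0.5, 2.0, 1.0
    for R in (25, 100, 400):
        print(R, round(approx_cost(R, 1.0, 0.5, d, w, p, h), 2))
```

With $\beta$ and $b$ held fixed, the variability term grows like $\sqrt{R}$ while the baseline $(d + w)R$ grows linearly, so the share of cost due to variability shrinks as the system scales up.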

3.4. Solving the approximating problem: square-root capacity and inventory sizing rule

Theorem 1 suggests that in order to approximate the cost-optimal capacity and base-stock levels, we can simply round up $(c_*, s_*)$, where

$$c_* := R + \beta_*\sqrt{R} \quad\text{and}\quad s_* := R + b_*\sqrt{R}, \tag{14}$$

with $(\beta_*, b_*)$ being the minimizer of $K_*(\beta, b)$.


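More specifically, treating $\beta$ as fixed and setting to zero the partial derivative of $K_*(\beta, b)$ with respect to $b$ yields the optimal value of $b$ under the fixed $\beta$, given by

$$w_*(\beta) := \begin{cases} \beta + \dfrac{1}{\beta}\ln\left[\dfrac{C_*(\beta)\,(p+h)}{h}\right], & \text{if } C_*(\beta) > \dfrac{h}{p+h}, \\[2ex] \Phi^{-1}\left(\dfrac{p\,\Phi(\beta)}{(p+h)\,[1 - C_*(\beta)]}\right), & \text{if } C_*(\beta) \le \dfrac{h}{p+h}. \end{cases} \tag{15}$$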

Therefore, the joint minimization of $K_*(\beta, b)$ may be achieved by first obtaining

$$\beta_* := \arg\min_{\beta > 0} K_*(\beta, w_*(\beta)), \tag{16}$$

and, subsequently, setting $b_* := w_*(\beta_*)$. We refer to $(\beta_*, b_*)$ obtained in this way as the square-root rule.

It is worth pointing out that for any $\beta > 0$,

$$D_*(\beta, w_*(\beta)) = \frac{p}{p+h}. \tag{17}$$

This characterization nicely corresponds to the critical fractile solution mentioned in Section 2, in light of Proposition 1.
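The two-step procedure in (15) and (16) can be sketched numerically as follows. This is a minimal illustration rather than the computation behind any reported results: the one-dimensional minimization in (16) is replaced by a crude grid search over $\beta \in (0, 5]$ (an assumption of ours), the cost rates are hypothetical, and all function names are our own.

```python
import math
from statistics import NormalDist

_N = NormalDist()           # standard normal distribution
phi, Phi = _N.pdf, _N.cdf   # density and distribution function


def C_star(beta):
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def D_star(beta, b):
    if b <= beta:
        return (1.0 - C_star(beta)) * Phi(b) / Phi(beta)
    return 1.0 - C_star(beta) * math.exp(-beta * (b - beta))


def K_star(beta, b, d, w, p, h):
    """Coefficient K_*(beta, b) from (13)."""
    Cs = C_star(beta)
    if b <= beta:
        return (d * beta - p * b + (w + p) / beta * Cs
                + (p + h) * D_star(beta, b) * (b + phi(b) / Phi(b)))
    tail = (1.0 - Cs) * phi(beta) / (beta ** 2 * Phi(beta))
    return (d * beta + h * b + (w - h) / beta * Cs
            + (p + h) * tail * math.exp(-beta * (b - beta)))


def w_star(beta, p, h):
    """Optimal inventory safety factor for a fixed beta, equation (15)."""
    Cs = C_star(beta)
    if Cs > h / (p + h):
        return beta + math.log(Cs * (p + h) / h) / beta
    return _N.inv_cdf(p * Phi(beta) / ((p + h) * (1.0 - Cs)))


def square_root_rule(d, w, p, h, beta_max=5.0, step=1e-3):
    """Grid-search stand-in for (16); returns (beta_star, b_star)."""
    grid = [step * (i + 1) for i in range(int(beta_max / step))]
    beta_s = min(grid, key=lambda x: K_star(x, w_star(x, p, h), d, w, p, h))
    return beta_s, w_star(beta_s, p, h)


if __name__ == "__main__":
    d, w, p, h = 1.0, 0.5, 2.0, 1.0   # hypothetical cost rates
    R = 100.0                         # hypothetical offered load
    beta_s, b_s = square_root_rule(d, w, p, h)
    c_s = math.ceil(R + beta_s * math.sqrt(R))   # rounded-up c_* from (14)
    s_s = math.ceil(R + b_s * math.sqrt(R))      # rounded-up s_* from (14)
    print(beta_s, b_s, c_s, s_s, D_star(beta_s, b_s))
```

The last printed value should be close to $p/(p+h)$, in line with (17). A finer grid or a proper one-dimensional optimizer could replace the grid search if more precision in $\beta_*$ is needed.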

To understand the efficacy of the square-root rule, we shall compare $(c_*, s_*)$ against $(c_{\mathrm{opt}}, s_{\mathrm{opt}})$, with

$$(c_{\mathrm{opt}}, s_{\mathrm{opt}}) := \arg\min_{(c,s) \in (R,\infty)\times[0,\infty)} \Pi(c, s, R). \tag{18}$$

Our next result states that the square-root rule is asymptotically optimal as the workload grows indefinitely.

Theorem 2 (Asymptotic optimality of square-root rule).

$$(c_*, s_*) \sim (c_{\mathrm{opt}}, s_{\mathrm{opt}}), \tag{19}$$

$$\Pi(c_*, s_*, R) \sim \Pi(c_{\mathrm{opt}}, s_{\mathrm{opt}}, R). \tag{20}$$

Theorem 2 implies that for a problem instance with a high demand rate, both the relative error made by the square-root rule and its cost optimality gap become negligible, and hence one may learn the behavior of the optimal capacity-inventory level by examining $(c_*, s_*)$, or simply $(\beta_*, b_*)$. This is also the main approach that we take in our discussion of the qualitative insights both later in this section and in the next section. The proof of Theorem 2 is contained in Appendix C.

We next consider the other important performance metric defined in Section 2, namely, the fill rate, which measures the customer service level. Our first observation is that both the exact and approximate cost-optimal solutions lead to a fill rate of about $p/(p+h)$ for large $R$ values. This is true because, by (11) and (17), the fill rate under these two solutions satisfies

$$D(c_{\mathrm{opt}}, s_{\mathrm{opt}} - 1, R) \sim D(c_*, s_* - 1, R) \sim \frac{p}{p+h}. \tag{21}$$

This is consistent with the observation by Zipkin (2000) (p. 46) that although the cost objective function does not restrict or penalize the fill rate (which can be viewed as a cost due to the occurrence of a backorder, rather than its continuation in time), the cost-optimal policy controls it anyway as a by-product.

Now suppose that the system manager wishes to achieve a target fill rate of $\delta \in (0, 1)$. Proposition 1 and relation (11) together imply that, for a capacity level $c = R + \beta\sqrt{R}$ and base-stock level $s = R + b\sqrt{R}$, the fill rate may be approximated by $D_*(\beta, b)$. It is therefore instructive to characterize those pairs $(\beta, b)$ such that $D_*(\beta, b) = \delta$, or in other words, to understand the behavior of the level curves of the function $D_*(\beta, b)$. We do so by considering $b_\delta(\beta)$, the solution to $D_*(\beta, b_\delta(\beta)) = \delta$. The following lemma is illuminating and its proof is in Appendix C.

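Lemma 1.

$$\lim_{\beta\to\infty} b_\delta(\beta) = \Phi^{-1}(\delta), \tag{22}$$

$$\lim_{\beta\to 0}\left[b_\delta(\beta) - \frac{-\ln(1-\delta)}{\beta}\right] = -\frac{\sqrt{2\pi}}{2}. \tag{23}$$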

Relation (22) is not surprising. As $\beta \to \infty$, capacity becomes unlimited and the production facility should behave like an infinite-server queue, which has a Gaussian approximation $P_\infty\{Q(\infty) \le s\} \approx \Phi\left(\frac{s - R}{\sqrt{R}}\right)$ (see, e.g., Borst et al. (2004)). As the capacity safety factor $\beta$ decreases to zero, the system manager must increase the base-stock level in order to maintain the desired fill rate $\delta$. The result (23) suggests that the required inventory and capacity safety factors are approximately inversely proportional to one another when the capacity hedge is low. Figure 1 plots level curves of the function $D_*(\beta, b)$, which provide a precise quantification of the substitution effect between capacity and inventory in the multi-server make-to-stock queueing model.

Also, Figure 1 shows that both limits in Lemma 1 seem to converge fast. In fact, each level curve can be well approximated by a nearly vertical straight line and a nearly horizontal one pasted together. This observation suggests that for high-demand-rate systems there is a large region of parameter settings for which the marginal improvement in the fill rate due to either a capacity or an inventory increase is very small, and hence the capacity-inventory substitution effect is weak. More specifically, if the capacity hedge is low, any substantial improvement in the fill rate needs to come from a capacity increase; on the other hand, once the capacity hedge reaches a threshold (which is fairly low, somewhere between 1.5 and 2 according to Figure 1), holding more inventory plays a much more important role in improving the fill rate.
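The level curves discussed above can be traced by inverting (10) in $b$ for a given $\beta$ and target $\delta$. The sketch below does this in closed form and compares the result with the two limits of Lemma 1; the function name `b_delta` and the sample values of $\beta$ and $\delta$ are our own illustrative choices.

```python
import math
from statistics import NormalDist

_N = NormalDist()           # standard normal distribution
phi, Phi = _N.pdf, _N.cdf


def C_star(beta):
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def D_star(beta, b):
    if b <= beta:
        return (1.0 - C_star(beta)) * Phi(b) / Phi(beta)
    return 1.0 - C_star(beta) * math.exp(-beta * (b - beta))


def b_delta(beta, delta):
    """Solve D_*(beta, b) = delta for b by inverting the branches of (10)."""
    Cs = C_star(beta)
    if delta > 1.0 - Cs:
        # Target exceeds D_*(beta, beta), so the solution lies in b > beta.
        return beta + math.log(Cs / (1.0 - delta)) / beta
    return _N.inv_cdf(delta * Phi(beta) / (1.0 - Cs))


if __name__ == "__main__":
    delta = 0.9
    # Large beta: b_delta approaches Phi^{-1}(delta), as in (22).
    print(b_delta(8.0, delta), _N.inv_cdf(delta))
    # Small beta: b_delta + ln(1 - delta)/beta approaches -sqrt(2*pi)/2, as in (23).
    beta = 0.001
    print(b_delta(beta, delta) + math.log(1.0 - delta) / beta,
          -math.sqrt(2.0 * math.pi) / 2.0)
```

Evaluating `b_delta` on a grid of $\beta$ values for several $\delta$ gives one way to trace level curves of $D_*$.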

[Figure 1 omitted: contour plot with $\beta$ from 0 to 5 on the horizontal axis and $b$ from $-3$ to 3 on the vertical axis; level curves labeled 0.01, 0.11, 0.21, ..., 0.91.]

Figure 1: Contour plot of $D_*(\beta, b)$ at values from 0.01 to 0.91, for $(\beta, b) \in (0, 5) \times (-3, 3)$.


4. Discussion and Qualitative Insights

In this section we further discuss the implications of our analytical results and highlight some qualitative insights. Throughout this section we shall compare our results to the counterparts in a single-server make-to-stock queue setting whenever appropriate.

4.1. Single-server model is not applicable in multi-server setting

To start with, we illustrate that making a wrong assumption about the production facility structure can lead to severe errors in the optimal solution, especially in the inventory decision.

First note that for the multi-server model we have conducted the analysis with respect to the offered load $R$, where $R = \lambda/\mu$ and the individual server capacity $\mu$ is a constant. Under the single-server model assumption, however, there is no constant parameter $\mu$, and the production capacity (or more precisely, the service rate of this single server) is the decision variable. Therefore, in the remainder of this section we relate our discussion to the customer order arrival rate $\lambda$, a common parameter in both scenarios.

The M/M/1 make-to-stock model without a WIP cost is solved in Example 1 of Bradley and Glynn (2002). Denote by $(c_{MM1}, s_{MM1})$ the optimal solution to the M/M/1 model; in particular, note that $c_{MM1}$ represents the optimal rate of the single server, not the number of servers (which is fixed at one). The approximation for the optimum reads

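$$c_{MM1} \approx \lambda + \beta_{MM1}\sqrt{\lambda}, \tag{24}$$

$$s_{MM1} \approx \frac{2\sqrt{\lambda} + \beta_{MM1}}{2\beta_{MM1}}\,\ln\left(\frac{p+h}{h}\right) \overset{\lambda\to\infty}{\sim} w_{*,2}(\beta_{MM1})\sqrt{\lambda}, \tag{25}$$

where

$$\beta_{MM1} := \left[\frac{h}{d}\,\ln\left(\frac{p+h}{h}\right)\right]^{1/2}, \tag{26}$$

$$w_{*,2}(\beta) := \frac{1}{\beta}\,\ln\left(\frac{p+h}{h}\right). \tag{27}$$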

Therefore, the optimal capacity decision in the single-server model, as in the multi-server model, is to set the capacity level to the order arrival rate plus a square-root safety factor, i.e., $\lambda + \beta_{MM1}\sqrt{\lambda}$ for some positive $\beta_{MM1}$ determined by the cost parameters. However, the optimal base stock level does not have such a form. Specifically, the base stock level need not match the order arrival rate; rather, it should only be proportional to $\sqrt{\lambda}$. Therefore, assuming a single-server model when the facility actually resembles a multi-server one can lead to severe under-stocking, and in particular, the error in the base stock level grows with $\lambda$.
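For a sense of scale, the single-server approximations (24) through (27) can be evaluated directly. The sketch below is purely illustrative: the cost rates are hypothetical and the function names are ours.

```python
import math


def beta_mm1(d, p, h):
    """Capacity safety factor for the M/M/1 model, equation (26)."""
    return math.sqrt((h / d) * math.log((p + h) / h))


def c_mm1(lam, d, p, h):
    """Approximate optimal service rate of the single server, equation (24)."""
    return lam + beta_mm1(d, p, h) * math.sqrt(lam)


def s_mm1(lam, d, p, h):
    """Approximate optimal base stock level, first expression in (25)."""
    beta = beta_mm1(d, p, h)
    return (2.0 * math.sqrt(lam) + beta) / (2.0 * beta) * math.log((p + h) / h)


def w_star2(beta, p, h):
    """Leading coefficient w_{*,2}(beta), equation (27)."""
    return math.log((p + h) / h) / beta


if __name__ == "__main__":
    d, p, h = 1.0, 2.0, 1.0   # hypothetical cost rates
    for lam in (100.0, 10_000.0, 1_000_000.0):
        s = s_mm1(lam, d, p, h)
        lead = w_star2(beta_mm1(d, p, h), p, h) * math.sqrt(lam)
        # The base stock grows like sqrt(lam), far below lam itself.
        print(lam, round(c_mm1(lam, d, p, h), 1), round(s, 1), round(lead, 1))
```

The printed base stock levels stay on the order of $\sqrt{\lambda}$, whereas a multi-server base stock level of the form $R + b\sqrt{R}$ is on the order of $\lambda$ for a fixed $\mu$; the gap between the two widens as $\lambda$ grows.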

To recapitulate, in the operational setting of capacity-inventory joint management, the structure of the production facility can have significant effects on the inventory decision. There are also other clear differences between the multi- and single-server models. The WIP level in the multi-server model is of the order of $\lambda$, whereas that in the single-server model is of the order of $\sqrt{\lambda}$. Also, for large $\lambda$ the probability that all $c$ servers are busy can be set to any level between 0 and 1 by simply adjusting $\beta$ or $b$ in our model, equivalent to an order-$\sqrt{\lambda}$ capacity and inventory change. By contrast, for the single-server model the probability that the production line is busy converges to 1 as $\lambda \to \infty$ and, to lower this probability to a level away from one, a significant capacity investment of order $\lambda$ is required (e.g., see p. 568 to p. 569 of Halfin and Whitt (1981), where their “first procedure” corresponds to our single-server approximation and their “second procedure” is the one with order-$\lambda$ capacity slack, more recently called the quality-driven regime; e.g., see Zhang and Zwart (2013)).

4.2. Choosing between single-server and multi-server in practice

The contrast between M/M/1 and M/M/c models also leads to a useful guideline on choosing between speeding up a single production line versus opening multiple production lines, when the system manager, faced with increasing demand, has both options to consider.

The total amount of inventory in the system has different orders of magnitude in these two models. To begin with, we know from standard formulas in queueing theory that the average WIP level $E_c[Q(\infty)]$ in the multi-server model has the same order as the order arrival rate $\lambda$, while the average finished-goods inventory level $E_c[(s - Q(\infty))^+]$ grows according to $\sqrt{\lambda}$. Thus, in the multi-server model, the average WIP level dominates the total amount of inventory in the system. On the other hand, in a single-server model the average WIP level is only of the order of $\sqrt{\lambda}$ and, in addition, the average finished-goods inventory is also of the order of $\sqrt{\lambda}$, since (1) it is known to be of the same order of magnitude as the base stock level (see Bradley and Glynn (2002) and Wein (1992), both of which observe that at optimality, inventory costs are simply equal to $h$ times the optimal base stock level), and (2) the base stock level has the order of $\sqrt{\lambda}$, as pointed out in the preceding subsection. Therefore, the total amount of inventory in the single-server model grows according to the square root of the arrival rate, $\sqrt{\lambda}$, which is significantly slower than the linear growth rate in the multi-server model.

This analytical observation implies that, from the perspective of cost efficiency, the manager of a high-volume system would prefer speeding up a single-line production facility (if feasible), as opposed to installing additional lines in parallel, due to significant savings on the inventory cost. In particular, for capital-intensive industries with high WIP and inventory cost rates, a single faster production line (or fewer of them) should be preferable to many slower production lines in parallel. This is indeed consistent with the structures of most semiconductor fabrication plants and automobile factories.

Finally, we remark that we have assumed a linear capacity cost structure with rate $d$ for simplicity to derive insights and that this assumption may not hold in practice. The choice between single- and multi-server facilities should depend heavily on the actual cost structure, which can differ qualitatively between these two types of manufacturing models. For example, it is reasonable to define the cost for the single-server facility as infinity for very high capacity levels; that is, there is simply an upper bound on the speed of a single production line and any speed (synonymous with ‘capacity’ in this context) above that level is not feasible. For multi-server facilities, it may well be the case that the capacity cost function is strictly concave or convex depending on the nature of the servers.

4.3. Effects of uncertainty

In the absence of uncertainty, a demand-$R$ system can be served by capacity level $R$ and base stock level $R$ at the minimum capacity cost $dR$, WIP cost $wR$, and finished goods inventory cost $hR$ with 100% fill rate and zero backordering cost (one may think of this as a D/D/c model with $c = R$). In this subsection, we investigate the effects of uncertainty using the approximation result that we have obtained for the M/M/c model. We begin by separating the coefficient of the $\sqrt{R}$ term, $K_*(\beta, b)$, appearing in Theorem 1 into four parts, each corresponding to one of our four costs: capacity cost, WIP cost, finished goods holding cost, and backordering cost. Doing so, we obtain

$$K_*(\beta, b) = d \cdot K_d(\beta) + w \cdot K_w(\beta) + p \cdot K_p(\beta, b) + h \cdot K_h(\beta, b), \tag{28}$$


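where $K_d(\beta) := \beta$, $K_w(\beta) := C_*(\beta)/\beta$,

$$K_p(\beta, b) := \begin{cases} \dfrac{C_*(\beta)}{\beta} - b + D_*(\beta, b)\left[b + \dfrac{\varphi(b)}{\Phi(b)}\right], & \text{if } b \le \beta, \\[2ex] [1 - C_*(\beta)]\,\dfrac{\varphi(\beta)}{\beta^2\Phi(\beta)}\,e^{-\beta(b-\beta)}, & \text{if } b > \beta, \end{cases} \tag{29}$$

and

$$K_h(\beta, b) = \begin{cases} D_*(\beta, b)\left[b + \dfrac{\varphi(b)}{\Phi(b)}\right], & \text{if } b \le \beta, \\[2ex] b - \dfrac{C_*(\beta)}{\beta} + [1 - C_*(\beta)]\,\dfrac{\varphi(\beta)}{\beta^2\Phi(\beta)}\,e^{-\beta(b-\beta)}, & \text{if } b > \beta. \end{cases} \tag{30}$$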

One may easily verify a diffusion-approximation-level conservation law:

$$K_w(\beta) + K_h(\beta, b) = b + K_p(\beta, b), \tag{31}$$

which results from the base-stock policy, i.e., at each point in time,

WIP + finished-goods inventory = base stock level $s$ + backlog level.
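The conservation law can also be checked numerically without relying on the closed forms. The sketch below rests on an interpretation of ours, not a statement from the text: $K_h(\beta, b)$ and $K_p(\beta, b)$ are taken to be $E[(b - X)^+]$ and $E[(X - b)^+]$ for a random variable $X$ with distribution function $D_*(\beta, \cdot)$, computed by midpoint-rule integration. For $b > \beta$ the closed form of $K_p$ is written as $[1 - C_*(\beta)]\varphi(\beta)e^{-\beta(b-\beta)}/(\beta^2\Phi(\beta))$, which equals $C_*(\beta)e^{-\beta(b-\beta)}/\beta$ and is the form implied by (13), (30), and (31). All function names are hypothetical.

```python
import math


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def C_star(beta):
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def D_star(beta, b):
    if b <= beta:
        return (1.0 - C_star(beta)) * Phi(b) / Phi(beta)
    return 1.0 - C_star(beta) * math.exp(-beta * (b - beta))


def K_w(beta):
    return C_star(beta) / beta


def K_p_closed(beta, b):
    Cs = C_star(beta)
    if b <= beta:
        return Cs / beta - b + D_star(beta, b) * (b + phi(b) / Phi(b))
    return (1.0 - Cs) * phi(beta) / (beta ** 2 * Phi(beta)) * math.exp(-beta * (b - beta))


def K_h_closed(beta, b):
    if b <= beta:
        return D_star(beta, b) * (b + phi(b) / Phi(b))
    return b - C_star(beta) / beta + K_p_closed(beta, b)


def _midpoint(f, lo, hi, n=50_000):
    """Composite midpoint rule on [lo, hi] with n cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


def K_h_numeric(beta, b):
    # E[(b - X)^+] = integral of the distribution function over (-inf, b],
    # truncated 12 units below b where the integrand is negligible.
    return _midpoint(lambda x: D_star(beta, x), b - 12.0, b)


def K_p_numeric(beta, b):
    # E[(X - b)^+] = integral of the tail over [b, inf), truncated where
    # the exponential tail has decayed by a factor of e^{-40}.
    upper = max(b, beta) + 40.0 / beta
    return _midpoint(lambda x: 1.0 - D_star(beta, x), b, upper)


if __name__ == "__main__":
    for beta, b in [(1.0, 0.0), (1.0, 2.0), (0.5, -1.0), (2.0, 3.0)]:
        lhs = K_w(beta) + K_h_numeric(beta, b)
        rhs = b + K_p_numeric(beta, b)
        print(beta, b, round(lhs, 5), round(rhs, 5))
```

Under this interpretation, (31) amounts to the statement that $X$ has mean $C_*(\beta)/\beta = K_w(\beta)$.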

The positivity of the capacity safety factor $\beta$ suggests that in the presence of uncertainty an additional capacity investment of $d\beta\sqrt{R}$ is always necessary to ensure stability. Stockout, equivalent to backordering in our model, is inevitable, as quantified by $K_p(\beta, b)\sqrt{R}$. Figure 2 plots $K_p(\beta, b)$ and shows that with one of the two factors $\beta$ or $b$ fixed, it is decreasing in the other one. This observation is consistent with the intuition that an increase in either the production capacity or the base stock level should improve the system's responsiveness to customer demand and thus result in a lower average backorder level. The WIP cost term is relatively straightforward: once the capacity hedge $\beta$ is fixed, so is the WIP cost; the hedge against uncertainty $K_w(\beta)\sqrt{R}$ is decreasing in $\beta$.

The effect of uncertainty on finished-goods inventory is more intricate. First, since the inventory hedge $b$ may be strictly negative, the base stock level, as a decision variable, may be chosen to be lower than the level without uncertainty. However, the overall investment in finished-goods inventory holding, measured by the average finished-goods inventory level, does need to be always higher than $hR$ in order to hedge against uncertainty, even if $b < 0$. To see this, note that when $b < 0$ and thus $b < \beta$,

$$K_h(\beta, b) = D_*(\beta, b)\left[b + \frac{\varphi(b)}{\Phi(b)}\right]. \tag{32}$$


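[Figure 2 omitted: curves plotted against $\beta$ from 0.5 to 2.0, one for each of $b = 3, 2, 1, 0.5, 0, -0.5, -1, -2, -3$.]

Figure 2: $K_p(\beta, b)$ with fixed $b$ values.

[Figure 3 omitted: curves plotted against $\beta$ from 0.5 to 2.0, one for each of $b = 3, 2, 1, 0.5, 0, -0.5, -1, -2, -3$.]

Figure 3: $K_h(\beta, b)$ with fixed $b$ values.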

It is easy to check algebraically that $D_*(\beta, b) > 0$ for any $(\beta, b)$, and for any $b < 0$, $b + \varphi(b)/\Phi(b) \ge 0$ holds since $\Phi(-x) \le \varphi(x)/x$ for any $x > 0$ (see Theorem 1.2.3 on p. 11 in Durrett (2010)). We therefore have that (32) is non-negative. The plot of $K_h(\beta, b)$, increasing in $\beta$ and $b$, is given in Figure 3.

It is worth noting that the cost-optimal inventory factor $b_*$ should go to negative infinity as the ratio between $h$ and $\max\{d, p, w\}$ goes to infinity (similar to Theorems 7.1 and 8.1 in Borst et al. (2004)). This is the case in which finished-goods inventory holding is by far the most expensive and thus should be avoided.

Remark 1. It can be illuminating to view backordering as a third operational lever, along with capacity and inventory; in other words, investment can be made to defray backordering cost, just as can be done for capacity and inventory. This view, articulated by Spearman (2014), states that time, inventory, and capacity constitute three possible forms of buffer against demand and supply variabilities. In our model, the backorder level, or the degree to which this third operational lever is utilized, is actually uniquely determined by the capacity and base stock levels. One way to see this interdependence is to note that with $\beta$ (or $b$) fixed, $K_p(\beta, b)$ is a bijective function of $b$ (or $\beta$), and more specifically, strictly monotone. Therefore, among these three forms of buffers, if any two of them are chosen, the remaining third one is also already determined. Spearman (2014) also poses the interesting and important question of how the three forms of buffers interact, which is exactly answered by the plot of $K_p(\beta, b)$ in Figure 2. In Section 6, we shall discuss an extension of our main model, in which an independent control on this third lever of backordering (or time) is added.


4.4. Effects between capacity and inventory

We have already seen in Sections 4.1 and 4.2 that the structure of the production facility can have significant impacts on the inventory. We shall start this section by discussing how the capacity decision influences the inventory level under our specific structural assumption of the facility, i.e., the multi-server queueing model.

Our first observation is that increasing capacity reduces the total amount of inventory in the system (WIP plus finished-goods inventory), since the total inventory is given by $R + [b + K_p(\beta, b)]\sqrt{R}$ due to the relation (31) and $K_p(\beta, b)$ is decreasing in $\beta$. In addition, the production capacity level affects the way in which inventory is distributed in the pipeline. Specifically, as we already noted, $K_w(\beta)$ is decreasing in $\beta$; on the other hand, $K_h(\beta, b)$ is increasing in $\beta$. Therefore, since the sum of $K_w(\beta)$ and $K_h(\beta, b)$ is decreasing in $\beta$, it follows that increasing capacity pushes a larger percentage of the total inventory towards the retailer.

We discussed the substitution effect between capacity and inventory with respect to the fill rate at the end of Section 3.4. Next we expand that discussion, focusing on the relationship between capacity and inventory at optimality of the average cost function.

Recall that $w_*(\beta)$, as defined in (15), gives the optimal base-stock safety factor corresponding to a particular capacity safety factor $\beta$ (and conversely $w_*^{-1}(\cdot)$ characterizes the inverse of this relationship). It is easy to recognize that $w_*(\beta) = b_{\delta_*}(\beta)$, where $\delta_* := \frac{p}{p+h}$, due to the characterization of $w_*(\cdot)$ in (17) and the definition of the function $b_\delta(\cdot)$ in Section 3.4. Hence we may apply Lemma 1 to obtain that

$$\lim_{\beta\to\infty} w_*(\beta) = \Phi^{-1}\left(\frac{p}{p+h}\right), \tag{33}$$

$$\lim_{\beta\to 0}\left[w_*(\beta) - \frac{1}{\beta}\ln\left(\frac{p+h}{h}\right)\right] = -\frac{\sqrt{2\pi}}{2}. \tag{34}$$

That is, if the capacity variability hedge is high, then the $c$-server facility behaves similarly to an infinite-server queue and the actual capacity level becomes irrelevant. In this case, the steady-state shortfall distribution is approximately normal, i.e., $D_*(\beta, b) \approx \Phi(b)$. On the other hand, from (34) we observe that the optimal inventory and capacity safety factors are inversely related to one another when the capacity level is low. We plot $w_*(\cdot)$ in Figure 4 for varying values of $h/p$. For a fixed value of $h/p$, $w_*(\cdot)$ is clearly decreasing in $\beta$, i.e., the optimal base stock level is decreasing in the capacity level. It is particularly worth noting for which values of $h/p$ and $\beta$ the curve $w_*(\beta)$ falls below the horizontal axis, i.e., the inventory hedge is negative. Figure 4 also demonstrates that the rate of convergence in (33) is very fast. As long as $\beta$ is not too small, the infinite-server normal approximation seems to be very accurate. In addition, the optimal inventory hedge is bounded from below by that in the infinite-server case, $\underline{b} := \Phi^{-1}\left(\frac{p}{p+h}\right)$, and thus the optimal base stock level is at least $R + \underline{b}\sqrt{R}$, where $\underline{b}$ is less than zero if $p < h$.

[Figure 4 omitted: curves plotted against $\beta$ from 0.5 to 2.0, one for each of $h = 10, 5, 2, 1, 0.5, 0.2, 0.1$.]

Figure 4: $w_*(\beta)$ with $p = 1$ and $h$ taking on different values.

4.5. Economies of Scale

The square-root rule, often referred to as square-root (safety) staffing in the context of service systems, quantifies the economies of scale (EoS) that can be achieved by combining several isolated systems into a larger one (e.g., see the discussion on p. 200 to 201 in Tijms (1995)). In order to illustrate this point in our make-to-stock queueing model, consider two identical systems, both with the same set of cost parameters $(d, w, p, h)$ and hence, by the results in the previous section, the same (approximately) optimal safety factors $(\beta_*, b_*)$, where $\beta_*$ must be strictly positive in order for the system to be stable. Moreover, assume that both systems have an offered load of $R$. If these two systems are operated independently, then, by Theorem 2, the optimal total capacity level is approximately $2R + 2\beta_*\sqrt{R}$. On the other hand, if they are combined, the optimal capacity size is approximately $2R + \beta_*\sqrt{2R}$, which equates to a saving of $(2 - \sqrt{2})\beta_*\sqrt{R}$ servers. More generally, an $n$-fold increase in the offered load $R$ requires that the capacity variability hedge increase by only $\sqrt{n}$ times.


Although the pooling of two systems will always result in a decrease in the optimal capacity level, it may lead to an increase in the base stock level in order to compensate. Specifically, this is the case when $b_*$ is negative. This can for instance happen if $\frac{p}{p+h} < 1/2$ and $\beta_*$ is sufficiently large (see Figure 4). In such a case, combining two systems, each with an offered load $R$ and identical cost parameters, will result in an increase in the cost-optimal base stock level by approximately $(2 - \sqrt{2})|b_*|\sqrt{R}$ units. Nevertheless, although the base stock level may rise due to pooling, the average finished-goods inventory level will always decrease since $K_h(\beta, b) > 0$ always holds (even when $b < 0$, as mentioned in Section 4.3), and consequently EoS does still apply to the average inventory level. Finally, we note that in contrast to our multi-server model, EoS holds for both the capacity and base stock levels in the single-server model, as seen from expressions (24), (25), and (26).
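The pooling arithmetic above is easy to reproduce. The following sketch compares two separately operated systems with one pooled system under the square-root rule; the safety factors and offered load in the demonstration are hypothetical values chosen only to show both signs of the base stock effect, and the function names are ours.

```python
import math


def pooled_capacity_saving(R, beta):
    """Servers saved by pooling two systems of offered load R each."""
    separate = 2.0 * (R + beta * math.sqrt(R))
    pooled = 2.0 * R + beta * math.sqrt(2.0 * R)
    return separate - pooled            # equals (2 - sqrt(2)) * beta * sqrt(R)


def pooled_base_stock_change(R, b):
    """Change in the total base stock level caused by pooling (pooled minus separate)."""
    separate = 2.0 * (R + b * math.sqrt(R))
    pooled = 2.0 * R + b * math.sqrt(2.0 * R)
    return pooled - separate            # equals -(2 - sqrt(2)) * b * sqrt(R)


if __name__ == "__main__":
    R = 100.0
    print(pooled_capacity_saving(R, beta=1.0))    # capacity always decreases
    print(pooled_base_stock_change(R, b=0.8))     # b > 0: base stock falls
    print(pooled_base_stock_change(R, b=-0.5))    # b < 0: base stock rises
```

A negative inventory safety factor flips the sign of the base stock change, which is the case discussed in the preceding paragraph.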

4.5.1. Asymptotically negligible service degradation

The EoS principle is also applicable to the service level performance when the production and inventory decisions are not made by a single manager of the whole system. Specifically, the service degradation, i.e., the degradation in the fill rate caused by delegation of control (see Plambeck and Zenios (2003)), is asymptotically negligible.

Consider the retailer and the manufacturer (i.e., the production facility), who act independently. The manufacturer, as the agent, first determines the number of servers, say, $c_{\mathrm{pa}}$, based on its own interests (for example, to minimize its operational cost subject to some constraint on its response time to the retailer). Then the retailer, as the principal, with the knowledge of the number of servers $c_{\mathrm{pa}}$ installed by the manufacturer, determines the base stock level $s_{\mathrm{pa}}$ to minimize its own costs, including the inventory holding cost $h \cdot E_{c_{\mathrm{pa}}}[(s - Q(\infty))^+]$ and the backorder cost $p \cdot E_{c_{\mathrm{pa}}}[(Q(\infty) - s)^+]$. The resulting service degradation, in light of (5) and (11), is given by

$$P_{c_{\mathrm{opt}}}\{Q(\infty) \le s_{\mathrm{opt}} - 1\} - P_{c_{\mathrm{pa}}}\{Q(\infty) \le s_{\mathrm{pa}} - 1\} \sim P_{c_{\mathrm{opt}}}\{Q(\infty) \le s_{\mathrm{opt}}\} - P_{c_{\mathrm{pa}}}\{Q(\infty) \le s_{\mathrm{pa}}\} = \frac{p}{p+h} - \frac{p}{p+h} = 0, \tag{35}$$

i.e., the service degradation is asymptotically negligible as $R \to \infty$.

For systems with small or moderate $R$ values, we may further characterize the service degradation. Specifically,

$$\text{service degradation} = P_{c_{\mathrm{opt}}}\{Q(\infty) \le s_{\mathrm{opt}} - 1\} - P_{c_{\mathrm{pa}}}\{Q(\infty) \le s_{\mathrm{pa}} - 1\} = P_{c_{\mathrm{pa}}}\{Q(\infty) = s_{\mathrm{pa}}\} - P_{c_{\mathrm{opt}}}\{Q(\infty) = s_{\mathrm{opt}}\}, \tag{36}$$

where (36) is $O(R^{-1/2})$ since each of the two terms is so. More precisely, with $\beta_{\mathrm{pa}} := (c_{\mathrm{pa}} - R)/\sqrt{R}$ and $b_{\mathrm{pa}} := (s_{\mathrm{pa}} - R)/\sqrt{R}$, we may apply Proposition 2 of Halfin and Whitt (1981) to write (36) as

$$\text{service degradation} \sim [d(\beta_{\mathrm{pa}}, b_{\mathrm{pa}}) - d(\beta_*, b_*)]\,R^{-1/2}, \tag{37}$$

where

$$d(\beta, b) := \begin{cases} [1 - C_*(\beta)]\,\varphi(b)/\Phi(\beta) = \beta\varphi(b)\,[\varphi(\beta) + \beta\Phi(\beta)]^{-1}, & \text{if } b \le \beta, \\ 1 + \beta C_*(\beta)\,e^{-\beta(b-\beta)}, & \text{if } b > \beta. \end{cases} \tag{38}$$

5. Accuracy of the square-root rule: corrected diffusion approximation and numerical illustration

The accuracy of the proposed square-root rule is only asymptotic, as stated precisely in Theorem 2. Therefore, the usefulness of the qualitative insights gleaned from our analysis in the previous section hinges upon the diffusion approximation and the square-root rule being accurate even for small or moderate $R$ values, e.g., whether the $O(1)$ error term in Theorem 1 is sufficiently small in those cases. The accuracy of the diffusion approximation and square-root type rules for the M/M/c model has previously been investigated numerically by Borst et al. (2004) and analytically by Janssen et al. (2011) in the context of call center staffing. Both found that the first-order diffusion approximation is extremely accurate even for a small $R$. However, both of these studies have focused on the approximation and staffing problems with respect to the delay probability, namely, approximating $D(c, c, R) = C(c, R)$. To the best of our knowledge, no prior study has evaluated the accuracy of the diffusion approximation for the whole queue-length distribution, that is, the approximation of $D(R + \beta\sqrt{R}, R + b\sqrt{R}, R)$ by $D_*(\beta, b)$ for an arbitrary set of $(\beta, b)$ values. This is the focus of the present section. The following theorem states a corrected diffusion approximation, which explicitly characterizes the order $O(R^{-1/2})$ term in addition to $D_*(\beta, b)$. Its full proof is included in Appendix D.
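For the delay probability alone, the accuracy reported in the earlier studies is easy to reproduce in a few lines. The sketch below compares the classical Erlang C delay probability of an M/M/c queue with $C_*(\beta)$ for $\beta = (c - R)/\sqrt{R}$. It assumes an integer number of servers and uses the standard Erlang B recursion, neither of which is taken from this paper; the function names and the parameter values are illustrative.

```python
import math


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def C_star(beta):
    """Diffusion approximation of the delay probability."""
    return 1.0 / (1.0 + beta * Phi(beta) / phi(beta))


def erlang_c(c, R):
    """Exact M/M/c delay probability for integer c > R (offered load R)."""
    blocking = 1.0                       # Erlang B with zero servers
    for k in range(1, c + 1):            # standard Erlang B recursion
        blocking = R * blocking / (k + R * blocking)
    return c * blocking / (c - R * (1.0 - blocking))


if __name__ == "__main__":
    for R, c in [(10, 13), (50, 57), (100, 110), (400, 420)]:
        beta = (c - R) / math.sqrt(R)
        print(R, c, round(erlang_c(c, R), 4), round(C_star(beta), 4))
```

The gap between the two columns shrinks as $R$ grows with $\beta$ held roughly fixed, consistent with a correction of order $R^{-1/2}$.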


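Theorem 3. For any $\beta > 0$ and $b \in (-\infty, \infty)$,

$$D(R + \beta\sqrt{R},\ R + b\sqrt{R},\ R) = D_*(\beta, b) + D_\bullet(\beta, b)\,R^{-1/2} + O(R^{-1}), \tag{39}$$

where

$$D_\bullet(\beta, b) = \begin{cases} [1 - C_*(\beta)]\,g_1(\beta, b) - C_\bullet(\beta)\,\Phi(b)\,\Phi(\beta)^{-1}, & \text{if } b \le \beta, \\ [1 - C_*(\beta)]\,g_2(\beta, b) - C_\bullet(\beta)\left[1 + \left(1 - e^{-\beta(b-\beta)}\right)\beta\varphi(\beta)\Phi(\beta)^{-1}\right], & \text{if } b > \beta, \end{cases} \tag{40}$$

$$g_1(\beta, b) = \frac{1}{6\Phi(\beta)}\left[\frac{\Phi(b)\,\varphi(\beta)\,(\beta^2 + 2)}{\Phi(\beta)} - \varphi(b)\,(b^2 - 4)\right], \tag{41}$$

$$g_2(\beta, b) = \left[1 - e^{-\beta(b-\beta)}\right] g_3(\beta)\,\beta^{-1} - \frac{\varphi(\beta)}{\beta\Phi(\beta)}\left[\frac{1}{2}\beta^2(b - \beta) - \beta\right] e^{-\beta(b-\beta)}, \tag{42}$$

$$g_3(\beta) = \frac{1}{2}\beta\,\frac{\varphi(\beta)}{\Phi(\beta)} + \frac{1}{6}\beta^2\left(\frac{\varphi(\beta)}{\Phi(\beta)}\right)^2 + \frac{1}{6}\beta^3\,\frac{\varphi(\beta)}{\Phi(\beta)} + \frac{1}{3}\left(\frac{\varphi(\beta)}{\Phi(\beta)}\right)^2, \tag{43}$$

$$C_\bullet(\beta) = \beta\,C_*(\beta)^2\left[\frac{1}{3} + \frac{\beta^2}{6} + \frac{\Phi(\beta)}{\varphi(\beta)}\left(\frac{\beta}{2} + \frac{\beta^3}{6}\right)\right]. \tag{44}$$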

In light of Theorem 3, it makes sense to assess the accuracy of the first-order diffusion approximation by examining the magnitude of $r_d(\beta, b) := D_\bullet(\beta, b)/D_*(\beta, b)$. The greater $|r_d(\beta, b)|$ is, the more significant the second-order term is relative to the first one in the above series expansion. We find through extensive numerical experiments that it is helpful to divide the parameter space into three regions: $-2.5 \le b \le 2.5$, $b > 2.5$, and $b < -2.5$.

For $-2.5 \le b \le 2.5$, $|r_d(\beta, b)|$ is at most 0.76 or so; see the 3-D plot in Figure 5. This means that in this parameter region, for a moderate $R = 50$, the relative error of the diffusion approximation is no more than about 10% ($\approx 0.76/\sqrt{50}$). For $b > 2.5$, $|r_d(\beta, b)|$ is even smaller, as illustrated by Figures 6 and 7. However, for $b < -2.5$, $|r_d(\beta, b)|$ can become large; Figure 8 shows that $r_d(\beta, b)$ decreases to $-\infty$ as $b$ becomes smaller. These numerical results show that $s \ll R$ is the only parameter setting at which the diffusion approximation may be inaccurate. From an optimization point of view, this occurs when the finished-goods inventory holding cost $h$ is extremely high.

6. Extension to Partially Backlogged Demand

In this section, we consider an extension of our basic model studied in the previous two sections. We assume that the system manager may turn away customer orders if the backorder level is too high. In this case, the system manager must determine the maximum permissible backorder level, in addition to the capacity and base stock levels. As commented on in Remark 1, this adds a third independent control on the degree to which the system manager may apply the operational lever of ‘time’ (i.e., meeting demand later than it occurs) to hedge against variabilities.

[Figures 5 to 8 omitted: 3-D surface plots of $r_d(\beta, b)$ over the $(\beta, b)$ regions given in the captions.]

Figure 5: $r_d(\beta, b)$ for $(\beta, b) \in (0, 10) \times (-2.5, 2.5)$.

Figure 6: $r_d(\beta, b)$ for $(\beta, b) \in (0, 1) \times (2.5, 5)$.

Figure 7: $r_d(\beta, b)$ for $(\beta, b) \in (1, 6) \times (2.5, 5)$.

Figure 8: $r_d(\beta, b)$ for $(\beta, b) \in (0, 10) \times (-100, -2.5)$.

Specifically, as in our original model, customer orders arrive to the system according to a Poisson process with rate λ > 0 and are fulfilled from the finished-goods inventory, which is managed under a base-stock policy with base-stock level s. The production facility consists of c servers in parallel and each order's processing time has an exponential distribution with mean 1/µ. At most r ∈ [0,∞) units of backorders are allowed, and any customer order that arrives when there are r customer orders backlogged is lost. The single-server version of this model has been studied by Caldentey (2001), Kouikoglou and Phillis (2002), and Ioannidis and Kouikoglou (2008), where the model is called a single-stage CONWIP (constant work-in-process) production system. In all three of these papers, the capacity is given, i.e., a single server with a given production rate; their objective is to optimize the choice of s and r.

In our model, the shortfall process, namely, $\{Z_{s+r}(t), t \ge 0\}$, is identical to the number-of-customers process in an M/M/c/(s + r) queue with arrival rate λ and service rate µ, where s + r indicates the maximum number of customers allowed in the queueing system. Also, without loss of generality, we assume c ≤ s + r, because if c > s + r, then c − (s + r) of the servers can be idled at all times without affecting the system dynamics and so the system manager would want to simply install s + r servers instead, which reduces this case to that of c = s + r.

Now note that the shortfall process $\{Z_{s+r}(t), t \ge 0\}$ is simply a truncation to the set $\{0, 1, \ldots, s+r\}$ of the birth-death process $\{Q(t), t \ge 0\}$ defined by expression (1). We therefore have (e.g., according to Proposition 5.6.3 in Ross (1996)) that

\[
Z_{s+r}(\infty) \stackrel{d}{=} Q(\infty) \mid Q(\infty) \le s+r, \tag{45}
\]

where $\stackrel{d}{=}$ means "equal in distribution to." Now denote by l the penalty cost for each unit of lost sale. The average cost objective function is then given by

\[
\begin{aligned}
T(c, s, r, R) ={}& d \cdot c + w \cdot \mathbb{E}_c[Z_{s+r}(\infty)] + h \cdot \mathbb{E}_c[(s - Z_{s+r}(\infty))^+] \\
& + p \cdot \mathbb{E}_c[(Z_{s+r}(\infty) - s)^+] + l \cdot \lambda \cdot \mathbb{P}_c\{Z_{s+r}(\infty) = s+r\},
\end{aligned} \tag{46}
\]

where the last term in (46) represents the long-run average penalty rate for lost sales. We further write

\[
T(c, s, r, R) = (d + w) \cdot R + R^{1/2} \cdot K_T(c, s, r, R),
\]

with

\[
\begin{aligned}
K_T(c, s, r, R) :={}& d \cdot \frac{c - R}{\sqrt{R}} + w \cdot \frac{\mathbb{E}_c[Z_{s+r}(\infty)] - R}{\sqrt{R}} + h \cdot \frac{\mathbb{E}_c[(s - Z_{s+r}(\infty))^+]}{\sqrt{R}} \\
& + p \cdot \frac{\mathbb{E}_c[(Z_{s+r}(\infty) - s)^+]}{\sqrt{R}} + l \cdot \lambda \cdot \frac{\mathbb{P}_c\{Z_{s+r}(\infty) = s+r\}}{\sqrt{R}}.
\end{aligned} \tag{47}
\]

Our goal now is to choose the 3-tuple (c, s, r) that minimizes T(c, s, r, R) or, equivalently, minimizes $K_T(c, s, r, R)$.
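For small instances, the objective (46) can be evaluated exactly from the stationary distribution of the truncated birth-death chain, which gives a baseline against which any approximation can be compared. The following is a minimal sketch under stated assumptions: the function names are hypothetical, all decision variables are integers, and the exhaustive search is only meant for small search ranges.

```python
def truncated_mmc_pmf(lam, mu, c, cap):
    """Stationary pmf of the M/M/c/cap birth-death chain on {0, ..., cap}."""
    weights = [1.0]
    for n in range(1, cap + 1):
        # Birth rate lam, death rate mu * min(n, c).
        weights.append(weights[-1] * lam / (mu * min(n, c)))
    total = sum(weights)
    return [wt / total for wt in weights]


def average_cost(c, s, r, lam, mu, d, w, h, p, l):
    """Evaluate T(c, s, r, R) of (46) for integers c, s, r with c <= s + r."""
    pmf = truncated_mmc_pmf(lam, mu, c, s + r)
    mean_shortfall = sum(n * q for n, q in enumerate(pmf))
    mean_inventory = sum(max(s - n, 0) * q for n, q in enumerate(pmf))
    mean_backorder = sum(max(n - s, 0) * q for n, q in enumerate(pmf))
    loss_rate = lam * pmf[-1]  # orders arriving in state s + r are lost
    return (d * c + w * mean_shortfall + h * mean_inventory
            + p * mean_backorder + l * loss_rate)


def brute_force_optimum(lam, mu, d, w, h, p, l, max_level):
    """Exhaustive search over integers with 1 <= c <= s + r <= max_level."""
    best = None
    for cap in range(1, max_level + 1):
        for c in range(1, cap + 1):
            for s in range(0, cap + 1):
                cost = average_cost(c, s, cap - s, lam, mu, d, w, h, p, l)
                if best is None or cost < best[0]:
                    best = (cost, c, s, cap - s)
    return best  # (cost, c, s, r)
```

The search enumerates the total buffer s + r first, which mirrors the reduction to c ≤ s + r made above.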


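First note that

\[
\mathbb{E}_c[(Z_{s+r}(\infty) - s)^+] = \mathbb{E}_c[Z_{s+r}(\infty)] - s + \mathbb{E}_c[(s - Z_{s+r}(\infty))^+], \tag{48}
\]

and it follows from (45) that

\[
\mathbb{E}_c[(s - Z_{s+r}(\infty))^+] = \frac{\mathbb{E}_c[(s - Q(\infty))^+]}{\mathbb{P}_c\{Q(\infty) \le s+r\}}, \tag{49}
\]

\[
\mathbb{E}_c[Z_{s+r}(\infty)] = s + r - \frac{\mathbb{E}_c[(s + r - Q(\infty))^+]}{\mathbb{P}_c\{Q(\infty) \le s+r\}}, \tag{50}
\]

and

\[
\mathbb{P}_c\{Z_{s+r}(\infty) = s+r\} = \frac{\mathbb{P}_c\{Q(\infty) = s+r\}}{\mathbb{P}_c\{Q(\infty) \le s+r\}}. \tag{51}
\]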

Because we have developed diffusion approximations for all of the terms on the right-hand sides of expressions (49)-(51) (or more precisely their continuous extensions to real-valued arguments), a diffusion approximation for $K_T(c, s, r, R)$ can be readily obtained. Specifically, for any $(\beta, b, \nu) \in S_3 := \{(a_1, a_2, a_3) \in (0,\infty) \times (-\infty,\infty) \times [0,\infty) : a_2 + a_3 \ge a_1\}$, the following series expansion result holds:

\[
K_T(R + \beta\sqrt{R},\, R + b\sqrt{R},\, \nu\sqrt{R},\, R) = K_{T,*}(\beta, b, \nu) + O(R^{-1/2}), \tag{52}
\]

where

\[
\begin{aligned}
K_{T,*}(\beta, b, \nu) ={}& d\beta + w \cdot \mathbb{E}_\beta[Z_{b+\nu}] + h \cdot \mathbb{E}_\beta[(b - Z_{b+\nu})^+] \\
& + p \cdot \mathbb{E}_\beta[(Z_{b+\nu} - b)^+] + l \cdot \left(\beta + \mathbb{E}_\beta[\max(0, -Z_{b+\nu})]\right),
\end{aligned} \tag{53}
\]

where $Z_{b+\nu}$ is a random variable with its distribution given by

\[
\mathbb{P}_\beta\{Z_{b+\nu} \le x\} = \frac{D_*(\beta, x)}{D_*(\beta, b+\nu)}, \tag{54}
\]

for x ≤ b + ν. The approximately optimal capacity-inventory-backlog prescription is then given by

\[
(c_*, s_*, r_*) := (R + \beta_*\sqrt{R},\, R + b_*\sqrt{R},\, \nu_*\sqrt{R}), \tag{55}
\]

where $(\beta_*, b_*, \nu_*) := \arg\min_{(\beta, b, \nu) \in S_3} K_{T,*}(\beta, b, \nu)$. Again, as in the analysis of our main model, this approximation based on our asymptotic framework easily leads to an otherwise non-obvious fact: the optimal maximum backlog level should be of the order of $\sqrt{R}$.

In order to solve for $(\beta_*, b_*, \nu_*)$, note that if we fix β > 0 and further fix the sum of b and ν at a certain value y ≥ β, then it can be seen from (53) that the optimal base-stock safety factor b (i.e., the minimizer of $K_{T,*}(\beta, b, y - b)$) must satisfy

\[
\frac{D_*(\beta, b)}{D_*(\beta, y)} = \frac{p}{p + h}. \tag{56}
\]

This optimality characterization corresponds to the critical fractile solution for the exact optimal base-stock level s, given c and s + r, for the objective function T(c, s, r, R),

\[
\mathbb{P}_c\{Z_{s+r}(\infty) \le s\} = \frac{p}{p + h},
\]

because

\[
\mathbb{P}_c\{Z_{s+r}(\infty) \le s\} = \frac{\mathbb{P}_c\{Q(\infty) \le s\}}{\mathbb{P}_c\{Q(\infty) \le s+r\}}, \tag{57}
\]

and the $D_*(\cdot, \cdot)$ function is the diffusion approximation for the CDF of Q(∞). Inverting the left-hand side of (56) as a function of b, we find that the optimal b with fixed β and b + ν = y is simply given by $v_*(\beta, y)$, where

\[
v_*(\beta, y) :=
\begin{cases}
\beta + \dfrac{1}{\beta} \ln\left[\dfrac{C_*(\beta) \cdot (p + h)}{p + h - p D_*(\beta, y)}\right], & \text{if } C_*(\beta) > \dfrac{p + h - p D_*(\beta, y)}{p + h}, \\[2ex]
\Phi^{-1}\left(\dfrac{p D_*(\beta, y) \Phi(\beta)}{(p + h)[1 - C_*(\beta)]}\right), & \text{if } C_*(\beta) \le \dfrac{p + h - p D_*(\beta, y)}{p + h}.
\end{cases} \tag{58}
\]

Therefore, instead of searching for $(\beta_*, b_*, \nu_*)$ in the three-dimensional set $S_3$, one can first find the minimizer of the two-variable function $K_{T,*}(\beta, v_*(\beta, y), y)$, for β > 0 and y ≥ β, say $(\beta_*, y_*)$, and then set $(b_*, \nu_*) := (v_*(\beta_*, y_*),\, y_* - v_*(\beta_*, y_*))$.
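The inversion in (58) can be checked numerically against the optimality condition (56). The sketch below is not taken from the paper: it assumes that C∗(β) is the Halfin-Whitt delay probability [1 + βΦ(β)/ϕ(β)]^{-1} and that D∗(β, x) equals [1 − C∗(β)]Φ(x)/Φ(β) for x ≤ β and 1 − C∗(β)e^{−β(x−β)} for x > β, which is the standard many-server limit and is consistent with (58). All function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf = Phi, pdf = phi


def c_star(beta):
    """Assumed form of C_*(beta): the Halfin-Whitt delay probability."""
    return 1.0 / (1.0 + beta * _STD.cdf(beta) / _STD.pdf(beta))


def d_star(beta, x):
    """Assumed form of D_*(beta, x), the diffusion limit of the shortfall CDF."""
    if x <= beta:
        return (1.0 - c_star(beta)) * _STD.cdf(x) / _STD.cdf(beta)
    return 1.0 - c_star(beta) * math.exp(-beta * (x - beta))


def v_star(beta, y, p, h):
    """Base-stock safety factor from (58) for fixed beta and b + nu = y."""
    cs = c_star(beta)
    threshold = (p + h - p * d_star(beta, y)) / (p + h)
    if cs > threshold:
        # First branch of (58): the solution lies above beta.
        return beta + math.log(cs / threshold) / beta
    # Second branch of (58): the solution lies at or below beta.
    arg = p * d_star(beta, y) * _STD.cdf(beta) / ((p + h) * (1.0 - cs))
    return _STD.inv_cdf(arg)


if __name__ == "__main__":
    for p, h in [(9.0, 1.0), (1.0, 1.0)]:
        b = v_star(1.0, 3.0, p, h)
        print(p, h, b, d_star(1.0, b) / d_star(1.0, 3.0), p / (p + h))
```

With these assumed forms, the ratio printed in the fourth column should reproduce the critical fractile p/(p + h) in both branches.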

Finally, we remark that if we require that r = 0, then no backorder is allowed in the system and we have a pure lost sales model. In this case,

\[
T(c, s, 0, R) = d \cdot c + w \cdot \mathbb{E}_c[Z_s(\infty)] + h \cdot \mathbb{E}_c[s - Z_s(\infty)] + l \cdot \lambda \cdot \mathbb{P}_c\{Z_s(\infty) = s\},
\]

where $0 \le Z_s(\infty) \le s$, and the solution procedure for the joint optimization problem specified above can be simplified since only two decision variables are involved.


References

Abramowitz, M., Stegun, I. A., 1964. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York. URL http://www.math.ucla.edu/~cbm/aands/

Allon, G., Zeevi, A., 2011. A note on the relationship among capacity, pricing, and inventory in a make-to-stock system. Production and Operations Management 20 (1), 143–151.

Angelus, A., Porteus, E., 2002. Simultaneous capacity and production management of short-life-cycle, produce-to-stock goods under stochastic demand. Management Science 48 (3), 399–413.

Ata, B., Rubino, M., 2009. Dynamic control of a make-to-order, parallel-server system with cancellations. Operations Research 57 (1), 94–108.

Bassamboo, A., Randhawa, R. S., Zeevi, A., October 2010. Capacity sizing under parameter uncertainty: Safety staffing principles revisited. Manage. Sci. 56, 1668–1686.

Benjaafar, S., ElHafsi, M., Vericourt, F. d., October 2004. Demand allocation in multiple-product, multiple-facility, make-to-stock systems. Manage. Sci. 50, 1431–1448.

Bitran, G., Haas, E., Hax, A., 1982. Hierarchical production planning: a two-stage system. Operations Research, 232–251.

Borst, S., Mandelbaum, A., Reiman, M., 2004. Dimensioning large call centers. Oper. Res. 52, 17–34.

Bradley, J., Glynn, P., 2002. Managing capacity and inventory jointly in manufacturing systems. Manage. Sci. 48 (2), 273–288.

Brothers, H. J., Knox, J. A., 1998. New closed-form approximations to the logarithmic constant e. Math. Intell. 20 (4), 25–29.

Buyukkaramikli, N., Bertrand, J., van Ooijen, H., 2013a. Flexible hiring in a make to order system with parallel processing units. Annals of Operations Research 209 (1), 159–178.


Buyukkaramikli, N., van Ooijen, H., Bertrand, J., 2013b. Integrating inventory control and capacity management at a maintenance service provider. Annals of Operations Research, 1–22.

Caldentey, R., 2001. Analyzing the Make-to-Stock Queue in Supply Chain and e-Business Settings. Ph.D. Thesis.

Caldentey, R., Wein, L., Jan. 2003. Analysis of a decentralized production-inventory system. Manufacturing & Service Operations Management 5 (1), 1–17.

Durrett, R., 2010. Probability: Theory and examples. Online book. URL http://www.math.duke.edu/~rtd/PTE/PTE4_Jan2010.pdf

Eberly, J., Van Mieghem, J., 1997. Multi-factor dynamic investment under uncertainty. Journal of Economic Theory 75 (2), 345–387.

Glasserman, P., 1997. Bounds and asymptotics for planning critical safety stocks. Operations Research 45 (2), 244–257.

Gross, D., Harris, C. M., 1998. Fundamentals of Queueing Theory (3rd Edition). Wiley-Interscience.

Halfin, S., Whitt, W., 1981. Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29, 567–588.

Hax, A., Meal, H., 1973. Hierarchical integration of production planning and scheduling. Tech. rep., DTIC Document.

Ioannidis, S., Kouikoglou, V., 2008. Revenue management in single-stage conwip production systems. International Journal of Production Research 46 (22), 6513–6532.

Jagers, A., van Doorn, E., 1986. On the continued erlang loss function. Operations Research Letters 5 (1), 43–46.

Janssen, A., van Leeuwaarden, J., Zwart, B., 2008. Corrected asymptotics for a multi-server queue in the Halfin-Whitt regime. Queueing Syst. Theory Appl. 58 (4), 261–301.

Janssen, A., van Leeuwaarden, J., Zwart, B., 2011. Refining square root safety staffing by expanding Erlang C. Oper. Res. 59 (6), 1512–1522.


Jodlbauer, H., Altendorfer, K., 2010. Trade-off between capacity invested and inventory needed. European Journal of Operational Research 203 (1), 118–133.

Kouikoglou, V., Phillis, Y., 2002. Design of product specifications and control policies in a single-stage production system. IIE Transactions 34 (7), 591–600.

Mandelbaum, A., Zeltyn, S., 2009. Staffing many-server queues with impatient customers: constraint satisfaction in call centers. Oper. Res. 57 (5), 1189–1205.

Mincsovics, G., Tan, T., Alp, O., 2009. Integrated capacity and inventory management with capacity acquisition lead times. European Journal of Operational Research 196 (3), 949–958.

Morse, P., 1958. Queues, Inventories, and Maintenance: The Analysis of Operational Systems with Variable Demand and Supply. John Wiley & Sons.

Nahmias, S., 2009. Production and Operations Analysis. McGraw-Hill/Irwin.

Plambeck, E., Zenios, S., May 2003. Incentive efficient control of a make-to-stock production system. Oper. Res. 51, 371–386.

Rao, M., 1976. Optimal capacity expansion with inventory. Operations Research 24 (2), pp. 291–300.

Ross, S., 1996. Stochastic processes. Wiley New York.

Song, J.-S., Zipkin, P., Apr. 1993. Inventory control in a fluctuating demand environment. Oper. Res. 41 (2), 351–370.

Spearman, M., 2014. Of physics and factory physics. Production and Operations Management. Forthcoming.

Tayur, S., 1993. Computing the optimal policy for capacitated inventory models. Stochastic Models 9 (4), 585–598.

Tijms, H. C., 1995. Stochastic Models: An Algorithmic Approach. John Wiley & Sons.


Van Mieghem, J., Rudi, N., Oct. 2002. Newsvendor networks: Inventory management and capacity investment with discretionary activities. Manufacturing & Service Operations Management 4 (4), 313–335.

Wein, L. M., July 1992. Dynamic scheduling of a multiclass make-to-stock queue. Oper. Res. 40, 724–735.

Whitt, W., 1992. Understanding the efficiency of multi-server service systems. Management Science 38 (5), 708–723.

Whitt, W., 2002. The Erlang B and C formulas: Problems and solutions. Class notes. URL http://www.columbia.edu/~ww2040/ErlangBandCFormulas.pdf

Wikipedia, 2014. Incomplete gamma function — Wikipedia, the free encyclopedia. [Online; accessed 2-Aug-2014]. URL http://en.wikipedia.org/wiki/Incomplete_gamma_function

Zan, J., Hasenbein, J., Morton, D., 2013. Staffing large service systems under arrival-rate uncertainty. Preprint.

Zhang, B., van Leeuwaarden, J., Zwart, B., 2012. Staffing call centers with impatient customers: refinements to many-server asymptotics. Operations Research 60 (2), 461–474.

Zhang, B., Zwart, B., 2013. Steady-state analysis for multi-server queues under size-interval task assignment in the quality-driven regime. Mathematics of Operations Research 38 (3), 504–525.

Zipkin, P., 2000. Foundations of Inventory Management. McGraw-Hill.

Appendix A. Preliminary results

In the appendices, we provide the proofs of the results found in the paper. We begin with some preliminary results with regard to the analytic continuation of various performance functions. As discussed in Section 3.2, we consider the continuous extension of each performance function, allowing c and s to take on any non-negative real value. The same approach is taken in Janssen et al. (2011) and Zhang et al. (2012) and is based on results from earlier work such as Jagers and van Doorn (1986).


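First, the continuation of the steady-state shortfall distribution to real-valued arguments is given by the following lemma.

Lemma 2. For any $c \in \mathbb{Z}_+ \cap (R, \infty)$ and $s \in \mathbb{Z}_+$, $D(c, s, R) = \mathbb{P}_c\{Q(\infty) \le s\}$, where D(c, s, R) is defined for all c ∈ (R,∞) and s ∈ [0,∞):

\[
D(c, s, R) :=
\begin{cases}
[1 - C(c, R)] \cdot \dfrac{R^{s-c+1} B(c-1, R)\, \Gamma(c)}{B(s, R)\, \Gamma(s+1)}, & \text{if } s \le c, \\[2ex]
[1 - C(c, R)] \cdot \left[1 + \dfrac{\rho (1 - \rho^{s-c+1}) B(c-1, R)}{1 - \rho}\right], & \text{if } s > c,
\end{cases} \tag{A.1}
\]

with

\[
\begin{aligned}
C(c, R) &:= \left[R \int_0^\infty t e^{-Rt} (1 + t)^{c-1}\, dt\right]^{-1}, && \text{for all } c \in (R, \infty), \\
B(x, R) &:= \left[R \int_0^\infty e^{-Rt} (1 + t)^{x}\, dt\right]^{-1}, && \text{for all } x \in [0, \infty).
\end{aligned} \tag{A.2}
\]

Proof. For any integer c > R, the analytic continuation of the Erlang C formula reads (see Janssen et al. (2011))

\[
\mathbb{P}_c\{Q(\infty) \ge c\} = C(c, R), \tag{A.3}
\]

and therefore

\[
\mathbb{P}_c\{Q(\infty) \le c - 1\} = 1 - C(c, R). \tag{A.4}
\]

For any $s \in \mathbb{Z}_+ \cap [0, c]$, it follows from the exact formula for the distribution of Q(∞) that

\[
\frac{\mathbb{P}_c\{Q(\infty) \le s\}}{\mathbb{P}_c\{Q(\infty) \le c - 1\}} = \frac{\sum_{i=0}^{s} R^i / i!}{\sum_{i=0}^{c-1} R^i / i!} = \frac{R^{s-c+1} B(c-1, R)\, \Gamma(c)}{B(s, R)\, \Gamma(s+1)}. \tag{A.5}
\]

Multiplying (A.4) by (A.5) yields the desired result for s ≤ c. In the case of s > c, we have that

\[
\begin{aligned}
\frac{\mathbb{P}_c\{Q(\infty) \le s\}}{\mathbb{P}_c\{Q(\infty) \le c - 1\}}
&= 1 + \frac{\sum_{i=c}^{s} \mathbb{P}_c\{Q(\infty) = i\}}{\mathbb{P}_c\{Q(\infty) \le c - 1\}}
= 1 + \frac{c^c \sum_{i=c}^{s} \rho^i / c!}{\sum_{i=0}^{c-1} R^i / i!} \\
&= 1 + \frac{\rho (1 - \rho^{s-c+1}) B(c-1, R)}{1 - \rho},
\end{aligned} \tag{A.6}
\]

which combined with (A.4) completes the proof.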
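At integer arguments, the closed form (A.1) can be compared with a direct summation of the M/M/c stationary probabilities. The sketch below assumes integer c > R and integer s ≥ 0, uses the classical Erlang B recursion and Erlang C formula in place of the integral representations in (A.2), and takes ρ = R/c. The function names are hypothetical.

```python
import math


def erlang_b(n, load):
    """Erlang B blocking probability B(n, R) for integer n >= 0."""
    b = 1.0
    for k in range(1, n + 1):
        b = load * b / (k + load * b)
    return b


def erlang_c(c, load):
    """Erlang C delay probability C(c, R) for integer c > R."""
    b = erlang_b(c, load)
    return c * b / (c - load * (1.0 - b))


def shortfall_cdf(c, s, load):
    """D(c, s, R) of (A.1) at integer c > R and integer s >= 0."""
    rho = load / c
    no_wait = 1.0 - erlang_c(c, load)  # P{Q <= c - 1}, see (A.4)
    if s <= c:
        # R^{s-c+1} Gamma(c) / Gamma(s+1), computed in log space.
        log_ratio = ((s - c + 1) * math.log(load)
                     + math.lgamma(c) - math.lgamma(s + 1))
        return (no_wait * math.exp(log_ratio)
                * erlang_b(c - 1, load) / erlang_b(s, load))
    tail = rho * (1.0 - rho ** (s - c + 1)) * erlang_b(c - 1, load) / (1.0 - rho)
    return no_wait * (1.0 + tail)


def direct_cdf(c, s, load):
    """P{Q <= s} summed directly from the M/M/c birth-death weights."""
    rho = load / c
    weights = [1.0]
    for n in range(1, max(c, s) + 1):
        weights.append(weights[-1] * load / min(n, c))
    # States 0..c-1 plus the geometric tail that starts at state c.
    norm = sum(weights[:c]) + weights[c] / (1.0 - rho)
    return sum(weights[: s + 1]) / norm
```

For example, with c = 2 and R = 1 both functions give 1/3 at s = 0 and 11/12 at s = 3.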


We next show that a recursive relation that is known to hold for the Erlang B loss formula (see Whitt (2002)) remains valid for its analytic continuation.

Lemma 3. For any x > 0,

\[
\frac{1}{B(x, R)} = \frac{R}{(x + 1) B(x + 1, R)} - \frac{R}{x + 1}.
\]

Proof. The recursion follows from

\[
B(x, R) = \frac{e^{-R} R^{x}}{\Gamma(x + 1, R)}, \quad \text{for all } x > 0,
\]

and a recursive relation for the incomplete gamma function (see Wikipedia (2014))

\[
\Gamma(x + 1, R) = x\, \Gamma(x, R) + R^{x} e^{-R}.
\]
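The recursion of Lemma 3 can be checked at non-integer x by evaluating the integral representation of B(x, R) in (A.2) numerically. The sketch below uses a plain composite Simpson rule on a truncated interval; the truncation point and step count are illustrative assumptions that are adequate for R ≥ 1 and moderate x, and the function name is hypothetical.

```python
import math


def continued_erlang_b(x, load, upper=40.0, steps=40000):
    """B(x, R) of (A.2) for real x >= 0 via composite Simpson integration.

    The integral over [0, infinity) is truncated at `upper`; `steps` must be even.
    """
    def integrand(t):
        return math.exp(-load * t + x * math.log1p(t))

    width = upper / steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * width)
    integral = total * width / 3.0
    return 1.0 / (load * integral)


if __name__ == "__main__":
    x, load = 2.5, 3.0
    lhs = 1.0 / continued_erlang_b(x, load)
    rhs = (load / ((x + 1.0) * continued_erlang_b(x + 1.0, load))
           - load / (x + 1.0))
    print(lhs, rhs)  # the two sides of Lemma 3
```

At integer x the same function should agree with the classical Erlang B formula, which gives an independent sanity check of the quadrature settings.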

Our third preliminary result is the continuation for the two performance functions that determine the inventory holding cost and backorder cost.

Lemma 4. If c ≥ s, $\mathbb{E}_c[(s - Q(\infty))^+] = L_1(c, s, R)$, and if c ≤ s, $\mathbb{E}_c[(Q(\infty) - s)^+] = L_2(c, s, R)$, where

\[
L_1(c, s, R) := D(c, s, R) \cdot [s - R(1 - B(s, R))], \tag{A.7}
\]

\[
L_2(c, s, R) := [1 - C(c, R)] \cdot B(c - 1, R) \cdot \rho^{s-c} \cdot \frac{\rho^2}{(1 - \rho)^2}, \tag{A.8}
\]

and both functions are defined for all c ∈ (R,∞) and s ∈ [0,∞).

Proof. First, by examining the state transition rates in the underlying birth-death process of the total number of customers in the M/M/c queue, we observe that, if c ≥ s, $Q(\infty) \mid Q(\infty) \le s$ is equal in distribution to $Q_B$, where $Q_B$ denotes the steady-state number of customers in the M/M/s/s loss queue with offered load R, and if c ≤ s, $Q(\infty) - s \mid Q(\infty) \ge s$ is equal in distribution to $Q_{M/M/1}$, where $Q_{M/M/1}$ denotes the steady-state number of customers in the M/M/1 queue with traffic intensity ρ.

If c ≥ s, we condition to obtain that

\[
\mathbb{E}_c[(s - Q(\infty))^+] = \mathbb{P}_c\{Q(\infty) \le s\} \cdot \left(s - \mathbb{E}_c[Q(\infty) \mid Q(\infty) \le s]\right). \tag{A.9}
\]

Substituting the formula for the expected steady-state queue length in the M/M/s/s queue

\[
\mathbb{E}[Q_B] = R(1 - B(s, R)) \tag{A.10}
\]

into (A.9) then yields (A.7). In the case of c ≤ s,

\[
\begin{aligned}
\mathbb{E}_c[(Q(\infty) - s)^+] &= \mathbb{P}_c\{Q(\infty) \ge s\} \cdot \mathbb{E}_c[Q(\infty) - s \mid Q(\infty) \ge s] \\
&= \mathbb{P}_c\{Q(\infty) \ge s\} \cdot \mathbb{E}[Q_{M/M/1}] \\
&= \frac{\mathbb{P}_c\{Q(\infty) = c\}\, \rho^{s-c}}{1 - \rho} \cdot \frac{\rho}{1 - \rho} \\
&= L_2(c, s, R),
\end{aligned} \tag{A.11}
\]

because

\[
\mathbb{P}_c\{Q(\infty) = c\} = [1 - C(c, R)] \cdot B(c - 1, R)\, \rho,
\]

due to

\[
\frac{\mathbb{P}_c\{Q(\infty) = c\}}{\mathbb{P}_c\{Q(\infty) \le c - 1\}} = \frac{R^c / c!}{\sum_{i=0}^{c-1} R^i / i!} = B(c - 1, R)\, \rho.
\]

We further note that

\[
\mathbb{E}_c[(Q(\infty) - s)^+] - \mathbb{E}_c[(s - Q(\infty))^+] = \mathbb{E}_c[Q(\infty)] - s.
\]

Also, since

\[
\mathbb{E}_c[Q(\infty)] = C(c, R) \cdot \frac{\rho}{1 - \rho} + R,
\]

we then have

\[
\mathbb{E}_c[(Q(\infty) - s)^+] - \mathbb{E}_c[(s - Q(\infty))^+] = C(c, R) \cdot \frac{\rho}{1 - \rho} + R - s. \tag{A.12}
\]

Therefore, combining (A.12) and Lemma 4, we can have the continuation function for $\mathbb{E}_c[(s - Q(\infty))^+]$ and $\mathbb{E}_c[(Q(\infty) - s)^+]$, regardless of whether c is greater than or less than s, and this eventually leads to the continued expression of Ω(c, s, R).
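The two closed forms (A.7) and (A.8) can be evaluated at integer arguments and compared with direct sums over the M/M/c stationary distribution. The sketch below assumes integer c > R, takes ρ = R/c, and uses the fact stated in Lemma 2 that D(c, s, R) equals the CDF of Q(∞) at integer s. The function names are hypothetical.

```python
def erlang_b(n, load):
    """Erlang B blocking probability B(n, R) for integer n >= 0."""
    b = 1.0
    for k in range(1, n + 1):
        b = load * b / (k + load * b)
    return b


def mmc_pmf(c, load, n_max):
    """Stationary probabilities of the M/M/c queue for states 0..n_max (n_max >= c)."""
    rho = load / c
    weights = [1.0]
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * load / min(n, c))
    # Exact normalisation: states 0..c-1 plus the geometric tail from state c.
    norm = sum(weights[:c]) + weights[c] / (1.0 - rho)
    return [wt / norm for wt in weights]


def expected_inventory(c, s, load):
    """L1(c, s, R) of (A.7), for integer s <= c."""
    pmf = mmc_pmf(c, load, c)
    cdf_at_s = sum(pmf[: s + 1])  # equals D(c, s, R) by Lemma 2
    return cdf_at_s * (s - load * (1.0 - erlang_b(s, load)))


def expected_backorders(c, s, load):
    """L2(c, s, R) of (A.8), for integer s >= c."""
    rho = load / c
    pmf = mmc_pmf(c, load, c)
    no_wait = sum(pmf[:c])  # equals 1 - C(c, R) by (A.4)
    return (no_wait * erlang_b(c - 1, load)
            * rho ** (s - c) * rho ** 2 / (1.0 - rho) ** 2)
```

For c = 2 and R = 1, these give an expected inventory of 1/3 at s = 1 and an expected backorder level of 1/6 at s = 3, which can be confirmed by hand from the stationary probabilities.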

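Proposition 2. For any $c \in \mathbb{Z}_+ \cap (R, \infty)$ and $s \in \mathbb{Z}_+$, Ω(c, s, R) = Π(c, s, R), where

\[
\Pi(c, s, R) := (d + w) R + R^{1/2} K(c, s, R), \tag{A.13}
\]

\[
K(c, s, R) :=
\begin{cases}
d \cdot \dfrac{c - R}{\sqrt{R}} - p \cdot \dfrac{s - R}{\sqrt{R}} + \dfrac{(w + p)\sqrt{R}}{c - R} \cdot C(c, R) + (p + h) L_1(c, s, R) \cdot R^{-1/2}, & \text{if } s \le c, \\[2ex]
d \cdot \dfrac{c - R}{\sqrt{R}} + h \cdot \dfrac{s - R}{\sqrt{R}} + \dfrac{(w - h)\sqrt{R}}{c - R} \cdot C(c, R) + (p + h) L_2(c, s, R) \cdot R^{-1/2}, & \text{if } s > c.
\end{cases} \tag{A.14}
\]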


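Proof. The representation simply follows by applying (A.12) and Lemma 4 to (4). We provide the details below.

\[
\Pi(c, s, R) =
\begin{cases}
d \cdot c + (w + p) \cdot \mathbb{E}_c[Q(\infty)] + (p + h) \cdot L_1(c, s, R) - p \cdot s, & \text{if } s \le c, \\
d \cdot c + (w - h) \cdot \mathbb{E}_c[Q(\infty)] + (p + h) \cdot L_2(c, s, R) + h \cdot s, & \text{if } s > c.
\end{cases} \tag{A.15}
\]

If s ≤ c,

\[
\begin{aligned}
\Pi(c, s, R) ={}& d \cdot \left(R + \frac{c - R}{\sqrt{R}} R^{1/2}\right) + (w + p) \cdot \left(C(c, R) \cdot \frac{\rho}{1 - \rho} + R\right) \\
& + (p + h) \cdot L_1(c, s, R) - p \cdot \left(R + \frac{s - R}{\sqrt{R}} R^{1/2}\right) \\
={}& (d + w) R + \left(d \cdot \frac{c - R}{\sqrt{R}} - p \cdot \frac{s - R}{\sqrt{R}}\right) R^{1/2} \\
& + \frac{(w + p)\sqrt{R}}{c - R} \cdot C(c, R)\, R^{1/2} + (p + h) \cdot L_1(c, s, R) \\
={}& (d + w) R + R^{1/2} \left[d \cdot \frac{c - R}{\sqrt{R}} - p \cdot \frac{s - R}{\sqrt{R}} \right. \\
& \left. + \frac{(w + p)\sqrt{R}}{c - R} \cdot C(c, R) + (p + h) L_1(c, s, R) \cdot R^{-1/2}\right],
\end{aligned} \tag{A.16}
\]

and if s > c,

\[
\begin{aligned}
\Pi(c, s, R) ={}& d \cdot \left(R + \frac{c - R}{\sqrt{R}} R^{1/2}\right) + (w - h) \cdot \left(C(c, R) \cdot \frac{\rho}{1 - \rho} + R\right) \\
& + (p + h) \cdot L_2(c, s, R) + h \cdot \left(R + \frac{s - R}{\sqrt{R}} R^{1/2}\right) \\
={}& (d + w) R + \left(d \cdot \frac{c - R}{\sqrt{R}} + h \cdot \frac{s - R}{\sqrt{R}}\right) R^{1/2} \\
& + \frac{(w - h)\sqrt{R}}{c - R} \cdot C(c, R)\, R^{1/2} + (p + h) \cdot L_2(c, s, R) \\
={}& (d + w) R + R^{1/2} \left[d \cdot \frac{c - R}{\sqrt{R}} + h \cdot \frac{s - R}{\sqrt{R}} \right. \\
& \left. + \frac{(w - h)\sqrt{R}}{c - R} \cdot C(c, R) + (p + h) L_2(c, s, R) \cdot R^{-1/2}\right].
\end{aligned} \tag{A.17}
\]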


Appendix B. Diffusion approximation for the average cost function

In this section, we provide the proof of Theorem 1.

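Proof (Theorem 1.). Throughout the proof, let $c := R + \beta\sqrt{R}$ and $s := R + b\sqrt{R}$. Due to Proposition 2, the desired result is equivalent to

\[
K(c, s, R) = K_*(\beta, b) + O(R^{-1/2}). \tag{B.1}
\]

First, Janssen et al. (2011) show that

\[
C(c, R) = C_*(\beta) + O(R^{-1/2}), \tag{B.2}
\]

and

\[
B(s, R)^{-1} = \frac{\Phi(\alpha_s)}{\phi(\alpha_s)}\, s^{1/2} + O(1), \tag{B.3}
\]

where

\[
\alpha_s = \sqrt{-2s\left(1 - R/s + \ln(R/s)\right)}, \quad \operatorname{sign}(\alpha_s) = \operatorname{sign}(1 - R/s), \tag{B.4}
\]

a simple function of R and s with $\alpha_s \to b$ as $s \to \infty$. Inverting (B.3) yields

\[
B(s, R) = \frac{\phi(\alpha_s)}{\Phi(\alpha_s)}\, s^{-1/2} + O(s^{-1}). \tag{B.5}
\]

Simple calculations show that

\[
s^{-1/2} = R^{-1/2} + O(R^{-1}), \tag{B.6}
\]

and

\[
\frac{\phi(\alpha_s)}{\Phi(\alpha_s)} = \frac{\phi(b)}{\Phi(b)} + O(R^{-1/2}). \tag{B.7}
\]

Applying (B.6) and (B.7) to (B.5), we have that

\[
B(s, R) = \frac{\phi(b)}{\Phi(b)}\, R^{-1/2} + O(R^{-1}). \tag{B.8}
\]

We then apply (B.8) and Proposition 1 to (A.7) and get that

\[
\begin{aligned}
L_1(c, s, R) &= D(c, s, R) \cdot [b R^{1/2} + R \cdot B(s, R)] && \text{(B.9)} \\
&= D_*(\beta, b) \left[b + \frac{\phi(b)}{\Phi(b)}\right] R^{1/2} + O(1). && \text{(B.10)}
\end{aligned}
\]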


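On the other hand, expression (B.8) implies

\[
B(c - 1, R) = \frac{\phi(\beta)}{\Phi(\beta)}\, R^{-1/2} + O(R^{-1}). \tag{B.11}
\]

Also,

\[
\rho^{s-c} = \left[1 + \beta R^{-1/2}\right]^{(\beta - b)\sqrt{R}} = e^{-\beta(b - \beta)} + O(R^{-1/2}). \tag{B.12}
\]

Applying (B.11), (B.12), (B.2), and $\rho(1 - \rho)^{-1} = \beta^{-1} R^{1/2}$ to (A.8), we can obtain that

\[
L_2(c, s, R) = [1 - C_*(\beta)]\, \frac{\phi(\beta)}{\beta^2 \Phi(\beta)} \cdot e^{-\beta(b - \beta)} R^{1/2} + O(1). \tag{B.13}
\]

Finally, substituting (B.2), (B.10), and (B.13) into (A.16) and (A.17) leads to (B.1).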
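The leading-order statements (B.2) and (B.8) can be illustrated numerically by comparing the classical Erlang B and Erlang C values with their limits as R grows. The sketch below is an illustration, not part of the proof: it assumes that C∗(β) is the Halfin-Whitt function [1 + βΦ(β)/ϕ(β)]^{-1}, rounds c and s to integers so that the classical formulas apply, and uses hypothetical function names.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf = Phi, pdf = phi


def erlang_b(n, load):
    """Erlang B blocking probability B(n, R) for integer n >= 0."""
    b = 1.0
    for k in range(1, n + 1):
        b = load * b / (k + load * b)
    return b


def erlang_c(c, load):
    """Erlang C delay probability C(c, R) for integer c > R."""
    b = erlang_b(c, load)
    return c * b / (c - load * (1.0 - b))


def halfin_whitt(beta):
    """Assumed closed form of C_*(beta)."""
    return 1.0 / (1.0 + beta * _STD.cdf(beta) / _STD.pdf(beta))


def approximation_errors(load, beta, b):
    """Absolute errors of the leading-order terms in (B.2) and (B.8)."""
    root = math.sqrt(load)
    c = round(load + beta * root)
    s = round(load + b * root)
    err_c = abs(erlang_c(c, load) - halfin_whitt(beta))
    err_b = abs(erlang_b(s, load) * root - _STD.pdf(b) / _STD.cdf(b))
    return err_c, err_b


if __name__ == "__main__":
    for load in (100.0, 10000.0):
        print(load, approximation_errors(load, beta=1.0, b=0.5))
```

Under these assumptions both errors should shrink roughly in proportion to R^{-1/2} as the offered load increases, in line with the stated error orders.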

Appendix C. Asymptotic optimality of square-root rule

This section is mainly devoted to the proof of Theorem 2. We also provide the proof of Lemma 1.

Proof (Theorem 2.). Since the cost function Π is continuous, the asymptotic optimality of the cost objective function (20) follows directly from that of the decision variables (19). Therefore, it suffices to prove (19). Let $\beta_{opt} = (c_{opt} - R) R^{-1/2}$ and $b_{opt} = (s_{opt} - R) R^{-1/2}$. The desired result is equivalent to

\[
(\beta_*, b_*) \sim (\beta_{opt}, b_{opt}). \tag{C.1}
\]

We first note that by Proposition 2, the exact optimal solution pair $(c_{opt}, s_{opt})$ must satisfy

\[
\left(\frac{\partial K(c_{opt}, s_{opt}, R)}{\partial c},\, \frac{\partial K(c_{opt}, s_{opt}, R)}{\partial s}\right) = (0, 0), \tag{C.2}
\]

in which the second component is equivalent to

\[
D(c_{opt}, s_{opt}, R) = p/(p + h). \tag{C.3}
\]

From (C.2), (C.3), (B.1), and Proposition 1, we have that

\[
\frac{p}{p + h} = D(R + \beta_{opt}\sqrt{R},\, R + b_{opt}\sqrt{R},\, R) = D_*(\beta_{opt}, b_{opt}) + O(R^{-1/2}), \tag{C.4}
\]

and

\[
0 = \frac{\partial K(R + \beta_{opt}\sqrt{R},\, R + b_{opt}\sqrt{R},\, R)}{\partial \beta} = \frac{\partial K_*(\beta_{opt}, b_{opt})}{\partial \beta} + O(R^{-1/2}). \tag{C.5}
\]

Let $g_{1,*}(R) := \beta_{opt} - \beta_*$ and $g_{2,*}(R) := b_{opt} - b_*$. Then applying a first-order Taylor expansion to (C.4), we obtain that

\[
\frac{p}{p + h} = D_*(\beta_*, b_*) + O(\max\{g_{1,*}(R), g_{2,*}(R)\}) + O(R^{-1/2}). \tag{C.6}
\]

Finally, substituting $D_*(\beta_*, b_*) = \frac{p}{p + h}$ into (C.6), we find that $\max\{g_{1,*}(R), g_{2,*}(R)\} = O(R^{-1/2})$, and therefore (C.1) holds.

Proof (Lemma 1.). The first part of the lemma simply follows by applying to (10) the result $\lim_{\beta \to \infty} \beta[\phi(\beta) + \beta\Phi(\beta)]^{-1} = 1$, which can be easily verified algebraically. For the second part, setting the expression (10) for β < b equal to δ, we solve for b and obtain that

\[
b_\delta(\beta) = \frac{\ln(C_*(\beta)) - \ln(1 - \delta)}{\beta} + \beta. \tag{C.7}
\]

Then note that

\[
C_*(\beta) = 1 - \frac{\sqrt{2\pi}}{2}\, \beta + o(\beta), \tag{C.8}
\]

where a function f(β) is said to be o(β) if $\lim_{\beta \to 0} f(\beta)/\beta = 0$. Finally, the second part of the lemma follows after first applying (C.8) to (C.7) and then subsequently using the fact that ln(1 + x) = x + o(x) for |x| < 1 (with x corresponding to $-\frac{\sqrt{2\pi}}{2}\beta + o(\beta)$ in this case).
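Both steps of this argument can be checked numerically. The sketch below rests on two assumptions that are not stated in this section: that C∗(β) is the Halfin-Whitt function [1 + βΦ(β)/ϕ(β)]^{-1}, and that expression (10) for β < b has the form 1 − C∗(β)e^{−β(b−β)}, which is the form that makes (C.7) its inverse. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf = Phi, pdf = phi


def c_star(beta):
    """Assumed closed form of C_*(beta)."""
    return 1.0 / (1.0 + beta * _STD.cdf(beta) / _STD.pdf(beta))


def b_delta(beta, delta):
    """Safety factor from (C.7); relevant when the result exceeds beta."""
    return (math.log(c_star(beta)) - math.log(1.0 - delta)) / beta + beta


def service_level(beta, b):
    """Assumed form of expression (10) for b > beta."""
    return 1.0 - c_star(beta) * math.exp(-beta * (b - beta))


def small_beta_slope(beta):
    """(C_*(beta) - 1) / beta, which (C.8) predicts tends to -sqrt(2 pi) / 2."""
    return (c_star(beta) - 1.0) / beta


if __name__ == "__main__":
    b = b_delta(0.5, 0.9)
    print(b, service_level(0.5, b))                 # second value recovers delta
    print(small_beta_slope(1e-4), -math.sqrt(2.0 * math.pi) / 2.0)
```

The first line confirms that (C.7) inverts the assumed expression, and the second compares the finite-difference slope of C∗ near zero with the coefficient in (C.8).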

Appendix D. Corrected diffusion approximation

In this section we prove Theorem 3, i.e., the corrected diffusion approximation for the steady-state shortfall distribution, which is equivalent to the steady-state queue-length distribution in the elementary M/M/c queue. This corrected diffusion approximation refines the celebrated diffusion approximation first developed in Propositions 1 and 2 of Halfin and Whitt (1981), and may be of independent interest.

Proof (Theorem 3.). Throughout the proof, let $c := R + \beta\sqrt{R}$ and $s := R + b\sqrt{R}$. By Theorem 2 in Janssen et al. (2011),

\[
1 - C(c, R) = 1 - C_*(\beta) - C_\bullet(\beta) R^{-1/2} + O(R^{-1}). \tag{D.1}
\]


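We first consider the case of s ≤ c or b ≤ β. Due to (A.1) and (D.1), it is sufficient to prove

\[
\frac{R^{s} / [B(s, R)\, \Gamma(s + 1)]}{R^{c-1} / [B(c - 1, R)\, \Gamma(c)]} = \Phi(b) \Phi(\beta)^{-1} + g_1(\beta, b) R^{-1/2} + O(R^{-1}). \tag{D.2}
\]

In Janssen et al. (2011), it is shown that

\[
B(s, R)^{-1} = \frac{\Phi(\alpha_s)}{\phi(\alpha_s)}\, s^{1/2} + \frac{2}{3} + O(s^{-1/2}), \tag{D.3}
\]

where

\[
\alpha_s = \sqrt{-2s\left(1 - R/s + \ln(R/s)\right)}, \quad \operatorname{sign}(\alpha_s) = \operatorname{sign}(1 - R/s), \tag{D.4}
\]

a simple function of R and s with $\alpha_s \to b$ as $s \to \infty$. By letting $p(s) := s^{s} e^{-s} \sqrt{2\pi s}\, \Gamma(s + 1)^{-1}$, we have

\[
\frac{e^{-R} R^{s}}{\Gamma(s + 1)} = \phi(\alpha_s)\, p(s)\, s^{-1/2}. \tag{D.5}
\]

Multiplying (D.3) by (D.5) yields

\[
\frac{e^{-R} R^{s}}{\Gamma(s + 1) B(s, R)} = p(s) \Phi(\alpha_s) + \frac{2}{3}\, \phi(\alpha_s)\, p(s)\, s^{-1/2} + O(s^{-1}). \tag{D.6}
\]

To expand the first term in (D.6), we note from the proof of Theorem 2 in Janssen et al. (2011) that

\[
\frac{\Phi(\alpha_s)}{\phi(b)} = \frac{\Phi(b)}{\phi(b)} - \frac{1}{6} b^2 R^{-1/2} + O(R^{-1})
\]

and thus

\[
\Phi(\alpha_s) = \Phi(b) - \frac{1}{6} b^2 \phi(b) R^{-1/2} + O(R^{-1}). \tag{D.7}
\]

By Stirling's approximation for the Gamma function (see p. 257 of Abramowitz and Stegun (1964)),

\[
p(s) = 1 + O(s^{-1}) = 1 + O(R^{-1}). \tag{D.8}
\]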
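The identity (D.5) is exact for any real s > 0, and the Stirling ratio p(s) in (D.8) approaches 1 at rate 1/s, so both can be verified directly. The following sketch works in log space to avoid overflow; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def alpha(s, load):
    """alpha_s of (D.4): signed square root of -2 s (1 - R/s + ln(R/s))."""
    value = -2.0 * s * (1.0 - load / s + math.log(load / s))
    return math.copysign(math.sqrt(max(value, 0.0)), 1.0 - load / s)


def stirling_ratio(s):
    """p(s) = s^s e^{-s} sqrt(2 pi s) / Gamma(s + 1), computed in log space."""
    log_p = (s * math.log(s) - s + 0.5 * math.log(2.0 * math.pi * s)
             - math.lgamma(s + 1.0))
    return math.exp(log_p)


def poisson_weight(s, load):
    """Left-hand side of (D.5): e^{-R} R^s / Gamma(s + 1)."""
    return math.exp(-load + s * math.log(load) - math.lgamma(s + 1.0))


def poisson_weight_via_d5(s, load):
    """Right-hand side of (D.5): phi(alpha_s) p(s) s^{-1/2}."""
    return NormalDist().pdf(alpha(s, load)) * stirling_ratio(s) / math.sqrt(s)


if __name__ == "__main__":
    print(poisson_weight(57.3, 50.0), poisson_weight_via_d5(57.3, 50.0))
    print(stirling_ratio(1000.0))  # close to 1, with a gap of order 1/s
```

The sign convention in `alpha` only matters for Φ(α_s); the density ϕ(α_s) used in (D.5) depends on α_s through its square.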

We then multiply (D.7) by (D.8) and arrive at

\[
p(s) \Phi(\alpha_s) = \Phi(b) - \frac{1}{6} b^2 \phi(b) R^{-1/2} + O(R^{-1}). \tag{D.9}
\]


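Next, we expand the second term in (D.6). Simple computations show that

\[
\phi(\alpha_s) = \phi(b) + O(R^{-1/2})
\]

and

\[
s^{-1/2} p(s) = s^{-1/2} + O(s^{-3/2}) = R^{-1/2} + O(R^{-1}).
\]

It then follows that

\[
\frac{2}{3}\, \phi(\alpha_s)\, p(s)\, s^{-1/2} = \frac{2}{3}\, \phi(b) R^{-1/2} + O(R^{-1}). \tag{D.10}
\]

Substituting (D.9) and (D.10) into (D.6) yields

\[
\frac{e^{-R} R^{s}}{\Gamma(s + 1) B(s, R)} = \Phi(b) + \left[\frac{2}{3}\, \phi(b) - \frac{1}{6} b^2 \phi(b)\right] R^{-1/2} + O(R^{-1}). \tag{D.11}
\]

This provides a power series expansion of the numerator in (D.2) times $e^{-R}$. We then turn to expanding the denominator of expression (D.2) times $e^{-R}$. By Lemma 3,

\[
\frac{1}{B(c - 1, R)} = \frac{R}{c B(c, R)} - \frac{R}{c}
\]

and therefore

\[
\frac{e^{-R} R^{c-1}}{\Gamma(c) B(c - 1, R)} = \frac{e^{-R} R^{c}}{\Gamma(c + 1) B(c, R)} - \frac{e^{-R} R^{c}}{\Gamma(c + 1)}. \tag{D.12}
\]

The expansion of the first term of (D.12) is just the same as (D.11), with b replaced by β:

\[
\frac{e^{-R} R^{c}}{\Gamma(c + 1) B(c, R)} = \Phi(\beta) + \left[\frac{2}{3}\, \phi(\beta) - \frac{1}{6} \beta^2 \phi(\beta)\right] R^{-1/2} + O(R^{-1}). \tag{D.13}
\]

For the second term of (D.12), following the same procedure as above (i.e., from (D.5) to (D.10)), we obtain that

\[
\frac{e^{-R} R^{c}}{\Gamma(c + 1)} = \phi(\beta) R^{-1/2} + O(R^{-1}). \tag{D.14}
\]

Substituting (D.13) and (D.14) into (D.12) yields

\[
\frac{e^{-R} R^{c-1}}{\Gamma(c) B(c - 1, R)} = \Phi(\beta) - \left[\frac{1}{3}\, \phi(\beta) + \frac{1}{6} \beta^2 \phi(\beta)\right] R^{-1/2} + O(R^{-1}),
\]

which upon inversion becomes

\[
\left[\frac{e^{-R} R^{c-1}}{\Gamma(c) B(c - 1, R)}\right]^{-1} = \Phi(\beta)^{-1} + \Phi(\beta)^{-2} \left[\frac{1}{3}\, \phi(\beta) + \frac{1}{6} \beta^2 \phi(\beta)\right] R^{-1/2} + O(R^{-1}). \tag{D.15}
\]

Finally, we multiply (D.11) by (D.15) to get (D.2). This completes the proof for the case of b ≤ β. We now turn to proving the theorem in the case of b > β. First, by Lemma 3 and (D.3) (with s replaced by c),

\[
\begin{aligned}
B(c - 1, R)^{-1} &= \rho \left(B(c, R)^{-1} - 1\right) && \text{(D.16)} \\
&= \frac{\Phi(\alpha_c)}{\phi(\alpha_c)} \cdot \rho\, c^{1/2} - \frac{1}{3} \rho + O(c^{-1/2}), && \text{(D.17)}
\end{aligned}
\]

which upon inversion becomes

\[
B(c - 1, R) = \frac{1}{\rho} \cdot \frac{\phi(\alpha_c)}{\Phi(\alpha_c)} \left(c^{-1/2} + \frac{1}{3} \cdot \frac{\phi(\alpha_c)}{\Phi(\alpha_c)}\, c^{-1}\right) + O(c^{-3/2}). \tag{D.18}
\]

Simple computations show that

\[
\rho^{-1} = 1 + \beta R^{-1/2}, \tag{D.19}
\]

\[
c^{-1/2} = R^{-1/2} - \frac{1}{2} \beta R^{-1} + O(R^{-3/2}), \tag{D.20}
\]

and

\[
\frac{\phi(\alpha_c)}{\Phi(\alpha_c)} = \frac{\phi(\beta)}{\Phi(\beta)} + \frac{1}{6} \beta^2 \left[\left(\frac{\phi(\beta)}{\Phi(\beta)}\right)^2 + \beta\, \frac{\phi(\beta)}{\Phi(\beta)}\right] R^{-1/2} + O(R^{-1}). \tag{D.21}
\]

Applying (D.19), (D.20) and (D.21) to (D.18), we have that

\[
B(c - 1, R) = \frac{\phi(\beta)}{\Phi(\beta)}\, R^{-1/2} + g_3(\beta) R^{-1} + O(R^{-3/2}). \tag{D.22}
\]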

Next, we derive a refined approximation for $\rho^{s-c+1}$. We shall need the following result (see Brothers and Knox (1998)): for any x < −1,

\[
\left(1 + \frac{1}{x}\right)^{x} = e - \frac{e}{2}\, x^{-1} + O(x^{-2}). \tag{D.23}
\]


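Also, we can express the traffic intensity as

\[
\rho = 1 + \frac{1}{-(\beta^{-1} R^{1/2} + 1)} = 1 - \beta R^{-1/2} + O(R^{-1}). \tag{D.24}
\]

Applying (D.23) and the two expressions in (D.24), we have that

\[
\begin{aligned}
\rho^{s-c} &= \left[\left(1 + \frac{1}{-(\beta^{-1} R^{1/2} + 1)}\right)^{-(\beta^{-1} R^{1/2} + 1)} \cdot \left(1 - \beta R^{-1/2} + O(R^{-1})\right)\right]^{-\beta(b - \beta)} \\
&= \left[\left(e + \frac{e}{2} (\beta^{-1} R^{1/2} + 1)^{-1} + O\left((\beta^{-1} R^{1/2} + 1)^{-2}\right)\right) \cdot \left(1 - \beta R^{-1/2} + O(R^{-1})\right)\right]^{-\beta(b - \beta)} \\
&= \left[\left(e + \frac{e}{2} \beta R^{-1/2} + O(R^{-1})\right) \cdot \left(1 - \beta R^{-1/2} + O(R^{-1})\right)\right]^{-\beta(b - \beta)} \\
&= \left(e - \frac{e}{2} \beta R^{-1/2} + O(R^{-1})\right)^{-\beta(b - \beta)} \\
&= \left(e^{-1} + \frac{1}{2} e^{-1} \beta R^{-1/2} + O(R^{-1})\right)^{\beta(b - \beta)} \\
&= e^{-\beta(b - \beta)} \left(1 + \frac{1}{2} \beta R^{-1/2} + O(R^{-1})\right)^{\beta(b - \beta)} \\
&= e^{-\beta(b - \beta)} \left(1 + \frac{1}{2} \beta^2 (b - \beta) R^{-1/2} + O(R^{-1})\right).
\end{aligned} \tag{D.25}
\]

Combining (D.25) with (D.24) yields

\[
\rho^{s-c+1} = e^{-\beta(b - \beta)} + \left(\frac{1}{2} \beta^2 (b - \beta) - \beta\right) e^{-\beta(b - \beta)} R^{-1/2} + O(R^{-1}). \tag{D.26}
\]

Finally, substituting (D.1), (D.22), (D.26) and $\rho(1 - \rho)^{-1} = \beta^{-1} R^{1/2}$ into (A.1), we obtain the desired series expression in the case of b > β and complete the proof of the theorem.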
