
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 82, No. 1. JULY 1994

Optimal Load Sharing in Soft Real-Time Systems Using Likelihood Ratios 1

E. K. P. CHONG 2 AND P. J. RAMADGE 3

Communicated by Y. C. Ho

Abstract. We consider a load-sharing problem for a multiprocessor system in which jobs have real-time constraints: if the waiting time of a job exceeds a given random amount (called the laxity of the job), then the job is considered lost. To minimize the steady-state probability of loss with respect to the load-sharing parameters, we propose to use the likelihood ratio derivative estimate approach, which has recently been studied for sensitivity analysis of stochastic systems. We formulate a recursive stochastic optimization algorithm using likelihood ratio estimates to solve the optimization problem and provide a proof for almost sure convergence of the algorithm. The algorithm can be used for on-line optimization of the real-time system and does not require a priori knowledge of the arrival rate of customers to the system or the service time and laxity distributions. To illustrate our results, we provide simulation examples.

Key Words. Load sharing, real-time systems, likelihood ratios, score function, stochastic approximation.

1. Introduction

We consider a system which has multiple processors (or resources) that share the load of processing jobs that enter the system. Examples of such a system include multiprocessor computer systems and automated manufacturing systems. The load in the system is shared among the

1This research was partially supported by an IBM Graduate Fellowship and by the National Science Foundation through Grant No. ECS-87-15217.

2Assistant Professor, School of Electrical Engineering, Purdue University, West Lafayette, Indiana.

3Associate Professor, Department of Electrical Engineering, Princeton University, Princeton, New Jersey.



processors according to a set of parameters, referred to as the load-sharing parameters, which we assume to be controllable. We are concerned with the problem of adjusting the load-sharing parameters so that a performance measure associated with the system is optimized. To perform this optimization, we propose the use of a recursive optimization algorithm. The algorithm can be used as an on-line scheme, and requires minimal a priori system information. We provide a convergence proof for the algorithm.

The problem described above has been studied in various settings in the literature (Refs. 1-6). In Refs. 1, 2, and 6, the performance measure that was considered is the mean job response time. In this paper, each job that enters the system has an explicit time constraint; if the response time (waiting time) exceeds the constraint, the job is considered lost. The performance measure of interest in this case is the steady-state probability of loss. The authors of Refs. 3-5 also considered such time-constrained situations. They studied the convergence behavior of their algorithms via simulations. In this paper, we concentrate on providing a comprehensive analytical approach to the issue of convergence. This is our main contribution. We believe that a rigorous demonstration of convergence is a necessary step toward a more thorough understanding of the behavior of such algorithms.

Our algorithm is based on the gradient estimation technique known as the likelihood ratio (LR) method (see Refs. 7-9). The LR method allows estimation of gradients of certain performance measures by observing only a single sample path of the system, and this alleviates problems associated with conventional techniques for estimating derivatives, which involve performing at least two experiments on the system. Used in conjunction with a gradient-based descent algorithm, such estimates provide a means of optimizing the performance measure. The LR method was initially intended as a tool in simulation; in this paper, we exploit its properties to apply the method to a situation where the observations are taken from an actual system in operation.

Optimization algorithms that use LR derivative estimates for on-line applications have received attention only relatively recently. 4 Rubinstein (Ref. 11) considered optimization algorithms using LR estimates, and presented simulation results for his algorithm applied to inventory models. Glynn (Ref. 12) proved convergence of an optimization algorithm using LR estimates for Markov chains. More recently, L'Ecuyer, Giroux, and Glynn (Ref. 13) provided convergence proofs for algorithms for optimizing a performance measure involving the sojourn time of a single-server queue

4We are grateful to a reviewer for pointing out to us an upcoming book on the topic (Ref. 10).


in steady state. Our objective function is different from those considered in Refs. 11-13, and our problem is inherently a multivariable problem; both Refs. 12 and 13 consider only scalar control parameters. Furthermore, Refs. 11-13 are concerned primarily with optimization using simulations, and as such certain parameter values and distributions are assumed to be known. In our formulation, we avoid the need to use a priori knowledge of system parameter values and distributions (except a Poisson arrival assumption), and instead incorporate estimation of such parameters into the algorithm. This turns out to be a straightforward task, due in part to the generality of the LR method.

Another method for gradient estimation which has recently received much interest is perturbation analysis (PA), pioneered by Ho et al. (see, e.g., Refs. 14 and 15). Convergence proofs for optimization algorithms using PA have also been formulated (see Refs. 13, 16-19). In this paper, we do not consider the use of PA.

This paper is organized as follows. In Section 2, we describe a load-sharing scheme for a soft real-time system, and discuss the optimization problem to be considered. In Section 3, we propose an on-line algorithm using likelihood ratio estimates for optimization of the real-time system. We provide a proof for almost sure convergence of the algorithm in Section 4. Section 5 contains a simulation example of the real-time system with the on-line algorithm, which illustrates the result of Section 4. Finally, we conclude in Section 6 with a summary and a discussion of future research.

2. Load-Sharing Problem in a Real-Time System

Consider the system shown in Fig. 1. Jobs arrive at a scheduler, which distributes the jobs among K + 1 processors. The scheduling is probabilistic, that is, the scheduler sends jobs to each processor according to the value of a probability. The arrival rate of jobs and the speeds of the processors are unknown. An explicit time constraint is associated with each job arriving at the system, which we refer to as the laxity of the job. If the waiting time (the time between the arrival and the start of the service) of a job at a given queue exceeds the laxity of the job, then the job is considered lost. Lost jobs, however, remain in the system and receive service, no different from jobs which are not lost. Such a system is called a soft real-time system (as opposed to a hard real-time system, where lost jobs leave the system and are not served). Soft real-time systems may arise if, for example, the job or processor is not aware of whether the laxity has been exceeded when service begins.


Fig. 1. Soft real-time system with load sharing: jobs arrive at a scheduler, which distributes them among Processors 1 through K+1.

A queueing model for the system is shown in Fig. 2. In keeping with the terminology used in queueing theory, we refer to the jobs as customers. Customers arrive at the system as a Poisson process with rate λ. A customer arriving at the system joins queue i with probability θ^i, where Σ_{i=1}^{K+1} θ^i = 1, the scheduling being independent of the arrival process. The mean service time at queue i is τ^i, with a general service time distribution. Service times of customers are assumed to be independent. We assume that the laxities of customers are i.i.d. and independent of the arrivals, services,

Fig. 2. Queueing model for soft real-time system with load sharing: the arrival stream is split among the K+1 queues according to the probabilities θ^1, ..., θ^{K+1}.


and scheduling. The parameters θ^i determine the way in which the load is shared between the K + 1 queues, and so we refer to the θ^i as the load-sharing parameters. Note that the arrivals to queue i form a Poisson process with rate λ_i = λθ^i, so that, for fixed θ^1, ..., θ^{K+1}, each queue is effectively an M/G/1 queue. We do not assume knowledge of the arrival rate λ, the mean service times τ^i, or the service time distributions. The load-sharing parameters θ^i, however, are assumed to be adjustable. We also assume that each queue is stable [see Eq. (13) for explicit conditions on the θ^i and τ^i for stability].

For each queue i, let P_L^i be the steady-state probability of loss of customers, that is, the probability that a customer arriving at queue i in steady state will be lost. By ergodicity, this quantity is equal to the infinite-horizon average fraction of customers lost, that is,

P_L^i = lim_{n→∞} (1/n) Σ_{j=1}^n I_j^i, a.s.,

where I_j^i is equal to 1 if the jth customer in queue i is lost, and 0 otherwise. In general, P_L^i is a function of the arrival rate to queue i, and we write P_L^i(λ_i) for the value of P_L^i when the arrival rate to queue i is λ_i (P_L^i is also in general a function of the service and laxity distributions). Since λ_i = λθ^i, P_L^i is a function of θ^i, and we display this explicitly by writing P_L^i(λθ^i).

For the real-time system above, we wish to find values for the load-sharing parameters such that the steady-state probability of loss of customers arriving at the system is minimized. We may write the steady-state probability of loss as

J(θ^1, ..., θ^{K+1}) = Σ_{i=1}^{K+1} θ^i P_L^i(λθ^i).

Thus, we wish to minimize J with respect to θ subject to the constraints

0 ≤ θ^i ≤ 1, i = 1, ..., K + 1, and Σ_{i=1}^{K+1} θ^i = 1.

Now, since we can write

θ^{K+1} = 1 − Σ_{i=1}^K θ^i,

we are free to adjust only K parameters, and J can be written as

J(θ^1, ..., θ^K) = Σ_{i=1}^K θ^i P_L^i(λθ^i) + θ^{K+1} P_L^{K+1}(λθ^{K+1}).


In general, it is not possible to analytically calculate J or its derivative with respect to θ. To obtain an analytical expression for J or its derivative would require knowledge of the arrival rate, the service time distributions, and the laxity distribution, all of which we have assumed to be unknown. In the remaining sections of this paper, we demonstrate the use of likelihood ratios for estimating the derivative of J, and use these estimates in a recursive optimization algorithm, for on-line optimization of the real-time system. The on-line algorithm does not require a priori knowledge about the values of λ, τ^i, the service time distributions, or the laxity distribution.
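To make the model concrete, the following short simulation sketch (ours, not part of the original development) estimates J(θ) directly by generating Poisson arrivals, routing them probabilistically, and counting customers whose waiting times exceed their laxities; the exponential service and laxity distributions and all parameter values are illustrative assumptions only.

    import random

    def simulate_loss_fraction(theta, lam, tau, laxity_rate, num_jobs=100_000, seed=0):
        """Estimate J(theta): the long-run fraction of lost jobs for K+1 single-server
        queues fed by a Poisson(lam) stream split according to theta.  A job is 'lost'
        (but still served) if its waiting time exceeds its (exponential) laxity."""
        rng = random.Random(seed)
        K1 = len(theta)                        # number of queues, K+1
        free_at = [0.0] * K1                   # time at which each server next becomes idle
        t, lost = 0.0, 0
        for _ in range(num_jobs):
            t += rng.expovariate(lam)          # Poisson arrivals: exponential interarrival times
            i = rng.choices(range(K1), weights=theta)[0]     # probabilistic scheduling
            wait = max(0.0, free_at[i] - t)    # FCFS waiting time at queue i
            if wait > rng.expovariate(laxity_rate):          # laxity exceeded: job is lost
                lost += 1
            free_at[i] = t + wait + rng.expovariate(1.0 / tau[i])   # lost jobs are still served
        return lost / num_jobs

    # Example: the three-queue system used later in Section 5.
    print(simulate_loss_fraction([0.4, 0.35, 0.25], lam=1.0, tau=[0.3, 0.2, 0.1], laxity_rate=10_000.0))

Such a direct estimate of J requires a long run for each candidate θ; the point of the LR approach developed below is to estimate the gradient of J from a single sample path of the operating system.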

3. On-Line Optimization Algorithm

In this section, we propose an on-line optimization algorithm for obtaining the optimal load-sharing parameters for minimizing the steady-state probability of loss for the real-time system described in Section 2.

3.1. Stochastic Approximation Algorithm. The algorithm that we consider is based on the classical stochastic approximation scheme of Robbins and Monro (Ref. 20), and takes the form

θ_{n+1} = θ_n − a_n ĥ_{n+1}, (1)

where ĥ_{n+1} is an estimate of dJ(θ_n)/dθ, the derivative of J(·) with respect to θ at θ = θ_n, and

a_n = diag[a^1, ..., a^K] b_n

is a random matrix consisting of a fixed diagonal matrix multiplied by a positive real-valued random variable b_n. Note that (1) is similar to a gradient-descent algorithm, the only difference being that an estimate of the gradient is used, rather than the true gradient. Such algorithms have been studied extensively in the literature (see, e.g., Refs. 21-24).

In our problem, we require θ to satisfy

0 ≤ θ^i ≤ 1, i = 1, ..., K, and Σ_{i=1}^K θ^i ≤ 1.

Since there is no guarantee that the parameter values {θ_n} in (1) remain inside the region where the above constraint is satisfied, we need a mechanism to explicitly ensure that the constraint holds. This is commonly done via a projection.

In general, suppose the constraint set for θ is a given compact convex set D ⊂ ℝ^K, over which J is defined. The projection mechanism operates as follows: whenever the sequence leaves D, project it back into a given


compact convex subset D_p ⊆ D (D_p may equal D). To be more specific, if during the execution of the recursive algorithm the sequence leaves D, say θ_{k−1} ∈ D but θ_{k−1} − a_{k−1} ĥ_k ∉ D, then we set

θ_k = arg inf_{θ∈D_p} ||θ − (θ_{k−1} − a_{k−1} ĥ_k)||,

where ||·|| denotes the standard Euclidean norm; i.e., we project θ_{k−1} − a_{k−1} ĥ_k to the closest point in D_p. The projected algorithm can be written as

θ_{n+1} = Π[θ_n − a_n ĥ_{n+1}], (2)

where the projection Π[·] is defined by

Π[x] = x, if x ∈ D; arg inf_{θ∈D_p} ||θ − x||, otherwise. (3)

It is easy to show that, for each x, Π[x] is unique.
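As an illustration, Π[·] can be computed as a small quadratic program. The sketch below is an implementation choice of ours, not the paper's: it projects onto a set D_p of the form used in the example of Section 5, with the bounds passed in as arguments, and assumes SciPy's SLSQP solver is available.

    import numpy as np
    from scipy.optimize import minimize

    def project(x, lower, upper_sum):
        """Euclidean projection of x onto D_p = {theta : theta_i >= lower, sum_i theta_i <= upper_sum},
        i.e. arg min_{theta in D_p} ||theta - x||.  The constraint set is illustrative;
        `lower` and `upper_sum` stand in for the bounds defining D_p."""
        x = np.asarray(x, dtype=float)
        cons = [{"type": "ineq", "fun": lambda th: upper_sum - th.sum()}]   # sum constraint
        bnds = [(lower, None)] * len(x)                                     # componentwise lower bounds
        res = minimize(lambda th: 0.5 * np.sum((th - x) ** 2),
                       x0=np.clip(x, lower, None),
                       bounds=bnds, constraints=cons, method="SLSQP")
        return res.x

    print(project([0.95, 0.40], lower=0.1, upper_sum=0.9))   # a point outside D_p is mapped to its boundary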

3.2. Likelihood Ratio Derivative Estimates. The derivative estimate ĥ_{n+1} in (2) is obtained via an LR estimator. The estimator is described in detail in this section.

The use of the likelihood ratio (LR) method for derivative estimation has been considered in several papers (Refs. 7-9, 25-26). The LR derivative estimation method is based on change-of-measure techniques used in importance sampling, accelerated simulation, and statistical inference. The following is a heuristic description of the idea underlying the LR method. Suppose that we wish to estimate the derivative with respect to θ of the function F(θ) = E(Y(θ)). Assume that the distribution of Y(θ) is absolutely continuous, and let the density associated with Y(θ) be p_θ(y). Then, we may write

F(θ) = ∫ y p_θ(y) dy.

Taking derivatives of both sides with respect to θ, and assuming that we may interchange the order of differentiation and integration (see, e.g., Ref. 25 for conditions that make the interchange valid), we may write

dF(θ)/dθ = ∫ y (dp_θ(y)/dθ) dy = E(Y(θ) L_θ(Y(θ))),

where

L_θ(y) = (dp_θ(y)/dθ)/p_θ(y).


Thus, provided Y(θ) is a quantity that can be observed from the sample path, and provided the function L_θ(y) is known, the derivative dF(θ)/dθ can be estimated by the observed quantity Y(θ)L_θ(Y(θ)). Such an estimate is referred to as a likelihood ratio derivative estimate of dF(θ)/dθ. The term score function estimate is also used in the literature (e.g., Refs. 13 and 26).
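A minimal illustration of this idea, on a toy example of our own choosing: let Y(θ) be exponential with mean θ, so that F(θ) = θ and dF/dθ = 1. The score is L_θ(y) = d log p_θ(y)/dθ = y/θ² − 1/θ, and averaging Y L_θ(Y) over observed samples estimates the derivative.

    import numpy as np

    rng = np.random.default_rng(0)

    def lr_derivative_estimate(theta, n=200_000):
        """Likelihood-ratio (score-function) estimate of dF/dtheta for
        F(theta) = E[Y(theta)] with Y(theta) ~ Exp(mean theta).
        Here p_theta(y) = (1/theta) exp(-y/theta), so the score is
        L_theta(y) = y/theta**2 - 1/theta; the true derivative is 1."""
        y = rng.exponential(scale=theta, size=n)     # observations from one "experiment"
        score = y / theta**2 - 1.0 / theta
        return np.mean(y * score)                    # Y(theta) * L_theta(Y(theta))

    print(lr_derivative_estimate(2.0))               # close to 1.0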

Reiman and Weiss (Ref. 9) have proposed and analyzed a general class of LR methods for sensitivity analysis of stochastic systems. As is usual in LR derivative estimation, these estimates involve observation of a single sample path of the stochastic system. The LR derivative estimate for J that we will use is based on Ref. 9 and involves observing one busy period from every queue in the system. The derivative estimate obtained is then used in the recursive algorithm discussed in Section 3.1.

We now describe the LR technique proposed in Ref. 9 applied to an M/G/1 queue. We start with an underlying probability space (Ω, ℱ, P). Consider an M/G/1 queue defined on the underlying probability space, with arrival rate λ and service time distribution G(x) with mean τ, where λτ < 1. Assume that, at time 0, a customer arrives to find the queue empty. Index this customer by 0, the next customer by 1, and so on. Let N > 0 be the index of the first customer to encounter an empty system (for time greater than 0), and let T denote the arrival time of this customer. T is the duration of the regenerative cycle associated with a busy period (i.e., the busy period and the idle period immediately following), and N is the number of customers served in the busy period. Let ℱ_T be the natural σ-algebra associated with the stopping time T, and let ψ be an ℱ_T-measurable random variable that is bounded by a polynomial in N + T. The following lemma summarizes the result from Ref. 9 which we need for formulating our recursive optimization algorithm.

Lemma 3.1 (Reiman and Weiss). Suppose that the following condition holds:

(S1) For some γ > 0, ∫ exp(γx) dG(x) < ∞.

Then,

dE(ψ)/dλ = E((N/λ − T)ψ). (4)

Proof. This is an immediate consequence of results in Ref. 9. []

In (4), the quantity (N/λ − T)ψ, which can be obtained by observation of a single sample path of the queue, is our LR estimate for dE(ψ)/dλ. By Lemma 3.1, it is an unbiased estimate.


We now derive an LR estimate for dP_L/dλ based on the above lemma, where P_L is the steady-state probability of loss of an M/G/1 queue with soft real-time customers. To this end, let Y denote the total number of customers lost in a busy period. Then clearly Y is ℱ_T-measurable. From the theory of regenerative systems (Ref. 27), we can write

P_L = E(Y)/E(N).

Now, for an M/G/1 queue, we have

E(N) = 1/(1 − λτ),

so that

P_L = (1 − λτ)E(Y). (5)

Differentiating (5) gives

dP_L/dλ = (1 − λτ) dE(Y)/dλ − τE(Y).

Using Lemma 3.1, the following proposition is obtained.

Proposition 3.1. Suppose that (S1) holds. Then,

dP_L(λ)/dλ = E(((1 − λτ)(N/λ − T) − τ)Y).
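The following sketch illustrates Proposition 3.1 on a single M/M/1 queue with exponential laxities (our own example; the parameter values and helper names are illustrative). It simulates i.i.d. regenerative cycles, records (N, T, Y) for each busy period, and averages the estimate ((1 − λτ)(N/λ − T) − τ)Y.

    import random

    def busy_period_stats(lam, tau, laxity_rate, rng):
        """Simulate one regenerative cycle of an M/M/1 queue with soft real-time customers.
        Returns (N, T, Y): customers served in the busy period, cycle length (arrival time
        of the first customer to find the system empty), and number of lost customers."""
        t_arrival, depart = 0.0, 0.0
        N = Y = 0
        while True:
            wait = max(0.0, depart - t_arrival)
            if t_arrival > 0.0 and wait == 0.0:
                return N, t_arrival, Y                  # this customer starts a new cycle
            if wait > rng.expovariate(laxity_rate):     # laxity exceeded: counted lost, still served
                Y += 1
            depart = t_arrival + wait + rng.expovariate(1.0 / tau)
            N += 1
            t_arrival += rng.expovariate(lam)

    def dPL_dlam_estimate(lam, tau, laxity_rate, cycles=100_000, seed=0):
        """LR estimate of dP_L/dlambda as in Proposition 3.1: the average of
        ((1 - lam*tau)*(N/lam - T) - tau) * Y over i.i.d. regenerative cycles."""
        rng = random.Random(seed)
        total = 0.0
        for _ in range(cycles):
            N, T, Y = busy_period_stats(lam, tau, laxity_rate, rng)
            total += ((1.0 - lam * tau) * (N / lam - T) - tau) * Y
        return total / cycles

    print(dPL_dlam_estimate(lam=0.5, tau=0.3, laxity_rate=1.0))   # prints a sample-based estimate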

3.3. On-Line Algorithm Using LR Estimates. Recall that we wish to minimize the following performance measure with respect to θ^1, ..., θ^K:

J(θ^1, ..., θ^K) = Σ_{i=1}^K θ^i P_L^i(λθ^i) + θ^{K+1} P_L^{K+1}(λθ^{K+1}), (6)

where

θ^{K+1} = 1 − Σ_{i=1}^K θ^i.

Differentiating (6) with respect to θ^i, and using the fact that λ_i = λθ^i, we get

∂J/∂θ^i = P_L^i + λθ^i dP_L^i/dλ_i − P_L^{K+1} − λθ^{K+1} dP_L^{K+1}/dλ_{K+1}. (7)

Write θ = [θ^1, ..., θ^K]', where the prime superscript denotes transposition, and

dJ(θ)/dθ = [∂J/∂θ^1, ..., ∂J/∂θ^K]'.

The idea of the formulation of the algorithm is to estimate each term in (7) from observations of the system. The P_L^i terms can be estimated using Eq. (5), while the derivative terms dP_L^i/dλ_i can be estimated using LR estimates from Proposition 3.1. The parameters λ and τ^i, which are not known a


priori, can also be estimated from observations of the system. Using these estimates, we obtain an estimator ĥ of dJ/dθ. We then use this estimator in an algorithm of the form of (1).

The on-line algorithm can be described as follows. We divide the time axis into intervals which we refer to as estimation cycles of the algorithm. Parameter updates are performed at the end of every estimation cycle, according to the recursion

θ_{n+1} = Π[θ_n − a_n ĥ_{n+1}],

where ĥ_{n+1} is an LR estimate of dJ(θ_n)/dθ, which is obtained from observation of the system during the estimation cycle. By (5) and Proposition 3.1, estimates for P_L^i and dP_L^i(λθ^i)/dλ_i can be obtained for each queue after observing the queue for one regenerative cycle (associated with one busy period in queue i). The estimation cycle of the algorithm is thus the smallest time interval containing one regenerative cycle for each queue. Figure 3 shows a typical estimation cycle of the algorithm for a system with three queues. The shaded regions below the queue-length-versus-time plots for the three queues indicate the busy periods from which the derivative estimates for each queue are obtained. We refer to such busy periods as estimation busy periods.

Fig. 3. One estimation cycle of the on-line optimization algorithm (queue length versus time for Queues 1-3; the shaded busy periods are the estimation busy periods, and the update times n and n+1 delimit the cycle).


We are now ready to explicitly describe the estimate ĥ_{n+1} used at the end of the nth estimation cycle in the parameter updating algorithm. For each i = 1, ..., K, the ith component of ĥ_{n+1} is

ĥ^i_{n+1} = (1 − λ̂_n θ^i_n τ̂^i_n) Y^i_{n+1} + λ̂_n θ^i_n β^i_{n+1} − (1 − λ̂_n θ^{K+1}_n τ̂^{K+1}_n) Y^{K+1}_{n+1} − λ̂_n θ^{K+1}_n β^{K+1}_{n+1}, (8)

where, for each j = 1, ..., K + 1,

β^j_{n+1} = ((1 − λ̂_n θ^j_n τ̂^j_n)(N^j_{n+1}/(λ̂_n θ^j_n) − T^j_{n+1}) − τ̂^j_n) Y^j_{n+1}. (9)

Notice that the (1 − λ̂_n θ^i_n τ̂^i_n) Y^i_{n+1} are estimates of P_L^i based on (5), and the β^i_{n+1} are LR estimates of dP_L^i/dλ_i based on Proposition 3.1. The symbols used in (8) and (9) are explained as follows:

λ̂_n is an estimate of the arrival rate λ, and is given by

λ̂_n = ((1/N^a_n) Σ_{j=1}^{N^a_n} A_j)^{-1}, (10)

where N^a_n is the number of interarrivals from the start of the experiment to the start of the estimation cycle associated with ĥ_{n+1} (the nth estimation cycle) and {A_j} are the interarrival times;

τ̂^i_n is an estimate of the mean service time τ^i at queue i and is given by

τ̂^i_n = (1/N^{s,i}_n) Σ_{j=1}^{N^{s,i}_n} X^i_j, (11)

where N^{s,i}_n is the number of services at queue i from the start of the experiment to the start of the nth estimation cycle and {X^i_j} are the service times at queue i;

Y^i_{n+1} is the number of customers lost in queue i during the estimation busy period of queue i in the nth estimation cycle;

N^i_{n+1} is the number of customers in the estimation busy period of queue i during the nth estimation cycle;

T^i_{n+1} is the duration of the regenerative cycle associated with the estimation busy period of queue i during the nth estimation cycle.
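Putting (8) and (9) together, one estimation cycle yields the gradient estimate assembled below. The sketch follows our reconstruction of (8)-(9) above (the exact placement of the θ factors should be checked against the original paper) and assumes the per-queue observations (Y, N, T) and the estimates λ̂_n, τ̂_n have already been collected.

    def gradient_estimate(theta, lam_hat, tau_hat, Y, N, T):
        """Assemble the K-dimensional estimate of dJ/dtheta from one estimation cycle.
        theta, tau_hat, Y, N, T are lists of length K+1 (load-sharing parameters,
        estimated mean service times, losses, customers served, and cycle lengths for
        each queue); lam_hat estimates the overall arrival rate.  A sketch of the
        estimator as reconstructed in (8)-(9), not verbatim from the paper."""
        K1 = len(theta)

        def beta(j):
            # LR estimate of dP_L^j / dlambda_j (Proposition 3.1 with plug-in parameters)
            lam_j = lam_hat * theta[j]
            return ((1.0 - lam_j * tau_hat[j]) * (N[j] / lam_j - T[j]) - tau_hat[j]) * Y[j]

        def loss_prob(j):
            # estimate of P_L^j based on (5)
            return (1.0 - lam_hat * theta[j] * tau_hat[j]) * Y[j]

        last = K1 - 1   # queue K+1 (zero-based index K)
        return [loss_prob(i) + lam_hat * theta[i] * beta(i)
                - loss_prob(last) - lam_hat * theta[last] * beta(last)
                for i in range(K1 - 1)]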

We now describe how the sequence {a_n} used in the algorithm is obtained. Recall that, for each n,

a_n = diag[a^1, ..., a^K] b_n, with a^i > 0, i = 1, ..., K,

and {b_n} is given recursively by

b_n = 1/α_n,

where α_1 = 1 and, for n ≥ 1,

α_{n+1} = α_n + 1, if ĥ'_{n+1} ĥ_n < 0, or if α_n = α_{n−1} = ... = α_{n−L+1} and n > L;
α_{n+1} = α_n, otherwise, (12)

where L is some fixed positive integer, ĥ'_{n+1} ĥ_n is the inner product of ĥ_{n+1} and ĥ_n, and ĥ_0 = 0 by convention. The above is similar to the so-called accelerated harmonic sequence (see Ref. 28), extended to the case of vector parameters. The motivation for this choice of {b_n} is as follows. If ĥ'_{n+1} ĥ_n ≥ 0, then it is likely that θ_n is still far from θ*, and therefore a larger value of b_n should be used, so that the algorithm can drive θ_n closer to θ* more quickly by applying a larger correction to θ_n. Therefore, if ĥ'_{n+1} ĥ_n ≥ 0, we do not decrease b_n, and otherwise we do. We also decrease b_n if its value has not changed for the last L iterations. This is to ensure that b_n decreases often enough, a technical condition that is required in the proof of convergence of the algorithm.
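A sketch of the rule (12), as reconstructed above (ĥ_0 = 0 and b_n = 1/α_n); the function and variable names are ours.

    def next_alpha(alpha_hist, h_new, h_old, L):
        """One step of the accelerated-harmonic rule (12): alpha_hist is the list
        [alpha_1, ..., alpha_n], h_new and h_old are the gradient estimates h_{n+1}
        and h_n, and the step-size factor is b_n = 1/alpha_n."""
        n = len(alpha_hist)
        alpha_n = alpha_hist[-1]
        inner = sum(a * b for a, b in zip(h_new, h_old))                 # h'_{n+1} h_n
        stalled = n > L and all(a == alpha_n for a in alpha_hist[-L:])   # unchanged for the last L steps
        return alpha_n + 1 if (inner < 0 or stalled) else alpha_n

    # b_n = 1.0 / alpha_hist[-1] is then the scalar factor in a_n = diag[a^1, ..., a^K] b_n.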

At this point, we have not explicitly defined the set D outside which the projection mechanism of the algorithm is used. This set D is defined in the next section.
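For orientation, the pieces above can be assembled into the following schematic outer loop; observe_cycle and project are hypothetical helpers (collect one estimation cycle's observations; project onto D_p, e.g. as sketched after Section 3.1), and gradient_estimate and next_alpha are the sketches given earlier. This illustrates the structure of (2) only; it is not the authors' implementation.

    def online_optimization(theta0, a_diag, L, num_cycles, observe_cycle, project):
        """Schematic outer loop of the on-line algorithm (2): at the end of each
        estimation cycle, form the LR gradient estimate, update theta, and project
        onto D_p.  theta0 has K+1 entries; a_diag has K entries."""
        theta, h_old = list(theta0), [0.0] * (len(theta0) - 1)   # h_0 = 0 by convention
        alpha_hist = [1.0]
        for _ in range(num_cycles):
            lam_hat, tau_hat, Y, N, T = observe_cycle(theta)     # hypothetical data-collection helper
            h_new = gradient_estimate(theta, lam_hat, tau_hat, Y, N, T)   # K components
            b = 1.0 / alpha_hist[-1]                             # b_n = 1/alpha_n
            head = project([t - a * b * h for t, a, h in zip(theta, a_diag, h_new)])
            theta = list(head) + [1.0 - sum(head)]               # theta^{K+1} = 1 - sum of the rest
            alpha_hist.append(next_alpha(alpha_hist, h_new, h_old, L))
            h_old = h_new
        return theta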

4. Convergence Analysis

4.1. Assumptions. For each queue in the real-time system to be stable, we require that, for each i,

λ_i τ^i < 1,

which is equivalent to

θ^i < 1/(λτ^i).

For each i, let

γ^i = min(1, 1/(λτ^i)).

Define the set S by

S = {θ ∈ ℝ^K : 0 < θ^i < γ^i, i = 1, ..., K, 1 − γ^{K+1} < Σ_{i=1}^K θ^i < 1}. (13)

The set S consists of those θ values for which the system is stable and no queues are redundant (i.e., no component θ^i is 0). It represents the region of parameter values over which we wish to operate. Since J is continuous and bounded on S, it can be continuously extended to the closure S̄ of S. Since S̄ is compact, there is some θ* which minimizes J on S̄. However, θ* may lie on the boundary of S̄, in which case the system may


be unstable, or some queue may be redundant. To ensure that J achieves a minimum in S, we make the following assumption:

(P1) There exists θ* ∈ S such that J(θ*) ≤ J(θ) for all θ ∈ S̄.

Consider the following compact subset of S:

D = {θ ∈ ℝ^K : l_i ≤ θ^i ≤ u_i, i = 1, ..., K, 1 − u_{K+1} ≤ Σ_{i=1}^K θ^i ≤ 1 − l_{K+1}},

where for each j = 1 . . . . . K + 1,

l_j > 0 and u_j < min(1, 1/(λτ^j)).

It is easy to show that D is convex. By Assumption (P1), we can choose D such that an optimal value of θ is contained in D̊ (the interior of D). We shall assume this throughout. Note that the choice of D requires knowledge of the stability region. In practice, D can be chosen using estimated bounds for the stability region (e.g., by knowing some a priori bounds on λ and τ^i).

We make the following assumptions on the functions P_L^i(·):

(P2) For each i = 1, ..., K + 1, P_L^i is twice continuously differentiable with respect to λ_i on (0, 1/τ^i).

(P3) For each i = 1, ..., K + 1,

d²P_L^i(λ_i)/dλ_i² > −2(dP_L^i(λ_i)/dλ_i)/λ_i, (14)

for all λ_i ∈ [λ l_i, λ u_i].

Assumption (P3) implies that the function J is convex on D (this will be shown later). In practice, we may check this condition by estimating the quantities in (14) via the LR method (see Ref. 9, where second-derivative estimates are also discussed). We expect (P3) to hold if D is a sufficiently small region containing θ*. If there is more than one optimum, then for some region D around each local minimum, we would expect the condition to hold.

4.2. Convergence Theorem. The main result for the on-line optimization algorithm is given in the following theorem. Recall from (P1) that we denote by θ* the vector of load-sharing parameters which minimizes the performance measure J.

Theorem 4.1. Convergence of On-Line Algorithm. Suppose that, for the real-time system, Conditions (S1) and (P1)-(P3) hold. Define the on-line algorithm by the recursion

θ_{n+1} = Π[θ_n − a_n ĥ_{n+1}],


where Π[·] is defined by (3) with D_p ⊂ D compact and convex, θ* ∈ D_p, and {a_n}, {ĥ_n} are as described before. Then θ_n → θ*, a.s.

To prove the above theorem, we will appeal to the following convergence result.

Lemma 4.1. Let J: D → ℝ be a function defined on a compact convex set D such that the following condition holds:

(A1) J is twice continuously differentiable and convex on D, with dJ(θ*)/dθ = 0 for some unique θ* ∈ D̊, and θ* ∈ D_p.

Let {ε_n} be a sequence of random vectors given by ε_{n+1} = dJ(θ_n)/dθ − ĥ_{n+1}, and let {b_n} be a sequence of positive real-valued random variables, both adapted to an increasing sequence of σ-algebras {ℱ_n}. Suppose that:

(B1) Σ_{n=1}^∞ b_n ||E_{ℱ_n}(ε_{n+1})|| < ∞, a.s.;

(B2) for all n, E_{ℱ_n}(||ε_{n+1}||²) ≤ σ², for some a.s. finite random variable σ;

(C1) Σ_{n=1}^∞ b_n = ∞, a.s.;

(C2) Σ_{n=1}^∞ b_n² < ∞, a.s.

Let a_n = diag[a^1, ..., a^K] b_n, with a^i > 0, i = 1, ..., K. Let {θ_n} ⊂ D be given by (2). Then, θ_n → θ*, a.s.

Proof. This is a variation of a standard result on the convergence of stochastic approximation algorithms (see, e.g., Refs. 21 and 29). A detailed proof based on martingale convergence arguments is available in Ref. 18.

[]

To use Lemma 4.1 to prove Theorem 4.1, we will need the following additional results.

Lemma 4.2. Suppose that Assumptions (P1)-(P3) are satisfied. Then, the performance measure J for the real-time system satisfies Condition (A1).

Proof. By Assumption (P2), J is twice continuously differentiable on D. So, to show (A1), it suffices to show that J is strictly convex on D, since existence of θ* is implied by (P1), and uniqueness is implied by strict convexity. To this end, let x, y ∈ D. Expanding J using Taylor's theorem to two terms, we get

J(y) = J(x) + (dJ(x)/dθ)(y − x) + (1/2)(y − x)'(d²J(ξ)/dθ²)(y − x),


where

ξ = x + ζ(y − x) and 0 < ζ < 1.

Since D is convex, ξ ∈ D. Write

ξ = [ξ^1, ..., ξ^K]', and ξ^{K+1} = 1 − Σ_{i=1}^K ξ^i.

Now, for each i, j = 1, ..., K with j ≠ i,

∂²J(ξ)/∂(θ^i)² = 2λ dP_L^i/dλ_i + λ²ξ^i d²P_L^i/dλ_i² + 2λ dP_L^{K+1}/dλ_{K+1} + λ²ξ^{K+1} d²P_L^{K+1}/dλ_{K+1}², (15)

∂²J(ξ)/∂θ^i∂θ^j = 2λ dP_L^{K+1}/dλ_{K+1} + λ²ξ^{K+1} d²P_L^{K+1}/dλ_{K+1}². (16)

By Assumption (P3), the sum of the first two terms and the sum of the last two terms in (15) are strictly positive, and hence we have that

∂²J(ξ)/∂θ^i∂θ^j > 0,

∂²J(ξ)/∂θ^i∂θ^j < ∂²J(ξ)/∂(θ^i)².

Write

x = [x^1, ..., x^K]', y = [y^1, ..., y^K]',

and for convenience, let

A = [A_{ij}] ≡ d²J(ξ)/dθ².

Note that A_{ij} is independent of i and j for i ≠ j. Then,

(y − x)'(d²J(ξ)/dθ²)(y − x) = Σ_i (y^i − x^i) Σ_j A_{ij}(y^j − x^j)
= Σ_i (y^i − x^i)(A_{ii}(y^i − x^i) + Σ_{j≠i} A_{ij}(y^j − x^j))
= Σ_i (y^i − x^i)²(A_{ii} − A_{ij}) + A_{ij}(Σ_i (y^i − x^i))²
> 0.

Hence, we have that

J(y) > J(x) + (dJ(x)/dθ)(y − x),

and this implies that J is strictly convex (see Ref. 30, p. 112). []

Define an increasing sequence of σ-algebras {ℱ_n} as follows. Let ℱ_n be the σ-algebra generated by all random variables that can be observed from the start of the experiment to the time of the nth update. For an illustration


of this, see Fig. 3. The arrows labeled ℱ_n and ℱ_{n+1} in Fig. 3 indicate that all the random variables from the start of the experiment to the tail of the arrow are measurable with respect to ℱ_n (respectively, ℱ_{n+1}). Note that the sequences {ĥ_n} and {b_n} described in the previous section are adapted to {ℱ_n}.

Lemma 4.3. The sequence {b_n} used in the on-line algorithm satisfies Conditions (C1) and (C2).

Proof. It is easy to see from (12) that, for each n, b_n ≥ 1/n and b_n ≤ L/n. Hence,

Σ_{n=1}^∞ b_n ≥ Σ_{n=1}^∞ 1/n = ∞,

Σ_{n=1}^∞ b_n² ≤ L² Σ_{n=1}^∞ 1/n² < ∞.

Each b_n is clearly ℱ_n-measurable. Therefore, Conditions (C1) and (C2) are satisfied. []

Lemma 4.4. Suppose that Condition (S1) holds. Then, the sequence {ε_n} given by ε_{n+1} = ĥ_{n+1} − dJ(θ_n)/dθ satisfies Conditions (B1) and (B2).

Proof. See Section 7 (Appendix). []

We are now ready to prove Theorem 4.1.

Proof of Theorem 4.1. By the foregoing lemmas, all the conditions of Lemma 4.1 hold, and the required result follows. []

5. Simulation Example

In this section, we provide results from simulations of a specific example of the real-time system of Fig. 2. We consider the case where there are three queues (i.e., K = 2), the service time distributions are exponential, and the laxity distribution is also exponential, with rate l. The following parameter values for the system were used:

λ = 1, l = 10,000, τ^1 = 0.3, τ^2 = 0.2, τ^3 = 0.1.


The on-line algorithm was applied to the above system with the following parameter values:

θ_1 = [0.4, 0.35]', hence θ_1^3 = 1 − θ_1^1 − θ_1^2 = 0.25,

a^1 = 0.038, a^2 = 0.030, L = 10,000,

D = {θ : θ^1, θ^2 ≥ 0.05, θ^1 + θ^2 ≤ 0.95},

D_p = {θ : θ^1, θ^2 ≥ 0.1, θ^1 + θ^2 ≤ 0.9}.

The values of a^1 and a^2 were chosen by trial and error. 100,000 customers were used in each simulation run.

Figure 4 shows plots of θ_n^1, θ_n^2, and θ_n^3 = 1 − θ_n^1 − θ_n^2 versus n for the simulated system operating with the on-line optimization algorithm. In Fig. 4, the solid curve is an average over 100 sample paths of the sequence {θ_n}, whereas the dashed curve is a single sample path.

It turns out that, for this example, we can obtain an analytical expression for J as follows. Since each queue is effectively an M/M/1 queue, we may write the distribution function of the steady-state waiting time for queue i as (see Ref. 31, p. 203)

W_i(y) = 1 − λ_i τ^i exp(−(1 − λ_i τ^i)y/τ^i).

The laxity distribution is given by

F_l(x) = 1 − exp(−lx).

Let y be a random variable with distribution W_i(y), and let x be a random variable with distribution F_l(x). Then, since the waiting time is independent

Fig. 4. θ_n^1, θ_n^2, θ_n^3 versus n (solid: average over 100 sample paths; dashed: a single sample path; the curves approach 0.1817, 0.2727, and 0.5456).


of the laxity, the steady-state probability of loss for queue i is

P_L^i(λ_i) = E(P{y ≥ x})

= ∫_0^∞ (1 − exp(−ly)) λ_i(1 − λ_i τ^i) exp(−(1 − λ_i τ^i)y/τ^i) dy.

Therefore, the objective function J can be written as

J(θ^1, ..., θ^K) = Σ_{i=1}^{K+1} θ^i P_L^i(λθ^i).

Solving for θ* with the above expression for J using numerical techniques yields

θ* = [0.1817, 0.2727, 0.5456]'.

The numerically obtained optimal point agrees with the simulation results.
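For reference, the following sketch reproduces this numerical computation (our own code; the closed form of the integral is a routine calculation not shown in the paper, and SciPy's SLSQP minimizer is an arbitrary choice).

    import numpy as np
    from scipy.optimize import minimize

    LAM, L_RATE, TAU = 1.0, 10_000.0, [0.3, 0.2, 0.1]     # parameter values of this section

    def P_loss(lam_i, tau_i):
        # Loss probability of queue i: the integral in the text is a difference of two
        # exponential integrals, evaluated here in closed form (our own calculation):
        #   P_L^i = lam_i*(1-rho) * [ tau_i/(1-rho) - 1/(L_RATE + (1-rho)/tau_i) ]
        rho = lam_i * tau_i
        return lam_i * (1.0 - rho) * (tau_i / (1.0 - rho) - 1.0 / (L_RATE + (1.0 - rho) / tau_i))

    def J(theta12):
        theta = [theta12[0], theta12[1], 1.0 - theta12[0] - theta12[1]]
        return sum(th * P_loss(LAM * th, TAU[i]) for i, th in enumerate(theta))

    res = minimize(J, x0=[0.4, 0.35], method="SLSQP", bounds=[(0.05, 0.95)] * 2,
                   constraints=[{"type": "ineq", "fun": lambda t: 0.95 - t[0] - t[1]}])
    print(res.x, 1.0 - res.x.sum())    # approximately [0.1817, 0.2727] and 0.5456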

6. Conclusions

We have demonstrated the use of likelihood ratio derivative estimates for on-line optimization of a real-time system with load sharing. Under the stated assumptions, we have proved that the proposed algorithm converges a.s. Our analysis is potentially applicable to recursive multivariable optimization algorithms using LR estimates for a larger class of systems, and provides a rigorous framework for studying such algorithms.

In our analysis, we assumed Poisson arrivals and used a result of Reiman and Weiss (Ref. 9). With the use of more general LR derivative estimators [e.g., the generalized score function method (Ref. 10)], it may be possible to extend the analysis to include general arrival processes, as well as other forms of performance measures.

Several variations of the updating algorithm are possible. In particular, it is sufficient to update a single component of the parameter vector at a time (see, e.g., Ref. 32). Using such asynchronous update methods, it is not necessary to compute all the components of the gradient estimate at each update. Furthermore, asynchronous update methods can be implemented in a parallel or distributed fashion. On the other hand, we would expect an algorithm that updates a single component at a time to converge at a slower rate than one that updates all the components at the same time.

While the algorithm considered here can be proved to converge, it requires waiting one estimation cycle before updating, this being the result of using regenerative techniques for estimating steady-state quantities. The length of one estimation cycle of the algorithm may be large if the load is


high. Algorithms using IPA with update times that are not related to regeneration points have been studied, both empirically (e.g., Ref. 33) and analytically (e.g., Ref. 19). Such studies have not been conducted on their LR counterparts. 5 This seems to be an important issue, and deserves careful consideration.

7. Appendix: Proof of Lemma 4.4

Before we prove Lemma 4.4, we first introduce the following lemma.

Lemma 7.1. Let {ξ_n} be an i.i.d. sequence of random variables such that E(ξ_1) = 0 and E(ξ_1²) = σ² < ∞. Then, there exists an a.s. finite random variable M such that, if n > M(ω), then

|(1/n) Σ_{i=1}^n ξ_i(ω)| ≤ 2σ n^{−1/3}.

Proof. The result follows easily from the law of the iterated logarithm (see Ref. 34, p. 372). We leave the details of the proof to the reader.

[]
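For completeness, the step left to the reader can be sketched as follows (our own filling-in, under the stated assumptions). By the law of the iterated logarithm,

limsup_{n→∞} |Σ_{i=1}^n ξ_i| / (2n log log n)^{1/2} = σ, a.s.,

so, for almost every ω, there is a finite M(ω) such that, for all n > M(ω),

|(1/n) Σ_{i=1}^n ξ_i(ω)| ≤ 2σ (log log n / n)^{1/2} ≤ 2σ n^{−1/3},

where the last inequality holds for all n large enough, since (log log n / n)^{1/2} = o(n^{−1/3}).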

We may now proceed to prove Lemma 4.4.

Proof of Lemma 4.4. We first show that Condition (B1) is satisfied. Since b_n ≤ L/n (see the proof of Lemma 4.3), to show (B1) it will suffice to show that, a.s. for sufficiently large n,

||E_{ℱ_n}(ε_{n+1})|| ≤ C/n^q, where q > 0 and C is some positive constant. This will proceed in several steps.

Step 1. Write ε^i_{n+1} as a sum of two types of terms. Fix an i ∈ {1, ..., K}. Then,

ε^i_{n+1} = ĥ^i_{n+1} − ∂J(θ_n)/∂θ^i

= (1 − λ̂_n θ^i_n τ̂^i_n)Y^i_{n+1} + λ̂_n θ^i_n β^i_{n+1} − (1 − λ̂_n θ^{K+1}_n τ̂^{K+1}_n)Y^{K+1}_{n+1} − λ̂_n θ^{K+1}_n β^{K+1}_{n+1}
  − P^i_L − λθ^i_n dP^i_L/dλ_i + P^{K+1}_L + λθ^{K+1}_n dP^{K+1}_L/dλ_{K+1}

= ((1 − λ̂_n θ^i_n τ̂^i_n)Y^i_{n+1} − P^i_L) + θ^i_n(λ̂_n β^i_{n+1} − λ dP^i_L/dλ_i)
  − ((1 − λ̂_n θ^{K+1}_n τ̂^{K+1}_n)Y^{K+1}_{n+1} − P^{K+1}_L) − θ^{K+1}_n(λ̂_n β^{K+1}_{n+1} − λ dP^{K+1}_L/dλ_{K+1}), (17)

5 Since the writing of this paper, we have been informed by a reviewer of some recent work along these lines.


where, for simplicity, we have written P^j_L (omitting the argument) rather than P^j_L(λθ^j_n). The terms in (17) above are of two types:

Type T1: (1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L;

Type T2: θ^j_n(λ̂_n β^j_{n+1} − λ dP^j_L/dλ_j), where j = i or K + 1.

Step 2. Show that, for terms of Type T1,

|E_{ℱ_n}(T1)| ≤ O(n^{−1/3}). Consider first the terms of Type T1. Now,

(1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L

= (1 − λθ^j_n τ^j)Y^j_{n+1} − P^j_L + (λθ^j_n τ^j − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1}

= (1 − λθ^j_n τ^j)Y^j_{n+1} − P^j_L + θ^j_n(τ^j λ λ̂_n(1/λ̂_n − 1/λ) − λ̂_n(τ̂^j_n − τ^j))Y^j_{n+1}. (18)

The random variables in the nth estimation cycle depend on ℱ_n only through θ_n; i.e., we may write E_{θ_n} in place of E_{ℱ_n}. So, using (18) and noting that

P^j_L = (1 − λθ^j_n τ^j)E_{θ_n}(Y^j_{n+1}),

we get

E_{θ_n}((1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L) = θ^j_n(τ^j λ λ̂_n(1/λ̂_n − 1/λ) − λ̂_n(τ̂^j_n − τ^j))E_{θ_n}(Y^j_{n+1}).

But Y^j_{n+1} ≤ N^j_{n+1}, where N^j_{n+1} is the number of customers served in the estimation busy period. Hence,

E_{θ_n}(Y^j_{n+1}) ≤ E_{θ_n}(N^j_{n+1}).

Let X_j be a service time at queue j. Now, Condition (S1) implies that E(X_j) < ∞. Let

B^j_1 = sup_{θ∈D} E_θ(N^j(θ)),

where N^j(θ) is the number of customers in a busy period of queue j, with an arrival rate of λθ^j. Since, for all θ ∈ D,

λθ^j τ^j ≤ λu_j τ^j < 1,

then B^j_1 < ∞ (see Ref. 31, p. 213). So,

E_{θ_n}(N^j_{n+1}) ≤ B^j_1,

and thus we may write

|E_{θ_n}((1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L)| ≤ B^j_1(τ^j λ λ̂_n|1/λ̂_n − 1/λ| + λ̂_n|τ̂^j_n − τ^j|), since θ^j_n ≤ 1.


Since, by the strong law of large numbers, λ̂_n → λ a.s., the sequence {λ̂_n} is bounded a.s. Let

B_2 = sup_n λ̂_n and B^j_3 = B^j_1(1 + λτ^j)B_2.

Then,

|E_{θ_n}((1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L)| ≤ B^j_3(|1/λ̂_n − 1/λ| + |τ̂^j_n − τ^j|).

Now, if A is an interarrival time, then

Var(A) = 1/λ²,

since A is exponentially distributed with parameter λ. Let

σ_a = 1/λ.

Also, by (S1), if Xj is a service time at queue j, then

Var(X_j) < ∞.

Let

σ^j_s = (Var(X_j))^{1/2}.

By Lemma 7.1, there exist a.s. finite random variables M_a and M^j_s such that, for n > M_a,

|1/λ̂_n − 1/λ| ≤ 2σ_a n^{−1/3},

and, for n > M^j_s,

|τ̂^j_n − τ^j| ≤ 2σ^j_s n^{−1/3}.

Let

B^j_4 = 2B^j_3 max(σ_a, σ^j_s) and M^j = max(M_a, M^j_s).

Then, for all n > M^j,

|E_{θ_n}((1 − λ̂_n θ^j_n τ̂^j_n)Y^j_{n+1} − P^j_L)| ≤ B^j_4 n^{−1/3}. (19)

Step 3. Show that, for terms of Type T2,

|E_{ℱ_n}(T2)| ≤ O(n^{−1/3}).

Now, consider terms of Type T2. We may write

λ̂_n β^j_{n+1} − λ dP^j_L/dλ_j = −λ̂_n λ(1/λ̂_n − 1/λ)β^j_{n+1} + λ(β^j_{n+1} − dP^j_L/dλ_j).

44 JOTA: VOL. 82, NO. 1, JULY 1994

Taking conditional expectations of both sides with respect to ℱ_n, we get

λ̂_n E_{θ_n}(β^j_{n+1}) − λ dP^j_L/dλ_j = −λ̂_n λ(1/λ̂_n − 1/λ)E_{θ_n}(β^j_{n+1}) + λ(E_{θ_n}(β^j_{n+1}) − dP^j_L/dλ_j). (20)

Now, by Proposition 3.1,

E_{θ_n}(β^j_{n+1}) = dP^j_L/dλ_j,

and hence the second term on the right side of (20) is zero. Let

B^j_5 = sup_{θ∈D} |dP^j_L/dλ_j|.

Since dP^j_L/dλ_j is continuous and D is compact, B^j_5 < ∞. So, we may write

|λ̂_n E_{θ_n}(β^j_{n+1}) − λ dP^j_L/dλ_j| ≤ B^j_5 λ̂_n λ|1/λ̂_n − 1/λ|.

Let

B^j_6 = 2B^j_5 B_2 λ σ_a.

Thus, using Step 2 and noting that θ^j_n ≤ 1, we have that, for n > M^j,

θ^j_n|λ̂_n E_{θ_n}(β^j_{n+1}) − λ dP^j_L/dλ_j| ≤ B^j_6 n^{−1/3}. (21)

Step 4. Show that

||E_{ℱ_n}(ε_{n+1})|| ≤ O(n^{−1/3}).

From (19) and (21), for

n > max(M^i, M^{K+1}),

we may write

|E_{ℱ_n}(ε^i_{n+1})| ≤ (B^i_4 + B^i_6 + B^{K+1}_4 + B^{K+1}_6) n^{−1/3}.

Let

B = max_{i∈{1,...,K}} {B^i_4 + B^i_6 + B^{K+1}_4 + B^{K+1}_6},

M = max_{j∈{1,...,K+1}} M^j.

Then, combining the results of Steps 2 and 3, we have that, for n > M,

||E_{ℱ_n}(ε_{n+1})|| ≤ B√K n^{−1/3},

which shows that Condition (B1) holds.


We now show that Condition (B2) holds. To this end, fix i ∈ {1, ..., K}. We first show that E_{ℱ_n}((ε^i_{n+1})²) is bounded uniformly in n a.s. Consider Eq. (17). Since λ̂_n and τ̂^i_n converge a.s., they are bounded uniformly a.s. Also, for j = i, K + 1, P^j_L and θ^j are bounded uniformly by 1, and dP^j_L/dλ_j is bounded uniformly since it is continuous and D is compact. Therefore, we may bound ε^i_{n+1} as follows:

|ε^i_{n+1}| ≤ B^i_7(Y^i_{n+1} + N^i_{n+1}Y^i_{n+1} + T^i_{n+1}Y^i_{n+1} + Y^{K+1}_{n+1} + N^{K+1}_{n+1}Y^{K+1}_{n+1} + T^{K+1}_{n+1}Y^{K+1}_{n+1}) + B^i_8,

where B^i_7, B^i_8 < ∞ a.s. Since

Y^j_{n+1} ≤ N^j_{n+1},

then we may write

|ε^i_{n+1}| ≤ B^i_7(2(N^i_{n+1})² + T^i_{n+1}N^i_{n+1} + 2(N^{K+1}_{n+1})² + T^{K+1}_{n+1}N^{K+1}_{n+1}) + B^i_8.

Squaring, taking conditional expectations with respect to ℱ_n, and using the inequality

x ≤ 1 + x², for x ≥ 0,

we see that, to show that E_{ℱ_n}((ε^i_{n+1})²) is uniformly bounded, it suffices to show that terms of the following form are uniformly bounded:

E_{θ_n}((N^j_{n+1})⁴), E_{θ_n}((N^j_{n+1})³T^j_{n+1}), E_{θ_n}((N^j_{n+1})²(T^j_{n+1})²).

But, by the Schwarz inequality,

E_{θ_n}((N^j_{n+1})³T^j_{n+1}) ≤ (E_{θ_n}((N^j_{n+1})⁶))^{1/2} (E_{θ_n}((T^j_{n+1})²))^{1/2},

E_{θ_n}((N^j_{n+1})²(T^j_{n+1})²) ≤ (E_{θ_n}((N^j_{n+1})⁴))^{1/2} (E_{θ_n}((T^j_{n+1})⁴))^{1/2}.

Hence, it suffices to show that E_{θ_n}((N^j_{n+1})⁶) and E_{θ_n}((T^j_{n+1})⁴) are uniformly bounded. Now, by Assumption (S1), if X_j is a service time at queue j, then

E(X_j) < ∞.

Let

B_9 = sup_{θ∈D} E_θ((N^j)⁶) and B_10 = sup_{θ∈D} E_θ((T^j)⁴).

Since, for all θ ∈ D,

λθ^j τ^j ≤ λu_j τ^j < 1,

then B_9, B_10 < ∞ (see Ref. 31, p. 213). So, E_{θ_n}((N^j_{n+1})⁶) and E_{θ_n}((T^j_{n+1})⁴) are uniformly bounded by B_9 and B_10, respectively. Therefore, we may write

E_{ℱ_n}((ε^i_{n+1})²) ≤ B^i_11 < ∞, a.s.


Hence,

E_{ℱ_n}(||ε_{n+1}||²) ≤ σ², a.s.,

where

σ² = Σ_{i=1}^K B^i_11 < ∞, a.s.,

which shows that Condition (B2) is satisfied. This completes the proof of Lemma 4.4. []

References

1. TANTAWI, A. N., and TOWSLEY, D., Optimal Static Load Balancing in Distributed Computer Systems, Journal of the Association for Computing Machinery, Vol. 32, No. 2, pp. 445-465, 1985.

2. KUROSE, J. F., and SINGH, S., A Distributed Algorithm for Optimum Static Load Balancing in Distributed Computer Systems, Proceedings of the IEEE 1986 INFOCOM Conference, Miami, Florida, pp. 458-467, 1986.

3. KUROSE, J. F., and CHIPALKATTI, R., Load Sharing in Soft Real-Time Distributed Computer Systems, IEEE Transactions on Computers, Vol. 36, No. 8, pp. 993-1000, 1987.

4. CASSANDRAS, C. G., and LEE, J. I., Application of Perturbation Techniques to Optimal Load Sharing in Discrete Event Systems, Proceedings of the 1988 American Control Conference, Atlanta, Georgia, pp. 450-455, 1988.

5. CASSANDRAS, C. G., and LEE, J. I., Discrete Event Systems with Real-Time Constraints: A Distributed Algorithm for Optimal Load Sharing, Proceedings of the 27th Conference on Decision and Control, Austin, Texas, pp. 1508-1513, 1988.

6. CASSANDRAS, C. G., ABIDI, M. V., and TOWSLEY, D., Distributed Routing with On-Line Marginal Delay Estimation, IEEE Transactions on Communications, Vol. 38, No. 3, pp. 348-359, 1990.

7. GLYNN, P. W., Likelihood Ratio Gradient Estimation: An Overview, Proceedings of the 1987 Winter Simulation Conference, Atlanta, Georgia, pp. 366-375, 1987.

8. RUBINSTEIN, R. Y., Sensitivity Analysis and Performance Extrapolation for Computer Simulation Models, Operations Research, Vol. 37, No. 1, pp. 72-81, 1989.

9. REIMAN, M. I., and WEISS, A., Sensitivity Analysis for Simulations via Likelihood Ratios, Operations Research, Vol. 37, No. 5, pp. 830-844, 1989.

10. RUBINSTEIN, R. Y., and SHAPIRO, A., Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization via the Score Function Method, John Wiley and Sons, Chichester, New York, 1993.

11. RUBINSTEIN, R. Y., Monte Carlo Optimization, Simulation, and Sensitivity of Queueing Networks, John Wiley and Sons, New York, New York, 1986.


12. GLYNN, P. W., Stochastic Approximation for Monte Carlo Optimization, Proceedings of the 1986 Winter Simulation Conference, Washington, DC, pp. 356-365, 1986.

13. L'ECUYER, P., GIROUX, N., and GLYNN, P., Stochastic Optimization by Simulation: Convergence Proofs and Experimental Results for the GI/G/1 Queue in Steady State, Management Science (to appear).

14. Ho, Y. C., and CAO, X. R., Perturbation Analysis of Discrete Event Dynamic Systems, Kluwer Academic Publishers, Norwell, Massachusetts, 1991.

15. GLASSERMAN, P., Gradient Estimation via Perturbation Analysis, Kluwer Academic Publishers, Norwell, Massachusetts, 1991.

16. WARDI, Y., Simulation-Based, Distributed Algorithm for Optimization of Queueing Models Arising in Computer Communication Networks, Preprint, School of Electrical Engineering, Georgia Institute of Technology, 1989.

17. FU, M. C., Convergence of a Stochastic Approximation Algorithm for the GI/G/1 Queue Using Infinitesimal Perturbation Analysis, Journal of Optimization Theory and Applications, Vol. 65, No. 1, pp. 149-160, 1990.

18. CHONG, E. K. P., and RAMADGE, P., Convergence of Recursive Optimization Algorithms Using Infinitesimal Perturbation Analysis Estimates, Discrete Event Dynamic Systems: Theory and Applications, Vol. 1, pp. 339-372, 1992.

19. CHONG, E. K. P., and RAMADGE, P., Optimization of Queues Using an Infinitesimal Perturbation Analysis-Based Stochastic Algorithm with General Update Times, SIAM Journal on Control and Optimization, Vol. 31, No. 3, pp. 698-732, 1993.

20. ROBBINS, H., and MONRO, S., A Stochastic Approximation Method, Annals of Mathematical Statistics, Vol. 22, No. 3, pp. 400-407, 1951.

21. KUSHNER, H. J., and CLARK, D. S., Stochastic Approximation Methods for Constrained and Unconstrained Systems, Springer Verlag, New York, New York, 1978.

22. METIVIER, M., and PRIOURET, P., Applications of a Kushner and Clark Lemma to General Classes of Stochastic Algorithms, IEEE Transactions on Information Theory, Vol. 30, No. 2, pp. 140-151, 1984.

23. POLYAK, B. T., New Method of Stochastic Approximation Type, Automation and Remote Control, Vol. 51, No. 7, pp. 937-946, 1991.

24. LJUNG, L., PFLUG, G., and WALK, H., Stochastic Approximation and Optimization of Random Systems, Birkhauser Verlag, Basel, Switzerland, 1992.

25. L'ECUYER, P., A Unified View of IPA, SF, and LR Gradient Estimation Techniques, Management Science, Vol. 36, No. 11, pp. 1364-1383, 1990.

26. ASMUSSEN, S., and RUBINSTEIN, R. Y., The Efficiency and Heavy Traffic Properties of the Score Function Method for Sensitivity Analysis of Queueing Models, Advances in Applied Probability, Vol. 24, No. 1, pp. 172-201, 1992.

27. SHEDLER, G. S., Regeneration and Networks of Queues, Springer Verlag, New York, New York, 1987.

28. KESTEN, H., Accelerated Stochastic Approximation, Annals of Mathematical Statistics, Vol. 29, No. 1, pp. 41-59, 1958.


29. LJUNG, L., Analysis of Recursive Stochastic Algorithms, IEEE Transactions on Automatic Control, Vol. 22, No. 4, pp. 551-575, 1977.

30. FLEMING, W., Functions of Several Variables, Springer Verlag, New York, New York, 1977.

31. KLEINROCK, L., Queueing Systems, Vol. 1: Theory, John Wiley, New York, New York, 1975.

32. BERTSEKAS, D. P., and TSITSIKLIS, J. N., Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, Englewood Cliffs, New Jersey, 1989.

33. SURI, R., and LEUNG, Y. T., Single-Run Optimization of Discrete Event Simulations: An Empirical Study Using the M/M/1 Queue, IIE Transactions, Vol. 21, No. 1, pp. 35-49, 1989.

34. SHIRYAYEV, A. N., Probability, Springer Verlag, New York, New York, 1984.