Design of a web site for guaranteed delay and blocking probability bounds


Decision Support Systems 38 (2004) 131–140


Indranil Bose a,*, Kemal Altinkemer b,1

a Department of Decision and Information Sciences, Warrington College of Business Administration, University of Florida, 351 Stuzin Hall, PO Box 117169, Gainesville, FL 32611, USA

b Krannert Graduate School of Management, Purdue University, West Lafayette, IN 47907, USA

Received 1 February 2002; accepted 1 July 2002

Available online 28 June 2003

Abstract

A new mathematical programming model is proposed for minimizing the cost of design of a Web site by optimally

determining the number of servers and buffers when a performance guarantee in terms of the average waiting time and loss

probability is to be provided to users. The Web site is modeled as an M/G/c/N queuing system where requests for connections

represent arriving customers and the browsing of Web sites represents service received by customers. Numerical experiments

are conducted with different choices of problem parameters and the optimal design cost, and optimal number of servers and

buffers are obtained for these cases.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Finite capacity; Loss probability; Queuing model; Waiting time; Web server

1. Introduction

The World-Wide-Web has helped in the sharing of information between Internet users throughout the world. It has become synonymous with a mega warehouse of information. From Ref. [4] we know that, though it was initially started as a project for enabling easy exchange of information between researchers who were geographically distant from each other, it has now taken the role of an international information superhighway. The Web has been used for different purposes such as providing useful context-sensitive information, allowing exchange of information within and between organizations, and lately, for advertising, selling and buying merchandise, which has been referred to as electronic commerce, as in Ref. [3]. Whatever the use, there is no denying that the Web has already become a part of everyday life in a large part of the world.

doi:10.1016/S0167-9236(03)00080-0

* Corresponding author. Tel.: +1-352-392-0648; fax: +1-352-392-5438.

1 Tel.: +1-765-494-9009; fax: +1-765-494-1526.

With the advent of user friendly browsers, the Web

has become a technology easy to use and understand.

However, in spite of the apparent simplicity of use,

users have reported several problems in the use of the

Web. Among the top three problems reported in Ref.

[14] are searching for specific information, speed of

data access and locating and navigating sites. More

recently, Selvidge et al. [15] studied the variable impact


of Web delays on user lostness, frustration and propor-

tion of task completion. Current technology does not

allow users to get an estimate of the amount of time

they would have to wait when downloading a docu-

ment but provides a real-time measure of the amount of

content that has been downloaded. Sometimes users

wait a significant amount of time and are then refused

connection. This is quite frustrating for the users and

may lead to reluctance to visit a specific site again. Hence, for Web site designers it is important to design the Web site so that neither the waiting time for a requested connection nor the chance that the user is refused connection is too high.

2. Motivation

From Ref. [5], we know that the basic mechanism

of operation of the Web is the same as that of a client

server system. The three main components are a client

site that requests specific information, a Web site with

servers and buffers and a network connection that

allows communication between the client and the site.

The underlying network may be a corporate intranet

or it can be the Internet itself. Fig. 1 gives a schematic

representation of the overall configuration of a Web

server. In case of the client server model based on the Internet the problem of communication is extremely complex as it involves a large variety of client sites that request information from the server site. The delay experienced by a user when requesting information is a function of the client, the server and the network. In most cases, the site provider has no control over the network or the client sites.

Fig. 1. Client–server model of the Web.

In most cases, the configuration of the Web site is done on an ad-hoc basis. The final configuration of the site usually depends on the objectives of the site provider. Some goals might be minimizing the cost of server and buffer installation and operation, minimizing the number of lost requests for access to the site or improving the response time experienced by the users when downloading information from the site. Some of the major issues that need to be considered in order to meet these objectives are the number and type of requests for service, the service time for a request, and the load on the servers. At the same time, the Web site administrator needs to have knowledge about the minimum level of service guarantee to be provided to the users.

We propose a new mathematical programming formulation of the problem of optimally designing a Web site by deciding on the optimal number of servers and buffers to be installed at the site. The goal of the design is to minimize the cost of installation of servers and buffers such that certain service related performance bounds are satisfied. The contribution of this research is to show the applicability of queuing theory in modeling a Web site and using the queuing model in a mathematical programming framework for efficient design of a Web site.

The organization of the paper is as follows. In

Section 3, we provide a brief literature review on the

use of finite capacity queuing models for some

applications and list some research that uses queuing

theory for modeling performance of Web servers. In

Section 4, the optimization model for the design of the

Web server is described. Section 5 provides a deriva-

tion of the approximate expression for average wait-

ing time of customers in case of an M/G/c/N queuing

system, which is required to solve the optimization

problem developed in Section 4. Section 6 details the

numerical experiments conducted to obtain the opti-

mal cost and configuration of Web sites for different

service distributions and expected arrival rates and

service times. The conclusion and directions for future

research appear in Section 7.

3. Literature review

In the area of performance evaluation and charac-

terization of Web servers, simulation and statistical

analyses have been used predominantly. Using vari-

ous statistical measures, Arlitt and Williamson [2]

obtained document size distribution, document type

distribution, document referencing behavior and geo-

graphic distribution of requests using six different data

sets from various educational and commercial site

providers. Iyengar et al. [10] used simulation to

develop different tradeoffs between delay experienced

by the users and the percentage of requests that are

lost, under conditions of heavy traffic, from the Web

servers. A benchmarking method based on WebStone,

for understanding the performance metrics of a Web

server is discussed in Almeida et al. [1]. The use of

queuing theory for understanding the performance of

Web servers is reported in Slothouber [16]. In this

high level model that ignores the details of the HTTP

protocol, Web servers are modeled as an open queuing

network and the effect of various parameters such as

file size, server speed and network bandwidth on the

server response time is studied using analytical pro-

cedures. In an attempt to model the low level details

and interactions between the HTTP and the TCP/IP

protocols, Hariharan et al. [9] model Web servers as a

tandem queuing model consisting of three interacting

components and study the dependence and interaction

between these sub-components using simulation.

There is some similarity between this research and

that of Fischer et al. [8] where M/G/c/c queues are

used to model call arrivals at Automatic Call Distri-

bution centers and an alternate expression for calcu-

lating loss probabilities is suggested that requires

significantly less computation time. Whitt [17] dis-

cussed an interesting application of a telephone call

center where customer satisfaction was improved by

informing the customers about anticipated delays

before joining the M/M/c/N queuing system. Lu et

al. [13] have addressed the problem of management of

delays for different service classes on a Web server.

They used a feedback control theory-based approach

for designing the adaptive architecture for Web serv-

ers operating under HTTP 1.1, which could provide

relative delay guarantees for different service classes.

The above review shows that queuing theory has

proved to be a useful technique for analyzing Web

sites. Our paper takes a unique approach by embed-

ding a queuing theory-based model of a Web site in a

new mathematical programming-based formulation

for solving the problem of optimal configuration of

a Web site. Using the known results on approximate

analysis of M/G/c/N queues, we are able to determine

the optimal number of Web servers and buffers that a

designer should install at a Web site at a minimum

cost while providing a guaranteed level of service.

4. Model

In this research, a Web site is represented as a

queuing system, where the requests for connections

represent arriving customers and time spent browsing a

particular Web site is defined as service. This is a finite

capacity queuing model since the Web site can handle

only a limited number of requests. We can model the

requests for connections approximately as a Poisson

process with exponential inter-arrival time distribution.

The users request a connection and after getting a

connection, spend time browsing pages within a Web

site. Once their purpose is served they quit the system

(which may be defined as end of service). Different

users spend different amounts of time on the Web sites

browsing HTML pages and embedded multimedia files

of various sizes. Hence, the service can be assumed to


follow a ‘general’ distribution. Since each browsing

activity involves browsing a number of Web pages and

the time spent browsing each page is random, the total

time spent in browsing all the Web pages during a

single visit can be assumed to follow a general distri-

bution as well (since the sum of general distributions is

a general distribution). Every Web site has a fixed

number of servers and also has a limit on the number

of connections that it can store in its buffer for future

service. Once all servers become busy, all subsequent

requests for connections are buffered in the TCP/IP

listen queue and they wait for the server to be free. If the

waiting spaces are all occupied, then the incoming

requests for connection are refused. This is known as

blocking of the Web site. The queuing model is an M/

G/c/N queuing system where customers are lost from

the system once all servers as well as the waiting spaces

in the system become busy. The queuing model of a

Web site is depicted in Fig. 2.

Fig. 2. A queuing representation of a Web site.

The goal of the Web site designer is to minimize the total cost of installation of the Web site and to provide a desired performance guarantee to the users. The performance is guaranteed in terms of the average waiting time of a connection and the blocking probability of the connection.

Notation:

c = Number of available Web servers on a site

N = Number of buffers

Z = Maximum allowable average waiting time of a connection specified by the designer (s)

X = Maximum allowable blocking probability of a connection specified by the designer

a = Cost of a server (US$)

b = Cost of a buffer (US$)

W = Average time spent to fulfill a request for connection including waiting and connection (s)

D = Blocking probability of a connection.

Problem P: Minimize ac + bN

Subject to:

\[ W \le Z \tag{1} \]

\[ D \le X \tag{2} \]

\[ c \ge 0 \tag{3} \]

\[ N \ge 0 \tag{4} \]

c, N are integers.

In order to solve this problem, the designer has to obtain closed form expressions for W and D. To the best of our knowledge, no exact closed-form expressions are available for the average waiting time and blocking probability of customers in case of an M/G/c/N queue. In the next section, we provide a derivation for approximate closed-form analytical expressions for W and D.

5. Average waiting time and blocking probability

In this section, we derive an approximate expression for the waiting time of customers and the blocking probability in case of an M/G/c/N queue with a single class of service and under heavy traffic condition. This is required for solving Problem P described in the earlier section. Multiserver queues with ‘general’ service are difficult to analyze. Although closed form expressions are available for the M/M/c/N queues, the ‘general’ service distribution for an M/G/c/N queue makes it difficult to obtain exact analytical results.

Additional notation:

λ = Arrival rate of incoming requests for connection to a Web site (requests/s)

s = Service time for each incoming connection (s)

E(s) = Average time spent by a connection at a Web site for browsing activity (s)

P_j = Probability that there are j servers busy at a Web site where j = 1, 2, ..., c

L_q = Average number of requests for connection waiting to be serviced in the buffers

W_q = Average waiting time experienced by a request for a connection (s)

ρ = λE(s).

According to Kimura [11], the steady-state probability that j servers among c servers remain busy at any time in case of an M/G/c/N queue is given by:

\[
P_j =
\begin{cases}
\dfrac{(c\rho)^{j}}{j!}\, P_0, & j = 1, \ldots, c-1 \\[2ex]
\dfrac{(c\rho)^{c}}{c!}\, \dfrac{1-\xi}{1-\rho}\, \xi^{\,j-c}\, P_0, & j = c, c+1, \ldots, c+N-1 \\[2ex]
\dfrac{(c\rho)^{c}}{c!}\, \xi^{N} P_0, & j = c+N
\end{cases}
\tag{5}
\]

where

\[
P_0 = \left\{ \sum_{k=0}^{c-1} \frac{(c\rho)^{k}}{k!} + \frac{(c\rho)^{c}}{c!}\, \frac{1-\rho\,\xi^{N}}{1-\rho} \right\}^{-1}
\tag{6}
\]

\[
\xi = \frac{\lambda E(s) R_G}{c - \{\lambda E(s) + \lambda E(s) R_G\}}
\tag{7}
\]

\[
R_G = \frac{\text{Expected waiting time for general service distribution}}{\text{Expected waiting time for exponential service}}
\tag{8}
\]

Exact expressions for R_G are difficult to obtain. However, an asymptotic result on R_G under the condition of heavy traffic (i.e., ρ → 1) is available from Kimura [11]. It states that

\[
\lim_{\rho \to 1} R_G = \frac{1 + c_v^{2}}{2}
\tag{9}
\]

where c_v is the coefficient of variation of the underlying service distribution. It is stated in Kimura [11] that ‘‘the approximation is exact for the cases with either no extra waiting space, exponential service-time distribution, or a certain two-parameter family of service-time distribution’’. Hence, the approximation is valid in our case with the only limiting condition that there is heavy traffic in the system and ρ → 1.
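As a quick numerical illustration of Eq. (9), the two coefficients of variation used later in the numerical experiments (0.5 and 1.0) give the heavy-traffic values below. The arithmetic is a worked example added for orientation, not a result quoted from Kimura [11].

```latex
\[
c_v = 0.5:\quad R_G \approx \frac{1 + 0.5^{2}}{2} = 0.625,
\qquad
c_v = 1.0:\quad R_G \approx \frac{1 + 1.0^{2}}{2} = 1 .
\]
```

By the definition in Eq. (8), a value of 1 means the expected wait matches the exponential-service case, while a value below 1 indicates a shorter expected wait than with exponential service.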

The values of P_j from Eqs. (5) and (6) can be used together with the Poisson Arrivals See Time Averages (PASTA) property to calculate the fraction of requests for connection that are lost (D) because the servers and the buffers remain busy. This is the same as the blocking probability of a connection. We find,

\[ D = P_{c+N} \tag{10} \]

The expression for D is used in constraint (2) of the

optimization Problem P described in Section 4. The

information conveyed by this formula is important if

the Web site provider needs to keep track of the

number of connections that are lost. Every site will

possibly have a known percentage of ‘lost customers’

that they can tolerate due to unavailability of buffers.

Once D exceeds that value, this might give a signal

that the site is getting ‘too busy’.

Another figure of merit is the average number of requests for connection waiting to be serviced in the buffers. Using simple algebra, the expression for L_q is given by

\[ L_q = \sum_{n=c+1}^{c+N} (n - c)\, P_n \tag{11} \]

Using Little’s Law, with λ denoting the arrival rate of requests, the average waiting time of the request for connection is given by

\[ W_q = \frac{L_q}{\lambda\,(1 - P_{c+N})} \tag{12} \]

The total time spent by the request waiting in the buffer as well as completing the service of downloading the required Web pages is then easily obtained by adding the average queuing time to the average service time. In other words,

\[ W = W_q + E(s) \tag{13} \]

This expression for W is used in constraint (1) for solving the optimization Problem P detailed in Section 4.
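The chain of formulas in this section, from the state probabilities through D and W, can be sketched in Python as follows. This is a minimal illustration and not the authors' program: the function name `kimura_mgcn` and its arguments are hypothetical, and the heavy-traffic value of R_G from Eq. (9) is used at every load. The sketch also makes two assumptions that differ from the printed text. It takes the utilization per server to be (arrival rate × E(s))/c, and it writes the parameter of Eq. (7) as ρR_G/(1 − ρ + ρR_G). That form is chosen because it reduces to ρ when R_G = 1, which recovers the exact M/M/c/N result.

```python
from math import factorial


def kimura_mgcn(lam, mean_service, cv, c, N):
    """Approximate M/G/c/N metrics following Eqs. (5)-(13).

    lam: arrival rate (requests/s); mean_service: E(s) in seconds;
    cv: coefficient of variation of service time; c: servers; N: buffers.
    Assumption of this sketch: rho is the per-server utilization and
    xi = rho*R_G / (1 - rho + rho*R_G), so xi == rho when R_G == 1.
    """
    rho = lam * mean_service / c
    r_g = (1.0 + cv ** 2) / 2.0            # heavy-traffic value, Eq. (9)
    denom = 1.0 - rho + rho * r_g
    if rho == 1.0 or denom <= 0.0:
        raise ValueError("formula not usable for this load")
    xi = rho * r_g / denom
    load = c * rho                          # offered load, the (c*rho) term
    tail = load ** c / factorial(c)

    # Unnormalized state weights for j = 0 .. c+N, as in Eq. (5).
    w = [load ** j / factorial(j) for j in range(c)]
    w += [tail * (1 - xi) / (1 - rho) * xi ** (j - c) for j in range(c, c + N)]
    w.append(tail * xi ** N)

    total = sum(w)                          # equals 1/P0 from Eq. (6)
    p = [x / total for x in w]
    D = p[c + N]                                                # Eq. (10)
    Lq = sum((n - c) * p[n] for n in range(c + 1, c + N + 1))   # Eq. (11)
    Wq = Lq / (lam * (1 - D))                                   # Eq. (12)
    W = Wq + mean_service                                       # Eq. (13)
    return p, D, Lq, Wq, W


# Illustrative values only; they are not taken from the experiments.
p, D, Lq, Wq, W = kimura_mgcn(lam=1.5, mean_service=1.0, cv=0.5, c=2, N=3)
print(f"blocking probability D = {D:.4f}, average time W = {W:.4f} s")
```

Normalizing the weights by their sum avoids coding Eq. (6) separately, since the sum of the unnormalized terms is exactly the bracketed expression in that equation.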

6. Numerical experiments

The goal of the numerical experiments reported in

this section is to obtain the solution to the nonlinear

optimization Problem P in terms of the total cost of

installation of servers and buffers, optimal number of

servers and optimal number of buffers. The designer

provides known bounds for the blocking probability

and average waiting time of a connection. We experiment with two different distributions, Erlang-2 (with coefficient of variation 0.5) and hyperexponential (with coefficient of variation 1.0), to represent the ‘general’ service distribution of the model, and with two pairs of values for E(s) and the arrival rate λ. Physically, the Web

server is a commercially available computer server

that can host a Web site and hence we assume the unit

cost of server (a) to be US$3000. The buffer is

equivalent to a hard disk that can be used for storage

of requests and the unit cost of buffer (b) is assumed

to be US$100.

Since Problem P is a nonlinear optimization prob-

lem it could not be solved using any commercially

available software. We used an indirect approach for

solving this problem. Given a known delay bound and

a blocking probability bound, we first completely enumerated the combinations of c and N values (integers) that satisfy constraints (1) and (2). Next, we chose the pair of values that resulted in the minimum value of the objective function as the optimal solution.

The program for obtaining the solution was coded

using MATLAB 5.3 and was run on a Pentium III 650

MHz personal computer.
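The complete-enumeration approach described above can be sketched as follows. This is written in Python rather than MATLAB and is not the authors' program. The names `approx_D_W` and `cheapest_design` and the search limits `c_max` and `N_max` are hypothetical. The helper uses the same assumed form of the approximation as a per-server utilization model (ρ = arrival rate × E(s)/c, with the Eq. (7) parameter taken as ρR_G/(1 − ρ + ρR_G)), and it skips any (c, N) pair for which that form is not usable.

```python
from math import factorial


def approx_D_W(lam, mean_service, cv, c, N):
    """Return (D, W) from the approximation, or None if it is not usable."""
    rho = lam * mean_service / c            # assumed per-server utilization
    r_g = (1.0 + cv ** 2) / 2.0             # heavy-traffic R_G, Eq. (9)
    denom = 1.0 - rho + rho * r_g
    if rho == 1.0 or denom <= 0.0:
        return None
    xi = rho * r_g / denom
    load = c * rho
    tail = load ** c / factorial(c)
    w = [load ** j / factorial(j) for j in range(c)]
    w += [tail * (1 - xi) / (1 - rho) * xi ** (j - c) for j in range(c, c + N)]
    w.append(tail * xi ** N)
    total = sum(w)
    D = w[-1] / total
    Lq = sum((n - c) * w[n] for n in range(c + 1, c + N + 1)) / total
    return D, Lq / (lam * (1 - D)) + mean_service


def cheapest_design(lam, mean_service, cv, max_wait, max_block,
                    cost_server=3000, cost_buffer=100, c_max=30, N_max=60):
    """Enumerate integer (c, N) pairs and keep the cheapest feasible one.

    Returns (cost, c, N), or None when no pair within the search limits
    satisfies both the delay bound and the blocking bound.
    """
    best = None
    for c in range(1, c_max + 1):           # at least one server
        for N in range(0, N_max + 1):
            metrics = approx_D_W(lam, mean_service, cv, c, N)
            if metrics is None:
                continue
            D, W = metrics
            if W <= max_wait and D <= max_block:      # constraints (1), (2)
                cost = cost_server * c + cost_buffer * N
                if best is None or cost < best[0]:
                    best = (cost, c, N)
    return best


# Illustrative inputs only; they are not the values used in the experiments.
print(cheapest_design(lam=2.0, mean_service=1.5, cv=0.5,
                      max_wait=2.0, max_block=0.05))
```

Exhaustive search is practical here because the decision space is a small integer grid, so every feasible pair can be costed directly without a nonlinear solver.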

An important step in conducting the numerical experiments is to estimate the parameters λ (arrival rate) and E(s) (mean service time). Estimates of these parameters can be obtained from the log files associated with a Web site. As noted by Eschenfelder et al. [7], the estimates can be obtained from the access log file that lists the IP address of the user, date and time of the access and user action taken during the access period including the timestamp of the last activity of the user on the Web site. For our numerical experiments, we studied the log files of several Web sites to obtain realistic values for the parameters λ and E(s).
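One way such estimates might be computed is sketched below. The record layout is a hypothetical simplification, not an actual server log format: each session is assumed to have already been reduced to an IP address, the time of first access and the time of last activity, with timestamps in ISO 8601 form. The function name `estimate_parameters` and the `window_seconds` argument are likewise assumptions of the sketch.

```python
from datetime import datetime


def estimate_parameters(sessions, window_seconds):
    """Estimate the arrival rate and mean service time from session records.

    sessions: iterable of (ip, first_access, last_activity) tuples with
    ISO 8601 timestamp strings (a hypothetical, pre-processed layout).
    window_seconds: length of the observation window in seconds.
    """
    sessions = list(sessions)
    if not sessions or window_seconds <= 0:
        raise ValueError("need at least one session and a positive window")
    durations = [
        (datetime.fromisoformat(last) - datetime.fromisoformat(first)).total_seconds()
        for _ip, first, last in sessions
    ]
    arrival_rate = len(sessions) / window_seconds     # requests per second
    mean_service = sum(durations) / len(durations)    # seconds per visit
    return arrival_rate, mean_service


# Made-up records for illustration only.
records = [
    ("10.0.0.1", "2002-01-01T00:00:00", "2002-01-01T00:02:00"),
    ("10.0.0.2", "2002-01-01T00:00:20", "2002-01-01T00:04:20"),
    ("10.0.0.3", "2002-01-01T00:01:00", "2002-01-01T00:04:00"),
]
print(estimate_parameters(records, window_seconds=60))
```

Treating the interval between first access and last activity as the service time follows the model's definition of service as time spent browsing the site.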

In the first experiment, we assume a heavy traffic load on the server and let the arrival rate λ = 170 requests/s and E(s) = 175 s. The service distribution is Erlang-2.

The blocking probability is varied from 0.01 to 0.1

and the average waiting time is varied from 300 to

1000 s. No feasible solution to the optimization

problem can be found if the blocking probability is

less than 0.01 and the average waiting time is less than

300 s. The most expensive scenario for design

involves 23 servers and 40 buffers, with a total design

cost of US$73000. The least cost of design, i.e.,

US$3500, is obtained when the blocking probability

is 0.1 and the average waiting time varies between

600 and 1000 s. The results of this experiment are

shown in Table 1.
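The two extreme costs quoted above can be checked against the objective of Problem P using the assumed unit costs a = US$3000 and b = US$100. The second calculation takes the least-cost design to be 1 server and 5 buffers, as listed in Table 1, since the text gives only its cost.

```latex
\[
ac + bN = 3000 \times 23 + 100 \times 40 = 73000,
\qquad
ac + bN = 3000 \times 1 + 100 \times 5 = 3500 .
\]
```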

In the second experiment, all parameter values are

kept unchanged except the service distribution is

chosen to be hyperexponential. It is to be noted that

the specific distribution function is not used in calcu-

lation of the problem parameters. The coefficient of

variation of the service distribution is used for com-

putation of D and W. In Table 2, no optimal solution

can be obtained for a blocking probability value of

0.01. This is so because there is no feasible solution

available that satisfies a loss probability bound of 0.01

as well as a delay bound between 300 and 1000 s.

This goes to show that a loss probability bound of

0.01 (i.e., only 1% of the requests are rejected) is

extremely stringent and generally cannot be provided

by a Web server while providing a delay bound within

a tolerable limit. It is also observed from the second

experiment that the optimal design turns out to be

more expensive in the case of the hyperexponential

service distribution than that of the Erlang-2 distribu-

tion.

The third set of experiments is conducted for λ = 340 requests/s and E(s) = 350 s, for both the Erlang-2 and hyperexponential distributions, and the results are reported in Tables 3 and 4, respectively. From Tables 3 and 4, we see that when the arrival rate is doubled and the service rate is halved it becomes increasingly difficult to obtain an optimal solution to the optimization problem for stricter delay bounds. This is the reason why our solution method is not able to find any solution for an average delay bound < 600 s. Also, as observed in the paragraph above, we again note that the optimal design turns out to be more expensive in the case of the hyperexponential service distribution.

These results can be of use to a Web site designer in a number of ways. First, if the designer is operating under a given budget, (s)he can decide how many servers and buffers to procure to provide the best quality of service. A designer for whom blocking probability is of more concern than average delay can choose to operate with a stringent bound for blocking probability and a loose bound for average delay. Depending on the criterion, (s)he will be able to operate on different cells of Table 1, 2, 3 or 4. Second, an important aspect of these experiments is that even if the designer has no knowledge about the service distribution of the various users, (s)he can still use the results of the hyperexponential and Erlang-2 distributions to solve the problem of Web site design to get an idea about the approximate cost of configuration. Third, these experiments can help designers understand what realistic performance guarantees they can provide to their users (e.g., a delay bound of 300 s will be difficult to provide together with a blocking probability bound of 0.01).

Table 1
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for Erlang-2 service distribution with E(s) = 175 s and λ = 170/s. In each cell, the first line gives the number of servers and buffers and the second line gives the total cost.

Loss          Delay bound (s)
probability   300       400       500       600       700       800       900       1000
0.01          23, 40    14, 41    10, 41    8, 41     6, 42     5, 42     5, 42     4, 42
              73000     46100     34100     28100     22200     19200     19200     16200
0.02          14, 23    8, 23     6, 24     5, 24     4, 24     4, 24     3, 24     3, 24
              44300     26300     20400     17400     14400     14400     11400     11400
0.03          4, 39     6, 16     5, 16     4, 17     3, 17     3, 17     2, 17     2, 17
              15900     19600     16600     13700     10700     10700     7700      7700
0.04          4, 35     5, 12     4, 13     3, 13     2, 13     2, 13     2, 13     2, 13
              15500     16200     13300     10300     7300      7300      7300      7300
0.05          4, 33     4, 10     3, 10     2, 10     2, 10     2, 10     2, 10     2, 10
              15300     13000     10000     7000      7000      7000      7000      7000
0.06          3, 31     3, 8      3, 8      2, 9      2, 9      2, 9      2, 9      1, 9
              12100     9800      9800      6900      6900      6900      6900      3900
0.07          3, 31     3, 7      2, 7      2, 7      2, 7      2, 7      1, 8      1, 8
              12100     9700      6700      6700      6700      6700      3800      3800
0.08          3, 30     3, 6      2, 6      2, 6      2, 6      1, 7      1, 7      1, 7
              12000     9600      6600      6600      6600      3700      3700      3700
0.09          3, 5      3, 5      2, 6      2, 6      1, 6      1, 6      1, 6      1, 6
              9500      9500      6600      6600      3600      3600      3600      3600
0.1           3, 5      2, 5      2, 5      1, 5      1, 5      1, 5      1, 5      1, 5
              9500      6500      6500      3500      3500      3500      3500      3500

Table 2
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for hyperexponential service distribution with E(s) = 175 s and λ = 170/s. In each cell, the first line gives the number of servers and buffers and the second line gives the total cost; a dash marks a case with no feasible solution.

Loss          Delay bound (s)
probability   300       400       500       600       700       800       900       1000
0.01          –         –         –         –         –         –         –         –
0.02          21, 35    12, 36    9, 37     7, 37     6, 37     5, 38     5, 38     4, 38
              66500     39600     30700     24700     21700     18800     18800     15800
0.03          5, 46     9, 25     7, 25     5, 26     4, 26     4, 26     3, 26     3, 26
              19600     29500     23500     17600     14600     14600     11600     11600
0.04          4, 39     5, 49     5, 19     4, 20     4, 20     3, 20     3, 20     2, 20
              15900     19900     16900     14000     14000     11000     11000     8000
0.05          4, 35     6, 15     4, 16     4, 16     3, 16     3, 16     2, 16     2, 16
              15500     19500     13600     13600     10600     10600     7600      7600
0.06          4, 33     5, 12     4, 13     3, 13     3, 13     2, 13     2, 13     2, 13
              15300     16200     13300     10300     10300     7300      7300      7300
0.07          4, 32     4, 11     3, 11     3, 11     2, 11     2, 11     2, 11     2, 11
              15200     13100     10100     10100     7100      7100      7100      7100
0.08          4, 31     4, 9      3, 9      2, 10     2, 10     2, 10     2, 10     2, 10
              15100     12900     9900      7000      7000      7000      7000      7000
0.09          4, 31     3, 8      3, 8      2, 9      2, 9      2, 9      2, 9      1, 9
              15100     9800      9800      6900      6900      6900      6900      3900
0.1           3, 30     3, 7      3, 7      2, 8      2, 8      2, 8      1, 8      1, 8
              12000     9700      9700      6800      6800      6800      3800      3800

Table 3
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for Erlang-2 service distribution with E(s) = 350 s and λ = 340/s. In each cell, the first line gives the number of servers and buffers and the second line gives the total cost; a dash marks a case with no feasible solution.

Loss          Delay bound (s)
probability   500       600       700       800       900       1000
0.01          –         23, 40    17, 40    14, 41    11, 41    10, 41
                        73000     55000     46100     37100     34100
0.02          –         14, 23    10, 23    8, 23     7, 23     6, 24
                        44300     32300     26300     23300     20400
0.03          –         10, 16    7, 16     6, 16     5, 16     5, 16
                        31600     22600     19600     16600     16600
0.04          –         8, 12     6, 12     5, 12     4, 13     4, 13
                        25200     19200     16200     13300     13300
0.05          –         6, 10     5, 10     4, 10     3, 10     3, 10
                        19000     16000     13000     10000     10000
0.06          –         5, 8      4, 8      3, 8      3, 8      3, 8
                        15800     12800     9800      9800      9800
0.07          –         5, 7      4, 7      3, 7      2, 7      2, 7
                        15700     12700     9700      6700      6700
0.08          –         4, 6      3, 6      3, 6      2, 6      2, 6
                        12600     9600      9600      6600      6600
0.09          –         3, 5      3, 5      3, 5      2, 6      2, 6
                        9500      9500      9500      6600      6600
0.1           –         3, 5      3, 5      2, 5      2, 5      2, 5
                        9500      9500      6500      6500      6500

Table 4
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for hyperexponential service distribution with E(s) = 350 s and λ = 340/s. In each cell, the first line gives the number of servers and buffers and the second line gives the total cost; a dash marks a case with no feasible solution.

Loss          Delay bound (s)
probability   500       600       700       800       900       1000
0.01          –         –         –         –         –         –
0.02          –         21, 35    16, 35    12, 36    11, 36    9, 37
                        66500     51500     39600     36600     30700
0.03          –         12, 49    11, 25    9, 25     7, 25     7, 25
                        40900     35500     29500     23500     23500
0.04          –         10, 41    9, 18     7, 19     6, 19     5, 19
                        34100     28800     22900     19900     16900
0.05          –         9, 14     7, 15     6, 15     5, 15     4, 16
                        28400     22500     19500     16500     13600
0.06          –         8, 12     6, 12     5, 12     4, 13     4, 13
                        25200     19200     16200     13300     13300
0.07          –         6, 10     5, 10     4, 11     4, 11     3, 11
                        19000     16000     13100     13100     10100
0.08          –         6, 9      4, 9      4, 9      3, 9      3, 9
                        18900     12900     12900     9900      9900
0.09          –         5, 8      4, 8      3, 8      3, 8      3, 8
                        15800     12800     9800      9800      9800
0.1           –         5, 7      4, 7      3, 7      3, 7      3, 7
                        15700     12700     9700      9700      9700

7. Conclusion

With the present design of the Web, whenever the

client makes a request for accessing a Web site they

are often refused connection after waiting for a

significant amount of time. This leads to a growing

frustration among the users. From an electronic

commerce point of view, this is a tremendous loss

for sites that are conducting business over the Web

because these unsatisfied customers are not likely to

return to these sites again. A better way to handle

this situation would be to configure the Web site a

priori based on knowledge about bounds for average

waiting time and blocking probability. In this paper,

we provide an optimization-based formulation of the

problem of minimizing the cost of Web site design

when a given delay bound and blocking probability bound are to be guaranteed to all users. The solution

obtained shows that for higher arrival rates and

lower service rates it will become increasingly diffi-

cult to satisfy stringent delay bounds. Also, when

the service distribution is hyperexponential with a

higher coefficient of variation, the optimal design

turns out to be more expensive under identical

operating conditions than that of the Erlang-2 service

distribution.

Future research can be conducted by extending this

model to the case where the low level implementation

details of the HTTP and TCP/IP are accounted for in

the model. Another extension can be to model the

situation where there are proxy caches in between the

requesting client and the server and hence a large

number of requests do not reach the server itself but

are satisfied by the intermediate caches. In fact, in an

extensive study conducted by considering the end-to-

end traffic between client sites distributed worldwide

and 700 servers to which the majority of the traffic is targeted, Krishnamurthy and Wills [12] have shown

that caching and multi-server content distribution can

improve performance of Web servers significantly if

done in an effective manner. One effective way to

improve performance of proxy caches is to use

prefetching. The prefetching technique relies on the

ability of the proxy cache server to predict which

cached documents a user might reference next, and

takes advantage of the idle time between user requests

to push or pull the documents to the user. Future

extensions should incorporate the idea of prefetching

as well as caching in the model of a Web server. In

recent literature such as Crovella and Bestavros [6], it

has been reported that Web traffic often tends to be

self-similar and bursty and hence the ‘memoryless’

property of the exponential inter-arrival distribution of

a Poisson arrival process may or may not hold at all

times. The arrival process in this paper is assumed to

be Poisson but the model can be extended to the case

where the arrival process is modeled by a heavy-tailed

distribution to more accurately model the real-time

Web traffic.

Acknowledgements

The authors would like to thank the two anony-

mous referees for their various useful and important

suggestions that have helped improve the quality of

the paper to a great extent.

References

[1] V.A.F. Almeida, J.M. de Almeida, C.S. Murta, Performance

analysis of a web server, Proceedings of the 22nd International

Conference for the Resource Management and Performance

Evaluation of Enterprise Computing Systems, San Diego, CA,

1996, pp. 829–838.

[2] M. Arlitt, C.L. Williamson, Internet web servers: workload

characterization and performance implications, IEEE/ACM

Transactions on Networking 5 (1997) 631–645.

[3] L.M. Applegate, C.W. Holsapple, R. Kalakota, F.J. Rader-

macher, A.B. Whinston, Electronic commerce: building blocks

of new business opportunity, Journal of Organizational Com-

puting and Electronic Commerce 6 (1996) 1–10.

[4] T. Berners-Lee, R. Cailliau, A. Luotonen, H.F. Nielsen, A.

Secret, The World-Wide Web, Communications of the ACM

37 (1994) 77–82.

[5] H.K. Bhargava, S. Sridhar, Design issues in configuring

servers on the World Wide Web, Proceedings of the First

INFORMS Conference on Information Systems and Technol-

ogy, Washington, DC, USA, 1996, pp. 204–208.

[6] M.E. Crovella, A. Bestavros, Self-similarity in World Wide

Web traffic: evidence and possible causes, IEEE/ACM Trans-

actions on Networking 5 (1997) 835–846.

[7] K. Eschenfelder, S.K. Wyman, J.C. Bertot, W.E. Moen, C.R.

McClure, Using log files to assess web-enabled information

systems usage, Proceedings of the Third Americas Confer-

ence on Information Systems, Indianapolis, IN, USA, 1997,

pp. 869–871.

[8] M.J. Fischer, D.A. Garbin, A. Gharakhanian, Performance

modeling of distributed automatic call distribution systems,

Telecommunication Systems 9 (1998) 133–152.

[9] R. Hariharan, P. Reeser, R. Van der Mei, Web server per-

formance modeling, Proceedings of the 4th INFORMS Con-

ference on Telecommunication, Boca Raton, FL, USA, 1998,

pp. 43–44.

[10] A. Iyengar, E. MacNair, T. Nguyen, An analysis of web

server performance, Proceedings of the IEEE Global Tele-

communications Conference, Phoenix, AZ, USA, 1997,

pp. 1943–1947.

[11] T. Kimura, A transform-free approximation for the finite

capacity M/G/s queue, Operations Research 44 (1996)

984–988.

[12] B. Krishnamurthy, C.E. Wills, Analyzing factors that influ-

ence end-to-end Web performance, Computer Networks 33

(2000) 17–32.

[13] C. Lu, T.F. Abdelzaher, J.A. Stankovic, S.H. Son, A feedback

control approach for guaranteeing relative delays in web serv-

ers, Proceedings of the 7th Real-Time Technology and Appli-

cations Symposium Taipei, Taiwan, 2001, pp. 51–62.

[14] N.J. Lightner, I. Bose, G. Salvendy, What is wrong with

the World-Wide-Web?: a diagnosis of some problems and


prescription of some remedies, Ergonomics 39 (1996)

995–1004.

[15] P.R. Selvidge, B.S. Chaparro, G.T. Bender, The world wide

wait: effects of delays on user performance, International Jour-

nal of Industrial Ergonomics 29 (2002) 15–20.

[16] L.P. Slothouber, A model of web server performance. Pro-

ceedings of the 5th International World Wide Web Confer-

ence, Paris, France.

[17] W. Whitt, Improving service by informing customers about

anticipated delays, Management Science 45 (1999) 192–207.

Indranil Bose is an Assistant Professor of Decision and Information

Sciences at the Warrington College of Business Administration,

University of Florida. His degrees include BTech (Electrical Engi-

neering) from Indian Institute of Technology, MS (Electrical and

Computer Engineering) from University of Iowa, MS (Industrial

Engineering) and PhD (Management Information Systems) from

Purdue University. He has research interests in telecommunications

design and policy issues, data mining and artificial intelligence,

electronic commerce, applied operations research and supply chain

management. His teaching interests are in telecommunications,

database management, systems analysis and design, and data

mining. His publications have appeared in Computers and Oper-

ations Research, Decision Support Systems, Ergonomics, European

Journal of Operational Research, Information and Management and

in the proceedings of numerous international and national confer-

ences.

Kemal Altinkemer received his PhD in Computers and Information

Systems from William E. Simon School of Business Administra-

tion, University of Rochester, Rochester, NY 14627 in 6/87. He is

currently an Associate Professor and the area coordinator for MIS at

the Krannert School of Management, Purdue University. His re-

search interests are Infrastructure for E-commerce and pricing of

information goods, bidding with intelligent software agents, strategy

from brick-and-mortar to click-and-mortar business model, design and analysis of local area networks, local access computer networks and backbone networks, infrastructure development such as ATM, LEOS

systems such as TELEDESIC, distribution of priorities by using

pricing as a tool, and time restricted priority routing. He has

published numerous articles in journals such as Management

Science, Operations Research, INFORMS Journal on Computing,

EJOR and Transactions of the ACM.
