www.elsevier.com/locate/dsw
Decision Support Systems 38 (2004) 131–140
Design of a web site for guaranteed delay and blocking
probability bounds
Indranil Bose a,*, Kemal Altinkemer b,1
a Department of Decision and Information Sciences, Warrington College of Business Administration, University of Florida, 351 Stuzin Hall, PO Box 117169, Gainesville, FL 32611, USA
b Krannert Graduate School of Management, Purdue University, West Lafayette, IN 47907, USA
Received 1 February 2002; accepted 1 July 2002
Available online 28 June 2003
Abstract
A new mathematical programming model is proposed for minimizing the cost of design of a Web site by optimally
determining the number of servers and buffers when a performance guarantee in terms of the average waiting time and loss
probability is to be provided to users. The Web site is modeled as an M/G/c/N queuing system where requests for connections
represent arriving customers and the browsing of Web sites represents service received by customers. Numerical experiments
are conducted with different choices of problem parameters and the optimal design cost, and optimal number of servers and
buffers are obtained for these cases.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Finite capacity; Loss probability; Queuing model; Waiting time; Web server
* Corresponding author. Tel.: +1-352-392-0648; fax: +1-352-392-5438.
1 Tel.: +1-765-494-9009; fax: +1-765-494-1526.
E-mail addresses: [email protected] (I. Bose), [email protected] (K. Altinkemer).
0167-9236/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0167-9236(03)00080-0
1. Introduction
The World-Wide-Web has helped in the sharing of information between Internet users
throughout the world. It has become synonymous with a mega warehouse of information.
From Ref. [4] we know that, though it was initially started as a project for enabling
easy exchange of information between researchers who were geographically distant from
each other, it has now taken the role of an international information superhighway.
The Web has been used for different purposes such as providing useful context sensitive
information, allowing exchange of information within
and between organizations, and lately, for advertising,
selling and buying merchandise, which has been
referred to as electronic commerce, as in Ref. [3].
Whatever the use, there is no denying that the Web
has already become a part of everyday life in a large
part of the world.
With the advent of user friendly browsers, the Web
has become a technology easy to use and understand.
However, in spite of the apparent simplicity of use,
users have reported several problems in the use of the
Web. Among the top three problems reported in Ref.
[14] are searching for specific information, speed of
data access and locating and navigating sites. More
recently, Selvidge et al. [15] studied the variable impact
of Web delays on user lostness, frustration and propor-
tion of task completion. Current technology does not
allow users to get an estimate of the amount of time
they would have to wait when downloading a docu-
ment but provides a real-time measure of the amount of
content that has been downloaded. Sometimes users
wait a significant amount of time and are then refused
connection. This is quite frustrating for the users and
may lead to reluctance to visit a specific site. Hence,
for Web site designers it is important to design the Web
site so that neither the waiting time for a requested
connection nor the chance that the user is refused
connection is too high.
2. Motivation
From Ref. [5], we know that the basic mechanism
of operation of the Web is the same as that of a client
server system. The three main components are a client
site that requests specific information, a Web site with
servers and buffers and a network connection that
allows communication between the client and the site.
The underlying network may be a corporate intranet
or it can be the Internet itself. Fig. 1 gives a schematic
representation of the overall configuration of a Web
server. In case of the client server model based on the
Internet the problem of communication is extremely
complex as it involves a large variety of client sites
that request information from the server site. The delay experienced by a user when
requesting information is a function of the client, the server and the network. In
most cases, the site provider has no control over the network or the client sites.
Fig. 1. Client–server model of the Web.
In most cases, the configuration of the Web site is done on an ad-hoc basis. The final
configuration of the site usually depends on the objectives of the site provider. Some
goals might be minimizing the cost of server and buffer installation and operation,
minimizing the number of lost requests for access to the site or improving the response
time experienced by the users when downloading information from the site. Some of the
major issues that need to be considered in order to meet these objectives are the number
and type of requests for service, the service time for a request, and the load on the
servers. At the same time, the Web site administrator needs to have knowledge about the
minimum level of service guarantee to be provided to the users.
We propose a new mathematical programming formulation of the problem of optimally
designing a Web site by deciding on the optimal number of servers and buffers to be
installed at the site. The goal of the design is to minimize the cost of installation of
servers and buffers such that certain service related performance bounds are satisfied.
The contribution of this research is to show the applicability of queuing theory in
modeling a Web site and using the queuing model in a mathematical programming framework
for efficient design of a Web site.
The organization of the paper is as follows. In
Section 3, we provide a brief literature review on the
use of finite capacity queuing models for some
applications and list some research that uses queuing
theory for modeling performance of Web servers. In
Section 4, the optimization model for the design of the
Web server is described. Section 5 provides a deriva-
tion of the approximate expression for average wait-
ing time of customers in case of an M/G/c/N queuing
system, which is required to solve the optimization
problem developed in Section 4. Section 6 details the
numerical experiments conducted to obtain the opti-
mal cost and configuration of Web sites for different
service distributions and expected arrival rates and
service times. The conclusion and directions for future
research appear in Section 7.
3. Literature review
In the area of performance evaluation and charac-
terization of Web servers, simulation and statistical
analyses have been used predominantly. Using vari-
ous statistical measures, Arlitt and Williamson [2]
obtained document size distribution, document type
distribution, document referencing behavior and geo-
graphic distribution of requests using six different data
sets from various educational and commercial site
providers. Iyengar et al. [10] used simulation to
develop different tradeoffs between delay experienced
by the users and the percentage of requests that are
lost, under conditions of heavy traffic, from the Web
servers. A benchmarking method based on WebStone,
for understanding the performance metrics of a Web
server is discussed in Almeida et al. [1]. The use of
queuing theory for understanding the performance of
Web servers is reported in Slothouber [16]. In this
high level model that ignores the details of the HTTP
protocol, Web servers are modeled as an open queuing
network and the effect of various parameters such as
file size, server speed and network bandwidth on the
server response time is studied using analytical pro-
cedures. In an attempt to model the low level details
and interactions between the HTTP and the TCP/IP
protocols, Hariharan et al. [9] model Web servers as a
tandem queuing model consisting of three interacting
components and study the dependence and interaction
between these sub-components using simulation.
There is some similarity between this research and
that of Fischer et al. [8] where M/G/c/c queues are
used to model call arrivals at Automatic Call Distri-
bution centers and an alternate expression for calcu-
lating loss probabilities is suggested that requires
significantly less computation time. Whitt [17] dis-
cussed an interesting application of a telephone call
center where customer satisfaction was improved by
informing the customers about anticipated delays
before joining the M/M/c/N queuing system. Lu et
al. [13] have addressed the problem of management of
delays for different service classes on a Web server.
They used a feedback control theory-based approach
for designing the adaptive architecture for Web serv-
ers operating under HTTP 1.1, which could provide
relative delay guarantees for different service classes.
The above review shows that queuing theory has
proved to be a useful technique for analyzing Web
sites. Our paper takes a unique approach by embed-
ding a queuing theory-based model of a Web site in a
new mathematical programming-based formulation
for solving the problem of optimal configuration of
a Web site. Using the known results on approximate
analysis of M/G/c/N queues, we are able to determine
the optimal number of Web servers and buffers that a
designer should install at a Web site at a minimum
cost while providing a guaranteed level of service.
4. Model
In this research, a Web site is represented as a
queuing system, where the requests for connections
represent arriving customers and time spent browsing a
particular Web site is defined as service. This is a finite
capacity queuing model since the Web site can handle
only a limited number of requests. We can model the
requests for connections approximately as a Poisson
process with exponential inter-arrival time distribution.
The users request a connection and after getting a
connection, spend time browsing pages within a Web
site. Once their purpose is served they quit the system
(which may be defined as end of service). Different
users spend different amounts of time on the Web sites
browsing HTML pages and embedded multimedia files
of various sizes. Hence, the service can be assumed to
follow a ‘general’ distribution. Since each browsing
activity involves browsing a number of Web pages and
the time spent browsing each page is random, the total
time spent in browsing all the Web pages during a
single visit can be assumed to follow a general distri-
bution as well (since the sum of general distributions is
a general distribution). Every Web site has a fixed
number of servers and also has a limit on the number
of connections that it can store in its buffer for future
service. Once all servers become busy, all subsequent
requests for connections are buffered in the TCP/IP
listen queue and they wait for the server to be free. If the
waiting spaces are all occupied, then the incoming
requests for connection are refused. This is known as
blocking of the Web site. The queuing model is an M/
G/c/N queuing system where customers are lost from
the system once all servers as well as the waiting spaces
in the system become busy. The queuing model of a
Web site is depicted in Fig. 2.
The goal of the Web site designer is to minimize
the total cost of installation of the Web site and to
provide a desired performance guarantee to the users.
The performance is guaranteed in terms of the average
waiting time of a connection and the blocking prob-
ability of the connection.
Fig. 2. A queuing representation of a Web site.
Notation:
c = Number of available Web servers on a site
N = Number of buffers
Z = Maximum allowable average waiting time of a connection specified by the designer (s)
X = Maximum allowable blocking probability of a connection specified by the designer
a = Cost of a server (US$)
b = Cost of a buffer (US$)
W = Average time spent to fulfill a request for connection including waiting and connection (s)
D = Blocking probability of a connection.
Problem P: Minimize ac + bN.
Subject to:
\[ W \le Z \tag{1} \]
\[ D \le X \tag{2} \]
\[ c \ge 0 \tag{3} \]
\[ N \ge 0 \tag{4} \]
c, N are integers.
In order to solve this problem, the designer has to obtain closed form expressions for
W and D. To the best of our knowledge, no exact closed-form expressions are available
for the average waiting time and blocking probability of customers in case of an
M/G/c/N queue. In the next section, we provide a derivation for approximate closed-form
analytical expressions for W and D.
5. Average waiting time and blocking probability
In this section, we derive an approximate expression for the waiting time of customers
and the blocking probability in case of an M/G/c/N queue with a single class of service
and under heavy traffic condition. This is required for solving Problem P described in
the earlier section. Multiserver queues with ‘general’ service are difficult to analyze.
Although closed form expressions are available for the M/M/c/N queues, the ‘general’
service distribution for an M/G/c/N queue makes it difficult to obtain exact analytical
results.
Additional notation:
k = Arrival rate of incoming requests for connection to a Web site (requests/s)
s = Service time for each incoming connection (s)
E(s) = Average time spent by a connection at a Web site for browsing activity (s)
Pj = Probability that there are j servers busy at a Web site where j = 1, 2, ..., c
Lq = Average number of requests for connection waiting to be serviced in the buffers
Wq = Average waiting time experienced by a request for a connection (s)
q = kE(s).
According to Kimura [11], the steady-state proba-
bility that j servers among c servers remain busy at
any time in case of an M/G/c/N queue is given by:
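\[
P_j =
\begin{cases}
\dfrac{(cq)^j}{j!} \, P_0, & j = 1, \ldots, c-1 \\[2ex]
\dfrac{(cq)^c}{c!} \, \dfrac{1-n}{1-q} \, n^{j-c} P_0, & j = c, c+1, \ldots, c+N-1 \\[2ex]
\dfrac{(cq)^c}{c!} \, n^{N} P_0, & j = c+N
\end{cases}
\tag{5}
\]
where
\[
P_0 = \left\{ \sum_{i=0}^{c-1} \frac{(cq)^i}{i!} + \frac{(cq)^c}{c!} \, \frac{1 - q\,n^{N}}{1-q} \right\}^{-1} \tag{6}
\]
\[
n = \frac{k E(s) R_G}{c - \{ k E(s) + k E(s) R_G \}} \tag{7}
\]
\[
R_G = \frac{\text{Expected waiting time for general service distribution}}{\text{Expected waiting time for exponential service}} \tag{8}
\]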
Exact expressions for R_G are difficult to obtain. However, an asymptotic result on R_G
under the condition of heavy traffic (i.e., q → 1) is available from Kimura [11]. It
states that
\[ \lim_{q \to 1} R_G = \frac{1 + c_v^2}{2} \tag{9} \]
where c_v is the coefficient of variation of the underlying service distribution. It is
stated in Kimura [11] that "the approximation is exact for the cases with either no
extra waiting space, exponential service-time distribution, or a certain two-parameter
family of service-time distribution". Hence, the approximation is valid in our case with
the only limiting condition that there is heavy traffic in the system and q → 1.
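As a small numerical illustration of Eq. (9), the sketch below evaluates the heavy-traffic limit of R_G for the two coefficients of variation quoted for the numerical experiments in Section 6. The function name is hypothetical and the snippet is only a worked instance of the formula, not part of the authors' program.

```python
def heavy_traffic_rg(cv):
    """Heavy-traffic limit of R_G from Eq. (9): (1 + cv^2) / 2."""
    return (1.0 + cv ** 2) / 2.0


# Coefficients of variation as stated in Section 6 of the paper.
for name, cv in [("Erlang-2", 0.5), ("hyperexponential", 1.0)]:
    print(name, heavy_traffic_rg(cv))  # 0.625 and 1.0, respectively
```

A coefficient of variation of 1.0 gives R_G = 1, so under this limit the general-service waiting time coincides with the exponential-service one.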
The value for Pj from Eqs. (5) and (6) can be used together with the Poisson Arrivals
See Time Averages (PASTA) property to calculate the fraction of requests for connection
that are lost (D) because all servers and buffers are busy. This is the same as the
blocking probability of a connection. We find,
\[ D = P_{c+N} \tag{10} \]
The expression for D is used in constraint (2) of the
optimization Problem P described in Section 4. The
information conveyed by this formula is important if
the Web site provider needs to keep track of the
number of connections that are lost. Every site will
possibly have a known percentage of ‘lost customers’
that they can tolerate due to unavailability of buffers.
Once D exceeds that value, this might give a signal
that the site is getting ‘too busy’.
Another figure of merit is the average number of
requests for connection waiting to be serviced in the
buffers. Using simple algebra, the expression for Lq is
given by
\[ L_q = \sum_{j=c+1}^{c+N} (j - c) \, P_j \tag{11} \]
Using Little’s Law, the average waiting time of the
request for connection is given by
\[ W_q = \frac{L_q}{k \, (1 - P_{c+N})} \tag{12} \]
Again, the total time spent by the request for waiting
in the buffer as well as completing the service of
downloading the required Web pages is then easily
obtained by adding the average queuing time to the
average service time. In other words,
\[ W = W_q + E(s) \tag{13} \]
This expression for W is used in constraint (1) for
solving the optimization Problem P detailed in Sec-
tion 4.
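The chain from the state probabilities of Eqs. (5) and (6) to D, Lq, Wq and W in Eqs. (10)–(13) can be traced numerically. The following Python sketch is illustrative only and is not the authors' MATLAB program. It takes q and n as given inputs rather than evaluating Eq. (7), it requires q ≠ 1, and the function names and sample values are assumptions made for the example.

```python
from math import factorial


def state_probabilities(c, N, q, n):
    """Approximate probabilities P_0 .. P_{c+N} following Eqs. (5) and (6)."""
    load = c * q
    tail = load ** c / factorial(c)
    # Eq. (6): normalising constant
    p0 = 1.0 / (sum(load ** j / factorial(j) for j in range(c))
                + tail * (1.0 - q * n ** N) / (1.0 - q))
    # Eq. (5), first case (j = 0 simply gives P_0 itself)
    probs = [load ** j / factorial(j) * p0 for j in range(c)]
    # Eq. (5), second case: j = c .. c+N-1
    probs += [tail * (1.0 - n) / (1.0 - q) * n ** (j - c) * p0
              for j in range(c, c + N)]
    # Eq. (5), third case: j = c+N
    probs.append(tail * n ** N * p0)
    return probs


def performance(c, N, k, Es, q, n):
    """Return (D, Lq, Wq, W) from Eqs. (10)-(13)."""
    p = state_probabilities(c, N, q, n)
    D = p[c + N]                                                # Eq. (10)
    Lq = sum((j - c) * p[j] for j in range(c + 1, c + N + 1))   # Eq. (11)
    Wq = Lq / (k * (1.0 - D))                                   # Eq. (12)
    W = Wq + Es                                                 # Eq. (13)
    return D, Lq, Wq, W


# Hypothetical inputs chosen only so that q = k * E(s) < 1; n is a made-up value.
k, Es = 2.0, 0.4
D, Lq, Wq, W = performance(c=3, N=5, k=k, Es=Es, q=k * Es, n=0.7)
print(D, Lq, Wq, W)
```

Passing n in directly keeps the sketch independent of how R_G is obtained; with N = 0 the last probability reduces to the Erlang loss formula, which gives a quick sanity check.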
6. Numerical experiments
The goal of the numerical experiments reported in
this section is to obtain the solution to the nonlinear
optimization Problem P in terms of the total cost of
installation of servers and buffers, optimal number of
servers and optimal number of buffers. The designer
provides known bounds for the blocking probability
and average waiting time of a connection. We exper-
iment with two different distributions—Erlang-2 (with
coefficient of variation 0.5) and hyperexponential
(with coefficient of variation 1.0) to represent the
‘general’ service distribution of the model and for
two pairs of values for E(s) and k. Physically, the Web
server is a commercially available computer server
that can host a Web site and hence we assume the unit
cost of server (a) to be US$3000. The buffer is
equivalent to a hard disk that can be used for storage
of requests and the unit cost of buffer (b) is assumed
to be US$100.
Since Problem P is a nonlinear optimization prob-
lem it could not be solved using any commercially
available software. We used an indirect approach for
solving this problem. Given a known delay bound and
a blocking probability bound we first completely
enumerated the combinations of c and N values (integers)
that satisfy constraints (1) and (2). Next, we
chose the pair of values that resulted in the minimum
value of the objective function as the optimal solution.
The program for obtaining the solution was coded
using MATLAB 5.3 and was run on a Pentium III 650
MHz personal computer.
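The enumeration procedure described above can be sketched in a few lines of Python. This is not the authors' MATLAB code. To keep the example self-contained, the evaluator below uses the exact formulas for an M/M/c queue with N waiting spaces (exponential service) as a stand-in for the M/G/c/N approximation of Section 5; the function names, the search limits c_max and N_max, and the sample parameter values are all assumptions.

```python
from math import factorial


def mm_c_n_metrics(c, N, k, Es):
    """Exact (W, D) for an M/M/c queue with N waiting spaces (stand-in evaluator)."""
    load = k * Es
    # Unnormalised weights for 0 .. c+N requests in the system
    w = [load ** j / factorial(j) for j in range(c + 1)]
    w += [w[c] * (load / c) ** (j - c) for j in range(c + 1, c + N + 1)]
    total = sum(w)
    p = [x / total for x in w]
    D = p[c + N]                                    # blocking probability
    Lq = sum((j - c) * p[j] for j in range(c + 1, c + N + 1))
    W = Lq / (k * (1.0 - D)) + Es                   # Little's law plus service time
    return W, D


def cheapest_design(evaluate, Z, X, a=3000, b=100, c_max=30, N_max=60):
    """Enumerate integer (c, N) pairs and return (cost, c, N) of the cheapest feasible one."""
    best = None
    for c in range(1, c_max + 1):
        for N in range(0, N_max + 1):
            W, D = evaluate(c, N)
            if W <= Z and D <= X:                   # constraints (1) and (2)
                cost = a * c + b * N                # objective of Problem P
                if best is None or cost < best[0]:
                    best = (cost, c, N)
    return best                                     # None when no pair is feasible


# Illustrative values only, not taken from the paper's experiments.
evaluate = lambda c, N: mm_c_n_metrics(c, N, k=2.0, Es=1.0)
print(cheapest_design(evaluate, Z=1.5, X=0.05))
```

Because the evaluator is passed in as a function, the same search loop could be reused with an implementation of Eqs. (5)–(13) in place of the exponential-service stand-in.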
An important step in conducting the numerical
experiments is to estimate the parameters k and E(s).
Estimates of these parameters can be obtained from
the log-files associated with a Web site. As noted by
Eschenfelder et al. [7], the estimates can be obtained
from the access log file that lists the IP address of the
user, date and time of the access and user action taken
during the access period including the timestamp of
the last activity of the user on the Web site. For our
numerical experiments, we studied the log-files of
several Web sites to obtain realistic values for the
parameters k and E(s).
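One simple way such estimates might be formed is sketched below. The record layout (IP address, time of first request, time of last activity, in seconds) is a hypothetical simplification of an access log, and the estimators (sessions per unit time for k, mean session length for E(s)) are assumptions for illustration rather than the procedure used in the paper.

```python
from statistics import mean

# Hypothetical session records: (ip, first_request_time_s, last_activity_time_s)
sessions = [
    ("10.0.0.1", 0.0, 120.0),
    ("10.0.0.2", 15.0, 200.0),
    ("10.0.0.3", 40.0, 95.0),
    ("10.0.0.1", 300.0, 480.0),
]


def estimate_parameters(sessions, window_seconds):
    """Estimate arrival rate k (requests/s) and mean service time E(s) (s)."""
    k = len(sessions) / window_seconds
    mean_service = mean(end - start for _, start, end in sessions)
    return k, mean_service


k_hat, es_hat = estimate_parameters(sessions, window_seconds=600.0)
print(k_hat, es_hat)
```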
In the first experiment, we assume a heavy traffic
load on the server and let k= 170 requests/s and
E(s) = 175 s. The service distribution is Erlang-2.
The blocking probability is varied from 0.01 to 0.1
and the average waiting time is varied from 300 to
1000 s. No feasible solution to the optimization
problem can be found if the blocking probability is
less than 0.01 and the average waiting time is less than
300 s. The most expensive scenario for design
involves 23 servers and 40 buffers, with a total design
cost of US$73000. The least cost of design, i.e.,
US$3500, is obtained when the blocking probability
is 0.1 and the average waiting time varies between
600 and 1000 s. The results of this experiment are
shown in Table 1.
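The two costs quoted above follow directly from the objective ac + bN with the unit costs assumed in this section. The short check below is only a worked arithmetic example; the function name is hypothetical.

```python
A_SERVER = 3000  # unit cost of a server (US$), as assumed in Section 6
B_BUFFER = 100   # unit cost of a buffer (US$), as assumed in Section 6


def design_cost(c, N, a=A_SERVER, b=B_BUFFER):
    """Objective of Problem P: a*c + b*N."""
    return a * c + b * N


print(design_cost(23, 40))  # 73000, the most expensive design in Table 1
print(design_cost(1, 5))    # 3500, the least expensive design in Table 1
```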
In the second experiment, all parameter values are
kept unchanged except the service distribution is
chosen to be hyperexponential. It is to be noted that
the specific distribution function is not used in calcu-
lation of the problem parameters. The coefficient of
variation of the service distribution is used for com-
putation of D and W. In Table 2, no optimal solution
can be obtained for a blocking probability value of
0.01. This is so because there is no feasible solution
available that satisfies a loss probability bound of 0.01
as well as a delay bound between 300 and 1000 s.
This goes to show that a loss probability bound of
0.01 (i.e., only 1% of the requests are rejected) is
extremely stringent and generally cannot be provided
by a Web server while providing a delay bound within
a tolerable limit. It is also observed from the second
experiment that the optimal design turns out to be
more expensive in the case of the hyperexponential
service distribution than that of the Erlang-2 distribu-
tion.
Table 1
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for Erlang-2 service distribution with E(s) = 175 s and k = 170/s. Each cell lists servers, buffers / cost.

| Loss probability \ Delay (s) | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
|---|---|---|---|---|---|---|---|---|
| 0.01 | 23, 40 / 73000 | 14, 41 / 46100 | 10, 41 / 34100 | 8, 41 / 28100 | 6, 42 / 22200 | 5, 42 / 19200 | 5, 42 / 19200 | 4, 42 / 16200 |
| 0.02 | 14, 23 / 44300 | 8, 23 / 26300 | 6, 24 / 20400 | 5, 24 / 17400 | 4, 24 / 14400 | 4, 24 / 14400 | 3, 24 / 11400 | 3, 24 / 11400 |
| 0.03 | 4, 39 / 15900 | 6, 16 / 19600 | 5, 16 / 16600 | 4, 17 / 13700 | 3, 17 / 10700 | 3, 17 / 10700 | 2, 17 / 7700 | 2, 27 / 7700 |
| 0.04 | 4, 35 / 15500 | 5, 12 / 16200 | 4, 13 / 13300 | 3, 13 / 10300 | 2, 13 / 7300 | 2, 13 / 7300 | 2, 13 / 7300 | 2, 13 / 7300 |
| 0.05 | 4, 33 / 15300 | 4, 10 / 13000 | 3, 10 / 10000 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 |
| 0.06 | 3, 31 / 12100 | 3, 8 / 9800 | 3, 8 / 9800 | 2, 9 / 6900 | 2, 9 / 6900 | 2, 9 / 6900 | 2, 9 / 6900 | 1, 9 / 3900 |
| 0.07 | 3, 31 / 12100 | 3, 7 / 9700 | 2, 7 / 6700 | 2, 7 / 6700 | 2, 7 / 6700 | 2, 7 / 6700 | 1, 8 / 3800 | 1, 8 / 3800 |
| 0.08 | 3, 30 / 12000 | 3, 6 / 9600 | 2, 6 / 6600 | 2, 6 / 6600 | 2, 6 / 6600 | 1, 7 / 3700 | 1, 7 / 3700 | 1, 7 / 3700 |
| 0.09 | 3, 5 / 9500 | 3, 5 / 9500 | 2, 6 / 6600 | 2, 6 / 6600 | 1, 6 / 3600 | 1, 6 / 3600 | 1, 6 / 3600 | 1, 6 / 3600 |
| 0.1 | 3, 5 / 9500 | 2, 5 / 6500 | 2, 5 / 6500 | 1, 5 / 3500 | 1, 5 / 3500 | 1, 5 / 3500 | 1, 5 / 3500 | 1, 5 / 3500 |

Table 2
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for hyperexponential service distribution with E(s) = 175 s and k = 170/s. Each cell lists servers, buffers / cost.

| Loss probability \ Delay (s) | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
|---|---|---|---|---|---|---|---|---|
| 0.01 | – | – | – | – | – | – | – | – |
| 0.02 | 21, 35 / 66500 | 12, 36 / 39600 | 9, 37 / 30700 | 7, 37 / 24700 | 6, 37 / 21700 | 5, 38 / 18800 | 5, 38 / 18800 | 4, 38 / 15800 |
| 0.03 | 5, 46 / 19600 | 9, 25 / 29500 | 7, 25 / 23500 | 5, 26 / 17600 | 4, 26 / 14600 | 4, 26 / 14600 | 3, 26 / 11600 | 3, 26 / 11600 |
| 0.04 | 4, 39 / 15900 | 5, 49 / 19900 | 5, 19 / 16900 | 4, 20 / 14000 | 4, 20 / 14000 | 3, 20 / 11000 | 3, 20 / 11000 | 2, 20 / 8000 |
| 0.05 | 4, 35 / 15500 | 6, 15 / 19500 | 4, 16 / 13600 | 4, 16 / 13600 | 3, 16 / 10600 | 3, 16 / 10600 | 2, 16 / 7600 | 2, 16 / 7600 |
| 0.06 | 4, 33 / 15300 | 5, 12 / 16200 | 4, 13 / 13300 | 3, 13 / 10300 | 3, 13 / 10300 | 2, 13 / 7300 | 2, 13 / 7300 | 2, 13 / 7300 |
| 0.07 | 4, 32 / 15200 | 4, 11 / 13100 | 3, 11 / 10100 | 3, 11 / 10100 | 2, 11 / 7100 | 2, 11 / 7100 | 2, 11 / 7100 | 2, 11 / 7100 |
| 0.08 | 4, 31 / 15100 | 4, 9 / 12900 | 3, 9 / 9900 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 | 2, 10 / 7000 |
| 0.09 | 4, 31 / 15100 | 3, 8 / 9800 | 3, 8 / 9800 | 2, 9 / 6900 | 2, 9 / 6900 | 2, 9 / 6900 | 2, 9 / 6900 | 1, 9 / 3900 |
| 0.1 | 3, 30 / 12000 | 3, 7 / 9700 | 3, 7 / 9700 | 2, 8 / 6800 | 2, 8 / 6800 | 2, 8 / 6800 | 1, 8 / 3800 | 1, 8 / 3800 |

The third set of experiments is conducted for k = 340 requests/s, E(s) = 350 s and for
Erlang-2 and hyperexponential distributions and the results are reported in Tables 3 and
4, respectively. From Tables 3 and 4, we see that when the arrival rate is doubled and
the service rate is halved it becomes increasingly difficult to obtain an optimal
solution to the optimization problem for stricter delay bounds. This is the reason why
our solution method is not able to find any solution for an average delay bound < 600 s.
Also, as observed in the paragraph above, we again note that the optimal design turns
out to be more expensive in the case of the hyperexponential service distribution.

Table 3
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for Erlang-2 service distribution with E(s) = 350 s and k = 340/s. Each cell lists servers, buffers / cost.

| Loss probability \ Delay (s) | 500 | 600 | 700 | 800 | 900 | 1000 |
|---|---|---|---|---|---|---|
| 0.01 | – | 23, 40 / 73000 | 17, 40 / 55000 | 14, 41 / 46100 | 11, 41 / 37100 | 10, 41 / 34100 |
| 0.02 | – | 14, 23 / 44300 | 10, 23 / 32300 | 8, 23 / 26300 | 7, 23 / 23300 | 6, 24 / 20400 |
| 0.03 | – | 10, 16 / 31600 | 7, 16 / 22600 | 6, 16 / 19600 | 5, 16 / 16600 | 5, 16 / 16600 |
| 0.04 | – | 8, 12 / 25200 | 6, 12 / 19200 | 5, 12 / 16200 | 4, 13 / 13300 | 4, 13 / 13300 |
| 0.05 | – | 6, 10 / 19000 | 5, 10 / 16000 | 4, 10 / 13000 | 3, 10 / 10000 | 3, 10 / 10000 |
| 0.06 | – | 5, 8 / 15800 | 4, 8 / 12800 | 3, 8 / 9800 | 3, 8 / 9800 | 3, 8 / 9800 |
| 0.07 | – | 5, 7 / 15700 | 4, 7 / 12700 | 3, 7 / 9700 | 2, 7 / 6700 | 2, 7 / 6700 |
| 0.08 | – | 4, 6 / 12600 | 3, 6 / 9600 | 3, 6 / 9600 | 2, 6 / 6600 | 2, 6 / 6600 |
| 0.09 | – | 3, 5 / 9500 | 3, 5 / 9500 | 3, 5 / 9500 | 2, 6 / 6600 | 2, 6 / 6600 |
| 0.1 | – | 3, 5 / 9500 | 3, 5 / 9500 | 2, 5 / 6500 | 2, 5 / 6500 | 2, 5 / 6500 |

Table 4
Number of servers, number of buffers and total installation cost (US$) for various average delay bounds (s) and loss probability bounds for hyperexponential service distribution with E(s) = 350 s and k = 340/s. Each cell lists servers, buffers / cost.

| Loss probability \ Delay (s) | 500 | 600 | 700 | 800 | 900 | 1000 |
|---|---|---|---|---|---|---|
| 0.01 | – | – | – | – | – | – |
| 0.02 | – | 21, 35 / 66500 | 16, 35 / 51500 | 12, 36 / 39600 | 11, 36 / 36600 | 9, 37 / 30700 |
| 0.03 | – | 12, 49 / 40900 | 11, 25 / 35500 | 9, 25 / 29500 | 7, 25 / 23500 | 7, 25 / 23500 |
| 0.04 | – | 10, 41 / 34100 | 9, 18 / 28800 | 7, 19 / 22900 | 6, 19 / 19900 | 5, 19 / 16900 |
| 0.05 | – | 9, 14 / 28400 | 7, 15 / 22500 | 6, 15 / 19500 | 5, 15 / 16500 | 4, 16 / 13600 |
| 0.06 | – | 8, 12 / 25200 | 6, 12 / 19200 | 5, 12 / 16200 | 4, 13 / 13300 | 4, 13 / 13300 |
| 0.07 | – | 6, 10 / 19000 | 5, 10 / 16000 | 4, 11 / 13100 | 4, 11 / 13100 | 3, 11 / 10100 |
| 0.08 | – | 6, 9 / 18900 | 4, 9 / 12900 | 4, 9 / 12900 | 3, 9 / 9900 | 3, 9 / 9900 |
| 0.09 | – | 5, 8 / 15800 | 4, 8 / 12800 | 3, 8 / 9800 | 3, 8 / 9800 | 3, 8 / 9800 |
| 0.1 | – | 5, 7 / 15700 | 4, 7 / 12700 | 3, 7 / 9700 | 3, 7 / 9700 | 3, 7 / 9700 |

These results can be of use to a Web site designer in a number of ways. First, if the
designer is operating under a given budget, (s)he can decide how many servers and
buffers to procure to provide the best quality of service. For some designers, blocking
probability will be of more concern than average delay and (s)he can choose to operate
with a stringent bound for blocking probability and loose bound for average delay.
Depending on the criterion, (s)he will be able to operate on different cells of Table 1,
2, 3 or 4. Second, an important aspect of these experiments is that even if the designer
has no knowledge about the service distribution of the various users, (s)he can still
use the results of the hyperexponential and Erlang-2 distributions to solve the problem
of Web site design to get an idea about the approximate cost of configuration. Third,
these experiments can help designers understand what realistic performance guarantees
they can provide to their users (e.g., a delay bound of 300 s will be difficult to
provide together with a blocking probability bound of 0.01).
7. Conclusion
With the present design of the Web, clients requesting
access to a Web site are often refused connection
after waiting for a significant amount of time.
This leads to growing
frustration among the users. From an electronic
commerce point of view, this is a tremendous loss
for sites that are conducting business over the Web
because these unsatisfied customers are not likely to
return to these sites again. A better way to handle
this situation would be to configure the Web site a
priori based on knowledge about bounds for average
waiting time and blocking probability. In this paper,
we provide an optimization-based formulation of the
problem of minimizing the cost of Web site design
when a given delay bound and blocking probability
bound is to be guaranteed to all users. The solution
obtained shows that for higher arrival rates and
lower service rates it will become increasingly diffi-
cult to satisfy stringent delay bounds. Also, when
the service distribution is hyperexponential with a
higher coefficient of variation, the optimal design
turns out to be more expensive under identical
operating conditions than that of the Erlang-2 service
distribution.
Future research can be conducted by extending this
model to the case where the low level implementation
details of the HTTP and TCP/IP are accounted for in
the model. Another extension can be to model the
situation where there are proxy caches in between the
requesting client and the server and hence a large
number of requests do not reach the server itself but
are satisfied by the intermediate caches. In fact, in an
extensive study conducted by considering the end-to-
end traffic between client sites distributed worldwide
and 700 servers to which the majority of the traffic is
targeted, Krishnamurthy and Wills [12] have shown
that caching and multi-server content distribution can
improve performance of Web servers significantly if
done in an effective manner. One effective way to
improve performance of proxy caches is to use
prefetching. The prefetching technique relies on the
ability of the proxy cache server to predict which
cached documents a user might reference next, and
takes advantage of the idle time between user requests
to push or pull the documents to the user. Future
extensions should incorporate the idea of prefetching
as well as caching in the model of a Web server. In
recent literature such as Crovella and Bestavros [6], it
has been reported that Web traffic often tends to be
self-similar and bursty and hence the ‘memoryless’
property of the exponential inter-arrival distribution of
a Poisson arrival process may or may not hold at all
times. The arrival process in this paper is assumed to
be Poisson but the model can be extended to the case
where the arrival process is modeled by a heavy-tailed
distribution to more accurately model the real-time
Web traffic.
Acknowledgements
The authors would like to thank the two anony-
mous referees for their various useful and important
suggestions that have helped improve the quality of
the paper to a great extent.
References
[1] V.A.F. Almeida, J.M. de Almeida, C.S. Murta, Performance
analysis of a web server, Proceedings of the 22nd International
Conference for the Resource Management and Performance
Evaluation of Enterprise Computing Systems, San Diego, CA,
1996, pp. 829–838.
[2] M. Arlitt, C.L. Williamson, Internet web servers: workload
characterization and performance implications, IEEE/ACM
Transactions on Networking 5 (1997) 631–645.
[3] L.M. Applegate, C.W. Holsapple, R. Kalakota, F.J. Rader-
macher, A.B. Whinston, Electronic commerce: building blocks
of new business opportunity, Journal of Organizational Com-
puting and Electronic Commerce 6 (1996) 1–10.
[4] T. Berners-Lee, R. Cailliau, A. Luotonen, H.F. Nielsen, A.
Secret, The World-Wide Web, Communications of the ACM
37 (1994) 77–82.
[5] H.K. Bhargava, S. Sridhar, Design issues in configuring
servers on the World Wide Web, Proceedings of the First
INFORMS Conference on Information Systems and Technol-
ogy, Washington, DC, USA, 1996, pp. 204–208.
[6] M.E. Crovella, A. Bestavros, Self-similarity in World Wide
Web traffic: evidence and possible causes, IEEE/ACM Trans-
actions on Networking 5 (1997) 835–846.
[7] K. Eschenfelder, S.K. Wyman, J.C. Bertot, W.E. Moen, C.R.
McClure, Using log files to assess web-enabled information
systems usage, Proceedings of the Third Americas Confer-
ence on Information Systems, Indianapolis, IN, USA, 1997,
pp. 869–871.
[8] M.J. Fischer, D.A. Garbin, A. Gharakhanian, Performance
modeling of distributed automatic call distribution systems,
Telecommunication Systems 9 (1998) 133–152.
[9] R. Hariharan, P. Reeser, R. Van der Mei, Web server per-
formance modeling, Proceedings of the 4th INFORMS Con-
ference on Telecommunication, Boca Raton, FL, USA, 1998,
pp. 43–44.
[10] A. Iyengar, E. MacNair, T. Nguyen, An analysis of web
server performance, Proceedings of the IEEE Global Tele-
communications Conference, Phoenix, AZ, USA, 1997,
pp. 1943–1947.
[11] T. Kimura, A transform-free approximation for the finite
capacity M/G/s queue, Operations Research 44 (1996)
984–988.
[12] B. Krishnamurthy, C.E. Wills, Analyzing factors that influ-
ence end-to-end Web performance, Computer Networks 33
(2000) 17–32.
[13] C. Lu, T.F. Abdelzaher, J.A. Stankovic, S.H. Son, A feedback
control approach for guaranteeing relative delays in web servers,
Proceedings of the 7th Real-Time Technology and Applications
Symposium, Taipei, Taiwan, 2001, pp. 51–62.
[14] N.J. Lightner, I. Bose, G. Salvendy, What is wrong with
the World-Wide-Web?: a diagnosis of some problems and
prescription of some remedies, Ergonomics 39 (1996)
995–1004.
[15] P.R. Selvidge, B.S. Chaparro, G.T. Bender, The world wide
wait: effects of delays on user performance, International Jour-
nal of Industrial Ergonomics 29 (2002) 15–20.
[16] L.P. Slothouber, A model of web server performance,
Proceedings of the 5th International World Wide Web Conference,
Paris, France.
[17] W. Whitt, Improving service by informing customers about
anticipated delays, Management Science 45 (1999) 192–207.
Indranil Bose is an Assistant Professor of Decision and Information
Sciences at the Warrington College of Business Administration,
University of Florida. His degrees include BTech (Electrical Engi-
neering) from Indian Institute of Technology, MS (Electrical and
Computer Engineering) from University of Iowa, MS (Industrial
Engineering) and PhD (Management Information Systems) from
Purdue University. He has research interests in telecommunications
design and policy issues, data mining and artificial intelligence,
electronic commerce, applied operations research and supply chain
management. His teaching interests are in telecommunications,
database management, systems analysis and design, and data
mining. His publications have appeared in Computers and Oper-
ations Research, Decision Support Systems, Ergonomics, European
Journal of Operational Research, Information and Management and
in the proceedings of numerous international and national confer-
ences.
Kemal Altinkemer received his PhD in Computers and Information
Systems from William E. Simon School of Business Administra-
tion, University of Rochester, Rochester, NY 14627 in 6/87. He is
currently an Associate Professor and the area coordinator for MIS at
the Krannert School of Management, Purdue University. His
research interests are infrastructure for e-commerce and pricing of
information goods, bidding with intelligent software agents, strategy
from brick-and-mortar to click-and-mortar business models, design and
analysis of local area networks, local access computer networks and
backbone networks, infrastructure development such as ATM, LEOS
systems such as TELEDESIC, distribution of priorities by using
pricing as a tool, and time restricted priority routing. He has
published numerous articles in journals such as Management
Science, Operations Research, INFORMS Journal on Computing,
EJOR and Transactions of the ACM.